ABSTRACT
The main purpose of this project is to design subject-oriented web crawler process which is also required to meet certain performance, taking into account the diverse needs of web crawlers.
Web Crawler uses the technology. of Breadth-first search.Web crawler uses multi-threaded technology, so that spiders crawl can have more powerful capabilities.Set connection time and read time of the web connection of the Web crawler , to avoid unlimited waiting.In order to meet different needs, so that crawlers can achieve pre-set theme crawling a specific topic.Research the principle web crawler and and realize the related functions.
Key words:Web crawler; subject-oriented; multi-threading
