How To Build A Web Crawler?

I was reading an article the other day and came across the term “web crawler”. The context in which it was used got me curious about how a web crawler is designed. A web crawler is a simple program that scans or “crawls” through web pages to build an index of the data it’s looking for. The program has several uses, perhaps the most popular being search engines, which use it to point web surfers to relevant websites. Google has perfected the art of crawling over the years! A web crawler can be used by pretty much anyone who wants to search for information on the Internet in an organized manner. It goes by different names, such as web spider, bot, and indexer. Anyway, that article got me thinking about building a web crawler. I just wanted to fiddle with it and see how long it would take to get something working on my machine. It turned out to be quite easy!
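To make the idea of "crawling pages to build an index" concrete, here is a minimal sketch in Python using only the standard library. It is not necessarily the approach the full post takes: the names `LinkParser`, `fetch`, and `crawl`, the `max_pages` limit, and the simple keyword match against the raw HTML are all assumptions made for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Hypothetical helper: download a page and return its HTML as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, keyword, fetch_page=fetch, max_pages=20):
    """Breadth-first crawl from start_url.

    Returns the URLs of pages whose raw HTML contains keyword
    (case-insensitive), in the order they were visited.
    """
    queue = deque([start_url])
    seen = {start_url}  # URLs already queued, so no page is fetched twice
    matches = []
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to load
        visited += 1
        if keyword.lower() in html.lower():
            matches.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part
            absolute, _ = urldefrag(urljoin(url, href))
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return matches
```

Calling `crawl` with a start URL and a keyword returns the list of visited pages that mention the keyword. The `fetch_page` parameter is there so the network call can be swapped out, for example with a function that reads from a dictionary of saved pages. A crawler meant for real sites would also need to respect `robots.txt` and pause between requests, which this sketch leaves out.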