
Web Crawling



Author(s): Christopher Olston;Marc Najork

Source:
    Journal: Foundations and Trends® in Information Retrieval
    ISSN Print: 1554-0669, ISSN Online: 1554-0677
    Publisher: Now Publishers
    Volume 4, Number 3

Document Type: Article
Pages: 72 (175-246)
DOI: 10.1561/1500000017

Abstract:

This is a survey of the science and practice of web crawling. While at first glance web crawling may appear to be merely an application of breadth-first search, it in fact poses many challenges, ranging from systems concerns, such as managing very large data structures, to theoretical questions, such as how often to revisit evolving content sources. This survey outlines the fundamental challenges and describes the state-of-the-art models and solutions. It also highlights avenues for future work.