Crawler and IR Bibliography

This list is meant as required reading for anyone working on the crawling part of LARM. Add entries that you think meet this criterion, and include a link to CiteSeer where applicable.

Add summaries of the papers at will!

Crawling Strategies and Techniques

== Focused Crawling ==

(I haven't reviewed these yet)

Descriptions of Existing Crawlers

Crawler Implementations

Offline (Books)

Structure and Dynamics of the Web

Page Structure

Web Structure

Accessing the Web Graph


Mining the Hypertext


More about PageRank


Index Maintenance


Exploiting Web Structure

Other Bibliographies

Further Reading...

