Publication

WIRE: an open-source Web information retrieval environment

Source:

Workshop on Open Source Web Information Retrieval (OSWIR), Compiègne, France, pp. 27-30 (2005)

URL:

http://www.dcc.uchile.cl/%7Eccastill/papers/castillo_05_web_information_retrieval_environment.pdf

Keywords:

crawling

Abstract:

In this paper, we describe the WIRE (Web Information Retrieval Environment) project and focus on some details of its crawler component. The WIRE crawler is a scalable, highly configurable, high-performance, open-source Web crawler that we have used to study the characteristics of large Web collections.