Caching search engine results over incremental indices

Publication
Jul 19, 2010
Abstract

A Web search engine must update its index periodically to incorporate changes to the Web. We argue in this paper that index updates fundamentally impact the design of search engine result caches, a performance-critical component of modern search engines. Index updates lead to the problem of cache invalidation: invalidating cached entries of queries whose results have changed. Naive approaches, such as flushing the entire cache upon every index update, lead to poor performance and, in fact, render caching futile when the frequency of updates is high. Solving the invalidation problem efficiently corresponds to accurately predicting which queries will produce different results if re-evaluated, given the actual changes to the index.
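To make the contrast concrete, the following minimal sketch (not the paper's system; all names are illustrative) shows a result cache supporting both the naive flush-all policy and the selective invalidation the paper argues for:

```python
# Illustrative sketch only: a result cache with two invalidation policies.
class ResultCache:
    def __init__(self):
        self._entries = {}  # query string -> cached result list

    def get(self, query):
        return self._entries.get(query)

    def put(self, query, results):
        self._entries[query] = results

    def flush_all(self):
        """Naive policy: every index update invalidates the whole cache,
        which wastes work when most cached results are unaffected."""
        self._entries.clear()

    def invalidate(self, queries):
        """Selective policy: drop only the queries that a predictor flags
        as stale, given the actual changes to the index."""
        for q in queries:
            self._entries.pop(q, None)
```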

To this end, we propose a framework for developing invalidation predictors and define metrics to evaluate invalidation schemes. We describe concrete predictors using this framework and compare them against a baseline that uses a cache invalidation scheme based on time-to-live (TTL). Evaluation over Wikipedia documents using a query log from the Yahoo! search engine shows that selective invalidation of cached search results can lower the number of unnecessary query evaluations by as much as 30% compared to a baseline scheme, while returning results of similar freshness. In general, our predictors enable fewer unnecessary invalidations and fewer stale results compared to a TTL-only scheme for similar freshness of results.
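A hedged sketch of the two approaches being compared: the TTL baseline, and one deliberately simple, illustrative predictor (term overlap with updated documents). The paper's actual predictors are more sophisticated; this only shows the shape of the comparison.

```python
import time


class TTLCache:
    """Baseline: an entry is treated as stale once its age exceeds the TTL,
    whether or not the index update actually changed its results."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = {}  # query -> (results, timestamp)

    def get(self, query):
        hit = self._entries.get(query)
        if hit is None:
            return None
        results, stored_at = hit
        if time.time() - stored_at > self.ttl:
            del self._entries[query]  # expired: caller must re-evaluate
            return None
        return results

    def put(self, query, results):
        self._entries[query] = (results, time.time())


def overlap_predictor(cached_queries, updated_docs):
    """Illustrative predictor (not from the paper): flag a cached query as
    potentially stale if any of its terms occurs in a document added or
    modified by the index update."""
    updated_terms = set()
    for doc in updated_docs:
        updated_terms.update(doc.lower().split())
    return [q for q in cached_queries
            if any(term in updated_terms for term in q.lower().split())]
```

The trade-off measured in the paper follows from this structure: a short TTL keeps results fresh but forces many unnecessary re-evaluations, while an accurate predictor invalidates only the entries whose results actually changed.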

  • SIGIR 2010
  • Conference/Workshop Paper
