Mining Broad Latent Query Aspects from Search Sessions
Source:
Fifteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France (2009)
URL:
http://www.ideal.ece.utexas.edu/~kunal/papers/kdd2009-broadaspects.pdf
Abstract:
Search queries are typically very short, which means they are often
underspecified or have senses that the user did not think of. A
broad latent query aspect is a set of keywords that succinctly represents
one particular sense, or one particular information need, that
can aid users in reformulating such queries. We extract such broad
latent aspects from query reformulations found in historical search
session logs. We propose a framework under which the problem of
extracting such broad latent aspects reduces to that of optimizing a
formal objective function under constraints on the total number of
aspects the system can store, and the number of aspects that can be
shown in response to any given query. We present algorithms to find
a good set of aspects, and also to pick the best k aspects matching
any query. Empirical results on real-world search engine logs show
significant gains over a strong baseline that uses single-keyword reformulations:
gains of 14% and 23% in human-judged accuracy and click-through
data, respectively, and around 20% in consistency among aspects
predicted for “similar” queries. This demonstrates both the importance
of broad query aspects and the efficacy of our algorithms for
extracting them.
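
The abstract describes picking the best k aspects matching a query from aspects mined out of logged reformulations. The following is only a minimal illustrative sketch of that idea, not the paper's algorithm or objective function: it assumes each aspect is a named set of keywords, and it scores an aspect by how often its keywords were added to the query in logged reformulations. The function names (`added_keywords`, `score_aspects`, `top_k_aspects`), the scoring rule, and the toy data are all hypothetical.

```python
from collections import Counter


def added_keywords(query, reformulation):
    """Keywords that appear in the reformulation but not in the original query."""
    return set(reformulation.split()) - set(query.split())


def score_aspects(query, sessions, aspects):
    """Score each aspect by how many of its keywords users added to `query`.

    `sessions` is a list of (original_query, reformulated_query) pairs.
    `aspects` maps an aspect name to a set of keywords.
    This overlap count is an assumed stand-in for a real objective.
    """
    scores = Counter()
    for original, reformulated in sessions:
        if original != query:
            continue
        added = added_keywords(original, reformulated)
        for name, keywords in aspects.items():
            scores[name] += len(added & keywords)
    return scores


def top_k_aspects(query, sessions, aspects, k):
    """Return up to k aspect names with a positive score, best first."""
    scores = score_aspects(query, sessions, aspects)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(aspects, key=lambda name: (-scores[name], name))
    return [name for name in ranked if scores[name] > 0][:k]


# Toy data, invented purely for illustration.
sessions = [
    ("jaguar", "jaguar car price"),
    ("jaguar", "jaguar dealer"),
    ("jaguar", "jaguar habitat"),
    ("jaguar", "jaguar car review"),
]
aspects = {
    "automobile": {"car", "price", "dealer", "review"},
    "animal": {"habitat", "diet", "cat"},
    "software": {"mac", "os"},
}

print(top_k_aspects("jaguar", sessions, aspects, k=2))
```

The per-query limit k corresponds to the constraint on how many aspects can be shown for a query. The constraint on the total number of stored aspects is not modeled here; the `aspects` dictionary is simply taken as given.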