Mining associations from web query logs

Publication
Jan 1, 2006
Abstract

A web searcher who successfully queries for "skis" might also benefit from the results for "ski gloves", since these items are associated with the same task. However, after a successful query for "skis", results for "snowboard" may not be useful: these two queries refer to substitutable items, and a user who has the first often has no need for the second. Algorithms for identifying related phrases and queries have typically been agnostic with respect to substitutability and associativity, and identify a mix of both. In this paper we focus on mining associated-intent queries by distinguishing them from substitutable-intent queries. We describe an algorithm that derives user intent associations from search query session logs, based on the assumption that there are three types of relationship between queries in sessions: similar queries, associated queries, and unrelated queries. Our approach is to first remove the similar relationships, which helps the associative relationships surface from the noise. To evaluate our method, we labeled query pairs produced by this algorithm, as well as pairs produced by an algorithm that focuses on similar relationships. We found that our method was successful at increasing the proportion of associated query suggestions.
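
The general idea of removing similar pairs before looking for associations can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how similar queries are detected or how unrelated ones are filtered. The sketch assumes, purely for illustration, that word overlap (Jaccard) stands in for "similar" and that a minimum session count stands in for noise filtering. The names `token_jaccard`, `candidate_associations`, `sim_threshold`, and `min_sessions` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def token_jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two queries (an assumed, crude proxy)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def candidate_associations(sessions, sim_threshold=0.5, min_sessions=2):
    """Count query pairs that co-occur in sessions, skipping similar pairs.

    sessions: iterable of lists of query strings, one list per session.
    Returns a Counter mapping (query_a, query_b) -> number of sessions.
    """
    counts = Counter()
    for session in sessions:
        # Normalize and deduplicate so each pair counts once per session.
        unique = sorted({q.strip().lower() for q in session if q.strip()})
        for a, b in combinations(unique, 2):
            if token_jaccard(a, b) >= sim_threshold:
                continue  # assumed "similar" pair: removed first
            counts[(a, b)] += 1
    # Assumed noise filter: drop pairs seen in too few sessions.
    return Counter({p: n for p, n in counts.items() if n >= min_sessions})


# Toy sessions, invented for this example only.
sessions = [
    ["skis", "ski gloves"],
    ["skis", "ski gloves", "cheap skis"],
    ["skis", "cheap skis"],
    ["cheap skis", "skis"],
    ["skis", "weather"],
]
result = candidate_associations(sessions)
```

On this toy input, the reformulation pair ("cheap skis", "skis") is dropped by the similarity filter even though it occurs in three sessions, while ("ski gloves", "skis") survives. Word overlap cannot tell "skis" and "snowboard" apart as substitutes, so a real system would need a stronger similarity signal than the one assumed here.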

  • ECML PKDD Workshop on Web Mining, Berlin, Germany
