Clique Analysis of Query Log Graphs
Source:
SPIRE, Springer, Melbourne (2008)
Abstract:
In this paper we propose a method for the analysis of very large graphs obtained
from query logs,
using query coverage inspection. The goal is to extract semantic relations
between queries and their terms. We take a new approach to successfully and efficiently cluster
these large graphs by analyzing clique overlap and
a priori induced cliques. The clustering quality is evaluated with an extension of
the modularity score. Results obtained with real data show that the identified
clusters can be used to infer properties of the queries and interesting semantic
relations between them and their terms.
The quality of the semantic relations is evaluated both using a tf-idf based score and data from
the Open Directory Project. The proposed approach is also able to identify and filter out
multitopical URLs, a feature that is interesting in itself.
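
As a rough illustration of the ideas named in the abstract, the following Python sketch builds a small query graph from click data and scores a clustering. It rests on assumptions that the abstract does not confirm: that two queries are linked when they share a clicked URL (so each URL induces a clique among its queries), and that quality is measured with the standard modularity score rather than the paper's extension of it. The function names and the toy click data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_query_graph(clicks):
    """Build an undirected query graph from (query, url) click pairs.

    Assumption: every URL induces a clique among the queries that led to it.
    Returns the adjacency map and the query set of each URL.
    """
    queries_by_url = defaultdict(set)
    for query, url in clicks:
        queries_by_url[url].add(query)

    adjacency = defaultdict(set)
    for queries in queries_by_url.values():
        for a, b in combinations(sorted(queries), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency), dict(queries_by_url)


def modularity(adjacency, communities):
    """Standard modularity of a partition (not the paper's extension)."""
    edge_count = sum(len(neigh) for neigh in adjacency.values()) / 2
    if edge_count == 0:
        return 0.0
    score = 0.0
    for community in communities:
        members = set(community)
        # Each internal edge is seen from both endpoints, hence the halving.
        internal = sum(
            1 for u in members for v in adjacency.get(u, ()) if v in members
        ) / 2
        degree = sum(len(adjacency.get(u, ())) for u in members)
        score += internal / edge_count - (degree / (2 * edge_count)) ** 2
    return score


# Hypothetical toy log: two URLs, each clicked from three distinct queries.
clicks = [
    ("jaguar car", "a.example"),
    ("jaguar price", "a.example"),
    ("jaguar dealer", "a.example"),
    ("jaguar cat", "b.example"),
    ("big cats", "b.example"),
    ("jaguar habitat", "b.example"),
]
adjacency, url_cliques = build_query_graph(clicks)
partition = [url_cliques["a.example"], url_cliques["b.example"]]
print(modularity(adjacency, partition))
```

In this toy case the two URL-induced cliques are disjoint, so using them directly as clusters gives a clearly positive modularity, while placing all queries in a single cluster gives zero.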