Issues with Privacy Preservation in Query Log Mining
Source:
Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques, Chapman and Hall/CRC Press (2009)
Abstract:
In this chapter we present and analyze the current state of the art
in query log privacy preservation. We focus on two complementary
issues: the privacy of users who submit queries, and the privacy
of websites featured in search results. We study vulnerabilities that
arise in query log publishing, specifically in Web search engine logs,
and discuss the effects that these have on the parties involved. Our
analysis gives an overview of anonymization techniques that have been
attempted and their weaknesses in preventing attacks on query log
data. Furthermore, we study the implications for public
data produced by query log data mining applications, and how such data
poses a risk of involuntary private data disclosure.