Penguins in Sweaters, or Serendipitous Entity Search on User-generated Content

Oct 29, 2013

In many cases, users browsing the Web are searching for specific information or answers to concrete questions. Sometimes, though, users find unexpected, yet interesting and useful results, and are encouraged to explore further. What makes a result serendipitous? We propose to answer this question by exploring the potential of entities extracted from two sources of user-generated content - Wikipedia, a user-curated online encyclopedia, and Yahoo Answers, a more unconstrained question-answering forum - in promoting serendipitous search. In this work, the content of each data source is represented as an entity network, which is further enriched with metadata about sentiment, writing quality, and topical category. We devise an algorithm based on a lazy random walk with restart to retrieve entity recommendations from the networks. We show that our method provides novel results from both datasets, compared to standard web search engines. However, unlike previous research, we find that choosing highly emotional entities does not increase user interest for many categories of entities, suggesting a more complex relationship between topic matter and the desirable metadata attributes in serendipitous search.
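
The abstract names a lazy random walk with restart but does not spell out its formulation, so the following is only a minimal sketch of the generic technique, not the paper's algorithm. It assumes a weighted entity network stored as a dictionary of dictionaries; the function names `lazy_rwr` and `recommend` and the parameters `restart` and `laziness` are hypothetical, as are their default values. At every step the walker jumps back to the query entity with probability `restart`; otherwise it stays where it is with probability `laziness` or moves to a neighbor chosen in proportion to edge weight.

```python
def lazy_rwr(graph, seed, restart=0.15, laziness=0.5, iters=200, tol=1e-12):
    """Approximate the stationary scores of a lazy random walk with restart.

    graph: dict mapping node -> dict of neighbor -> positive edge weight.
    seed:  the query entity the walker restarts from.
    All parameter names and defaults are illustrative assumptions.
    """
    nodes = set(graph)
    for nbrs in graph.values():
        nodes.update(nbrs)
    if seed not in nodes:
        raise KeyError(f"seed {seed!r} is not in the graph")

    # Start with all probability mass on the query entity.
    scores = {n: 0.0 for n in nodes}
    scores[seed] = 1.0

    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for node, mass in scores.items():
            if mass == 0.0:
                continue
            # Restart: a fixed share of the mass returns to the seed.
            nxt[seed] += restart * mass
            walk = (1.0 - restart) * mass
            nbrs = graph.get(node, {})
            total = sum(nbrs.values())
            if total == 0:
                # No outgoing edges: the walker can only stay put.
                nxt[node] += walk
                continue
            # Lazy step: part of the mass stays on the current node.
            nxt[node] += laziness * walk
            # The rest spreads to neighbors in proportion to edge weight.
            for nbr, weight in nbrs.items():
                nxt[nbr] += (1.0 - laziness) * walk * weight / total
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


def recommend(graph, seed, k=3, **kwargs):
    """Return the k highest-scoring entities other than the seed."""
    scores = lazy_rwr(graph, seed, **kwargs)
    ranked = sorted((n for n in scores if n != seed),
                    key=lambda n: (-scores[n], n))
    return ranked[:k]
```

Calling `recommend(network, "penguin", k=5)` on such a dictionary would return the five entities with the highest walk scores relative to the query entity. Metadata such as sentiment, writing quality, or topical category could in principle be folded into the edge weights or applied as a filter on the ranked list, but the abstract does not say which approach the authors take.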

  • ACM International Conference on Information and Knowledge Management (CIKM 2013)
  • Conference/Workshop Paper