There is no disputing it: When it comes to driving user engagement and retention for media content online, nothing beats a personalized approach. At Yahoo Labs, we’re dedicated to presenting the most relevant and engaging content to each of Yahoo’s users with a robust scientific approach, immediately and at scale. And though we’ve been doing this for some time across Yahoo’s homepage and various properties, this year, we really set out to up our game. Our result–in conjunction with the Recommends Engineering and Product team–is called Yahoo Recommends.
Universal relevance: Certain types of content such as breaking news stories (e.g., a virus outbreak) are things that most users want to know about regardless of their topical interests. We try to measure this by the trendiness and popularity of a news item. We track dozens of time series on how users are engaging with each piece of content in the module, on a publisher site, on social media, and on search engines.
Contextual relevance: Users see our recommendations in different contexts. For example, if a user is currently reading an iPhone 6 review, it would certainly be meaningful to recommend other articles about smartphones or Apple products. In order to identify contextually-relevant content, we leverage our internal knowledge graph to measure content relatedness as well as a collaborative filtering style approach to uncover novel patterns about “people who read X who would also read Y.”
Personal relevance: As one of the top destinations on the Web, there is significant overlap in the user base between Yahoo and our partner sites. This puts us in a unique position of not suffering much from the cold start problem for user understanding. However, we also need to address a new challenge of catering to changing user interests as people move about from one site to another. For example, a person is interested in smartphones; for gadget reviews her top choice is cnet.com, but when it comes to news about the mobile industry, techcrunch.com becomes her go-to destination. In order to show the right content at the right place, we need to understand both her interests, and when and where she should be served a particular item. To do that, we leverage a type of machine-learned model known as matrix/tensor factorization to incorporate context awareness into user understanding.
In the end, nearly hundreds of millions of raw signals are produced, which are blended by a machine-learned ranking (MLR) function. The blending has to have an element of adaptive learning since the same signal may have different effects in different publisher modules. For example, the recency of content matters more for a sports news site than for a food recipe site. Therefore, we developed a powerful MLR platform using a distributed online learning algorithm which incrementally and rapidly self-updates hundreds of ranking models by learning from the logged events for each Recommends module.
Yahoo Recommends tests the limits of context-aware recommendations. The wealth of signals and data, the dynamic nature of the content that needs to be recommended, and the plethora of publisher modules takes personalization to a whole new level.