Viewport Time: A Robust, Scalable Method for Measuring User Attention in Online News Reading

What is the best way to measure user attention in online news reading? In a new post, Director of Research Mounia Lalmas argues for viewport time.

Portrait of an Online Shopper: Understanding and Predicting Consumer Behavior

Who spends the most online? When do people shop most online? How do people spend their money online? Our research scientists explored these shopping behavior questions in their paper, "Portrait of an Online Shopper: Understanding and Predicting Consumer Behavior," published this week at WSDM in San Francisco.

Yahoo’s New Research Model

Recently we announced our efforts to make Yahoo a more focused company. This focus will let us accelerate the pace of innovation to make our products even better. We saw these changes as an opportunity to better align our research efforts, while preserving Yahoo’s culture of exploration and inquiry. As a result, we are reorganizing Yahoo Labs and moving forward with a new approach to research at Yahoo.

Stimulating Research Innovation with 100 Million Images

The Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset and the Multimedia Commons initiative are featured in the February edition of the Communications of the ACM magazine. The dataset and initiative aim to open and share community-contributed dataset features and ground truth annotations.

Yahoo Releases the Largest-ever Machine Learning Dataset for Researchers

We are very proud to announce the public release of the largest-ever machine learning dataset to the research community. The Yahoo News Feed dataset contains about 110B events (13.5TB uncompressed) of anonymized user-news item interaction data, recorded from about 20M users between February 2015 and May 2015.

The 32 Days of Christmas: Using Community Patterns to Enhance Temporal Search Queries

Just in time for the holidays, we've researched the best way to improve the search ranking and relevance of temporal photos on the photo-sharing site Flickr.

Explore Anthelion, Our Open Source Focused Crawler

We recently released Anthelion publicly on GitHub. Anthelion is a focused crawler for semantic annotations in Web pages: it steers toward HTML pages that are annotated with markup languages like RDFa, Microformats, and Microdata.

Yahoo Donates Servers to UCSD, UMass, USC for Research

As part of our continued effort to support cutting-edge scientific research in academia, Yahoo recently gifted 400 servers to three leading university computer science and engineering departments.

Science Powering Product: Yahoo Answers

Our Answering Complex Queries group in Haifa developed an automatic quality scoring system for the Yahoo Answers community question answering (CQA) site, and answers are now ranked by relevance. The group explains how its science powers the product.

Birds, Apps, and Users: Scalable Factorization Machines

The Yahoo Labs Personalization Science team has developed a new recommendation system that lets researchers model complex interaction features, including side information on users, items, and context, without sacrificing scalability.
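
For readers unfamiliar with factorization machines, here is a minimal sketch of how the textbook second-order model scores one example. It is not the Labs team's system, and the teaser does not describe their implementation. The function name fm_predict and the parameter names are hypothetical, chosen for illustration. Side information enters simply as extra entries in the feature vector, and the pairwise term is computed in time linear in the number of features.

```python
def fm_predict(x, w0, w, V):
    """Score one feature vector with a generic second-order factorization machine.

    Assumed, illustrative inputs (not taken from the Yahoo Labs system):
    x  : feature values, e.g. user, item, and context features concatenated
    w0 : global bias
    w  : per-feature linear weights, same length as x
    V  : per-feature latent vectors; V[i] has k entries
    """
    n = len(x)
    k = len(V[0])

    # First-order (linear) term.
    linear = sum(w[i] * x[i] for i in range(n))

    # Pairwise term sum_{i<j} <V[i], V[j]> * x[i] * x[j], computed with the
    # standard identity 0.5 * ((sum_i a_i)^2 - sum_i a_i^2) per latent dimension.
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)

    return w0 + linear + pairwise


# Toy example: four features, two latent dimensions.
x = [1.0, 0.0, 1.0, 0.5]
w0 = 0.1
w = [0.2, -0.1, 0.3, 0.0]
V = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]
score = fm_predict(x, w0, w, V)
```

Because every pair of features interacts through low-dimensional latent vectors rather than a separate weight per pair, adding side-information features grows the model linearly rather than quadratically.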