Yahoo’s New Research Model

Recently we announced our efforts to make Yahoo a more focused company. This focus will let us accelerate the pace of innovation to make our products even better. We saw these changes as an opportunity to better align our research efforts, while preserving Yahoo’s culture of exploration and inquiry. As a result, we are reorganizing Yahoo Labs and moving forward with a new approach to research at Yahoo.

Stimulating Research Innovation with 100 Million Images

The Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset and the Multimedia Commons initiative are featured in the February edition of the Communications of the ACM magazine. The dataset and initiative aim to open and share community-contributed dataset features and ground truth annotations.

Yahoo Releases the Largest-ever Machine Learning Dataset for Researchers

We are very proud to announce the public release of the largest-ever machine learning dataset to the research community. The Yahoo News Feed dataset stands at a massive ~110B events (13.5TB uncompressed) of anonymized user-news item interaction data, collected from about 20M users between February 2015 and May 2015.

The 32 Days of Christmas: Using Community Patterns to Enhance Temporal Search Queries

Just in time for the holidays, we've researched the best way to improve the search ranking and relevance of temporal photos on the photo-sharing site Flickr.

Explore Anthelion, Our Open Source Focused Crawler

We recently released Anthelion publicly on GitHub. Anthelion is a focused crawler for semantic annotations in Web pages: it steers toward HTML pages that are annotated with markup languages like RDFa, Microformats, and Microdata.

Yahoo Donates Servers to UCSD, UMass, USC for Research

As part of our continued effort to support cutting-edge scientific research in academia, Yahoo recently gifted 400 servers to three leading university computer science and engineering departments.

Science Powering Product: Yahoo Answers

Our Answering Complex Queries group in Haifa developed an automatic quality scoring system for the Yahoo Answers CQA site, and answers are now ranked by relevance. The group explains how their science powers the product.

Birds, Apps, and Users: Scalable Factorization Machines

The Yahoo Labs Personalization Science team has developed a new recommendation system that lets researchers model complex interaction features, including side information on users, items, and context, without sacrificing scalability.
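
For readers unfamiliar with the model family named in the title, here is a minimal sketch of how a standard second-order factorization machine scores one feature vector. It is an illustration only, not the team's system: the function names, the parameter layout (`w0`, `w`, `V`), and the toy feature encoding are all assumptions made for this example. The vector `x` would typically concatenate user, item, and context features, and the pairwise term is computed in time linear in the number of features rather than quadratic.

```python
import numpy as np


def fm_predict(x, w0, w, V):
    """Score one feature vector with a second-order factorization machine.

    x  : (n,) feature vector (e.g. one-hot user + one-hot item + context)
    w0 : scalar global bias
    w  : (n,) linear weights
    V  : (n, k) latent factors, one k-dimensional row per feature
    """
    linear = w0 + x @ w
    # Pairwise term via the identity
    #   sum_{i<j} <V_i, V_j> x_i x_j
    #     = 0.5 * sum_f [ (sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2 ]
    # which costs O(n * k) instead of O(n^2 * k).
    xv = x @ V                      # shape (k,)
    x2v2 = (x ** 2) @ (V ** 2)      # shape (k,)
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return float(linear + pairwise)


def fm_predict_naive(x, w0, w, V):
    """Reference version that loops over every feature pair explicitly."""
    n = len(x)
    total = w0 + x @ w
    for i in range(n):
        for j in range(i + 1, n):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return float(total)


# Toy example (hypothetical encoding): features 0 and 1 are active,
# standing in for "this user" and "this item"; feature 2 is inactive.
x = np.array([1.0, 1.0, 0.0])
w0 = 0.0
w = np.zeros(3)
V = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
score = fm_predict(x, w0, w, V)
```

Because each feature gets its own latent vector, side information can be added by appending columns to `x` without changing the scoring code.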

Omid Architecture and Protocol

Our Search Systems team in Haifa, along with Yahoo Search in Sunnyvale, brings you the second installment of their blog series on Omid, an open source transaction processing system for Apache HBase.

Yahoo Donates Servers to Stony Brook University to Advance Computing Research

I am spending this year on sabbatical at Yahoo Labs after 27 years at Stony Brook. It is interesting to experience life at a major technology company, and I am sure that much of what I learn here will inform my teaching and research when I return to Stony Brook in Fall 2016.