Yahoo Expands Its M45 Cloud Computing Initiative

Nov 18, 2010

Yahoo Expands Its M45 Cloud Computing Initiative, Adding Top Universities to Supercomputing Research Cluster

Today, as scientists at top US universities extend their research to the new frontiers of computing, Yahoo is proud to announce the expansion of its M45 academic research initiative to include four additional marquee universities: Stanford, the University of Washington, the University of Michigan at Ann Arbor, and Purdue. These schools join Carnegie Mellon University, Cornell University, the University of California at Berkeley and the University of Massachusetts Amherst on the supercomputing cluster, which brings a unique Internet-scale computing environment to academic researchers.

Launched in November 2007, the M45 program gives universities the opportunity to conduct research that would be impossible without the power and speed of a supercomputing resource. The M45 cluster consists of approximately 4,000 processors. Hadoop, the open source technology at the epicenter of big data and cloud computing, is the core data analysis technology used across Yahoo, and it is used by all the universities participating in the M45 research initiative. Yahoo benefits from university contributions to the Hadoop code base as well as from insights gained through cutting-edge research conducted on the M45 supercomputer.

Academic research conducted on M45 includes two of the world’s largest knowledge acquisition research projects: the Never Ending Language Learning System (NELL) at Carnegie Mellon University and KnowItAll at the University of Washington.

One of the hottest research trends at universities today is the merging of mobile computing and cloud computing. By expanding the M45 platform to support both mobile and cloud computing research, Yahoo is enabling top-tier universities to tackle some of the industry’s most critical computing challenges. Additional university research projects facilitated on Yahoo’s M45 supercomputer include:
  • Carnegie Mellon – performing research in large-scale graph mining (graphs with billions of nodes), text search and analysis, statistical natural language processing, analysis of media Internet traffic, statistical machine translation, learning to read the Web, and file systems.
  • Cornell – exploring the use of advanced methods from computer science to help solve environmental and broader sustainability challenges. As part of its Citizen Science research program, Cornell researchers are exploring methods to enable a smart phone application to enter bird observation data into the cloud.
  • Purdue – combining the power of mobile and cloud computing to develop a context-aware navigation system for blind and visually impaired people. Researchers also plan to investigate topics in cloud data privacy, information retrieval and the automatic off-loading of computation from mobile devices to the cloud.
  • Stanford – merging large-scale cloud computing techniques and advanced statistical machine learning methods to analyze the vast amount of text, image and network data now available, on the web and elsewhere. The goal is to achieve a new level of understanding of the semantics latent in various media, attaining systems with greater artificial intelligence.
  • University of California at Berkeley – analyzing social networks; studying population genetics; testing new architectures for collecting traffic data from GPS-equipped mobile phones and estimating traffic conditions in real time; analyzing climate-change satellite data; prototyping new scientific applications; and improving cluster scheduling and reliability.
  • University of Massachusetts Amherst – investigating efficient inference of the relations among documents and passages in a large collection of books.
  • University of Michigan at Ann Arbor – profiling and understanding MapReduce job execution, optimizing energy consumption and performance for heterogeneous MapReduce workloads, performing large-scale language model analysis and investigating how to offer a geo-distributed service that can be easily accessed through mobile phones for content distribution, computation offloading and other services.
  • University of Washington – studying scalable scientific data management and large-scale knowledge acquisition, and building a large knowledge base that can be queried from both the Web and mobile phones.

Yahoo is also a founding member of the Open Cirrus Cloud Computing Testbed and the Open Cloud Consortium, both facilitating scientific research in the cloud. Combined, these programs bring the collective wisdom of many of the world’s top scientific researchers to Yahoo’s industry-leading research portfolio at Yahoo Labs.