
Yahoo! Reaches for the Stars with M45 Supercomputing Project


Joining Forces

Faculty and students at the nation’s best universities are hungry for an Internet-scale computing environment, but it’s almost impossible to find this kind of computing power on a university campus.

Not any more. Yahoo! is bringing large-scale supercomputing to the academic research community through its newly launched M45 project. Named after a well-known open star cluster, M45 is a 4,000-processor supercomputer that’s one of the fifty most powerful systems in the world. The goal of the project: help academic researchers tackle some of the most complicated computing tasks known to humanity.

What makes the M45 initiative unique? Unlike other companies and traditional supercomputing centers, which provide academics with computers for running software applications related to coursework, Yahoo!’s program pushes the boundaries of large-scale systems software research.

“This is a first-of-its-kind effort in the industry,” says Ron Brachman, VP of Worldwide Research Operations at Yahoo! Research and Head of Yahoo! Academic Relations. “By making the system open for experimentation and research at all levels, we are helping the worldwide research community get to the next level in its understanding of large-scale computing systems.”

The media has showered attention on the M45 project. It has been featured in Scientific American, Business Week, and ZDNet, among other outlets.

Carnegie Mellon University will be the first academic institution to benefit from the computing cluster, which boasts 3 terabytes of memory, 1.5 petabytes of storage, and a peak performance of more than 27 trillion calculations per second.

To bring this massive raw computing power to users' fingertips, the system runs a suite of open-source distributed computing software. The software features a fault-tolerant, distributed storage and computing platform called Hadoop, coupled with a user-friendly parallel programming environment called Pig. These technologies enable users to process massive amounts of information with relative ease. Working with the Apache Software Foundation, Yahoo! has played a central role in the development of Hadoop and Pig.
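For readers unfamiliar with the style of processing that Hadoop supports, the following is a minimal sketch of the MapReduce programming model that Hadoop implements, written as ordinary single-process Python. It is an illustration only and does not use the Hadoop or Pig APIs; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example. On a real cluster, the framework would run the map and reduce steps in parallel across many machines and handle the grouping step and any machine failures itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, the step a framework performs between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values collected for one key into a single total."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory collection."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

Because each call to `map_phase` looks at only one record and each call to `reduce_phase` looks at only one key, the work can be split across many processors without changing the program's logic, which is the property that makes this model suitable for a cluster of this size.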

Yahoo! is excited to join forces with the academic community and share its technical leadership in distributed computing research and development. “Given our interest in open collaboration, we can all engage in research on a common software base,” Brachman says.

With the growing popularity of Hadoop, Yahoo! and Carnegie Mellon also plan to co-host a Hadoop Summit in the first half of 2008, inviting major users, such as Facebook and the University of California, Berkeley, to participate in this open, collaborative community.

Carnegie Mellon will be the first of many participants in the M45 project. “This is a major first step on the road to facilitating a worldwide open-source software research program in real-world, supercomputing environments,” Brachman says. “We are eager to reach for the stars with Carnegie Mellon, and the entire academic computing research community.”

How to Gain Access to the Cluster

Before making the cluster available to other universities for systems software and applications research, we want to make sure that it works well and can securely support the different organizations that will use it. In addition, as part of the allocation process, we will likely ask universities to submit a proposal describing the systems software and applications research they would like to perform on the cluster and justifying the resources required. We anticipate starting the request-for-proposal (RFP) process in Q2 or Q3 of 2008. To be notified when the RFP process begins, please send an email with your contact information to academicrelations@yahoo-inc.com.