Scientists Unfolding Protein Mystery, Fighting Disease with Windows Azure

REDMOND, Wash. – June 14, 2011 – Cloud computing is helping biologists uncloud one of nature’s biggest mysteries: proteins.

Microsoft has partnered with the University of Washington’s Baker Laboratory, one of the world’s top computational biology labs, to give scientists access to high-caliber computing power. That, in turn, helps them explore and understand proteins, which could eventually lead to thwarting everything from Alzheimer’s to malaria, and from cancer to salmonella.



Scientist Nikolas Sgourakis (left) is using Windows Azure to boost his protein folding research after father and son Marlin and Chris Eiben (center and right) helped establish a partnership between Microsoft and the University of Washington’s Baker Laboratory, one of the world’s top computational biology labs.

Proteins turn from random coils into functional three-dimensional structures by folding themselves. Once folded, proteins become the building blocks and workhorses of the body. Some of the most harrowing modern diseases are linked to failures in this process, when proteins do not fold correctly.

Having powerful, affordable computing available on-demand speeds up the rate at which scientists are able to get results from experiments, thereby increasing the pace of medical advancement.

For Nikolas Sgourakis, that is both his motivation and his goal.

“If you can understand protein structures, at some point you will be able to interfere and make sense of mutations that cause diseases,” said Sgourakis, a visiting scholar at the Baker Lab who is using computational modeling to try to help solve the mystery of what proteins look like up close. “This body of work will motivate additional studies and pave the way for very exciting work to be done in the field.”

The Microsoft-Baker Lab relationship, a partnership that got its start as a family affair, has already yielded powerful results in the area of protein folding.

Earlier this year Microsoft IT’s Marlin Eiben was looking for a project to help demonstrate the “sheer computing power” of Windows Azure. After searching high and low for possible projects, he got the idea for partnering with the Baker Lab from talking to his son, a research technologist there.

“We wanted a demonstration project that not only showed how Azure worked, but something that would make a significant difference,” Eiben said. “Through my son’s connection I was aware that this is cutting-edge science, and it seemed a natural application for massive computing power.”

After Eiben got the green light at Microsoft, father and son approached David Baker, the lab’s principal and namesake, who put them in touch with Sgourakis. Eiben also partnered with Dennis Gannon in the Microsoft Research Extreme Computing Group, who supplied the needed Windows Azure resources via Gannon’s cloud computing research engagement project.

Sgourakis earned his degree from the University of Athens in Greece and is doing post-doctoral work at the University of Washington. He is, along with many others at the Baker Lab, trying to use complex algorithms to determine the structure of proteins.



Sgourakis’s research seeks to unravel some of the mysteries behind the structure of proteins, and how bacteria such as salmonella transfer material into healthy cells using a “needle.” The spirals shown here are a preliminary model of a salmonella “needle”; scientists are using algorithms and cloud computing to attempt to get a high-resolution model of its structure.

The structure and behavior of proteins are a bit of a mystery; they don’t “present well,” Sgourakis said. Even examining proteins with the most powerful microscopes, nuclear magnetic resonance and X-rays gives only a general idea of what they look like, and to understand them fully, scientists need to get closer.

Sgourakis is currently researching salmonella. When salmonella and other bacteria infect someone, they do so through a structure called a “needle.” The bacterium uses the needle to open a hole in the healthy cell and to insert material. Sgourakis wants to know more about the structure of this needle. Experimental data exists but is sparse, so he and others have developed algorithms that let them do millions of calculations to troubleshoot and fill in the gaps.

Enter Windows Azure.

To deploy and test the lab’s software on Windows Azure, Eiben enlisted the help of his Microsoft IT colleagues Pankaj Arora and Chris Sinco. The two architected, tested and helped scale out a solution. Because the lab was behind one firewall and Microsoft another, Arora even set up a server under the bed in his home to create and test the Azure deployment.

“What’s interesting about this is that it’s historically not the traditional Azure scenario. More traditional scenarios are Web startups, hosting content, websites and business applications. This really shows the versatility of Azure as a platform,” Arora said.

Scientists often need heavy computing power for short bursts while they run an experiment, but that kind of massive computational power is not always easy to come by – even temporarily. Often, they must either pay for expensive infrastructure, or depend on unreliable, slow volunteer computing power donated from Internet-connected computers around the world.

Biologists, especially those studying protein folding, have started to embrace Internet-based public volunteer computing, in which people from around the world download software and allow researchers to tap into their computing resources to help power experiments. It’s a model similar to the one used in exploring another frontier, the Search for Extraterrestrial Intelligence (SETI) project.

“Normally, this work would be shared by thousands of private machines owned by people who had donated computing time,” Eiben said. “Among these thousands, someone in Helsinki might offer time, and someone in Sao Paulo, but with Windows Azure Nikolas can get his results much faster and more reliably – you know, the person in Helsinki may shut his or her machine off and go on vacation for three weeks.”

Sgourakis’s “benchmark” Windows Azure experiment used 2.5 million calculations to check the algorithms and process that he will use. He said the experiment – which used the equivalent of 2,000 computers running for just under a week – was a success. Everything checked out. The second test, the one to compute properties of the salmonella “needle,” runs this week and will take a similar number of computations.



Microsoft IT employee Pankaj Arora helped the Baker Laboratory go to the cloud, even running a server under his bed in his apartment during testing.

“We really need powerful computers to perform these computations. This is sort of pushing the limit of what can be done nowadays with this amount of experimental data,” Sgourakis said. “For us it’s highly needed, and to the community, solving a structure that has been pursued by many laboratories for many years is going to be very important. It’s going to be a major breakthrough.”

And will Sgourakis and his fellow researchers find the key to unlocking how proteins look and act in detail?

“I’m confident we will,” Sgourakis said. “I think if you ask the right questions, you get the right answers. I feel this is the right way to do things.”

Eiben said the Microsoft-Baker Lab partnership is a good example of providing an important tool to a “world-class researcher with a world-class problem.”

“This field requires massive amounts of computation power. Despite the fact that it already has a lot, it needs more,” Eiben said. “This should accelerate knowledge being generated on how proteins are shaped. It means putting the jigsaw puzzle together, and with more and more things in place you can go faster and faster.”

Sgourakis said Windows Azure has been “invaluable” to his research.

“This interaction with Microsoft has shown me the power of the cloud. This whole idea behind computing on demand could be very useful for scientists like myself who don’t have the money for on-demand computing time, but who need to get answers right away,” he said. “It’s a most powerful tool that we need to perform our research, and having a resource like Azure available to do groundbreaking work right away is very encouraging to me.”