Crowdsourcing the cloud to find cures for rare and “orphaned” diseases

Volunteer computers—and tablets and phones—could cure diseases big pharma won't.

For over a decade, the University of California, Berkeley has used a virtual supercomputer built from borrowed processing time on over a million "volunteer" PCs across the Internet to process radio signals collected in the search for extraterrestrial intelligence. That supercomputer, called SETI@Home, has inspired a nonprofit to find treatments for diseases that pharmaceutical companies have ignored—diseases such as malaria, sleeping sickness, and Hodgkin's lymphoma—using the same platform to harness the power of the unused compute cycles on your PC (or tablet or smartphone).

The core technology used by SETI@Home, called the Berkeley Open Infrastructure for Network Computing (BOINC), has already been applied to a number of scientific endeavors beyond searching for extraterrestrials, including computational chemistry efforts on the IBM-sponsored World Community Grid to find cures for diseases such as AIDS (with the Scripps Research Institute's FightAIDS@Home project) and malaria. But the new effort, a nonprofit called Quantum Cures, is bringing commercial software originally developed for Microsoft's Windows Azure cloud to the BOINC platform and is driving it with a hybrid of open-source and commercial management software.

The computational technology behind Quantum Cures is a quantum mechanics/molecular mechanics (QM/MM) modeling system first developed by computational chemistry researchers at Duke University. Called Inverse Design, the software—commercially developed by TerraDiscoveries in partnership with Duke University and Microsoft—uses an engineering approach with the same name to search for potential drugs that will interact with proteins related to a disease.

Lawrence Husick, co-founder of Quantum Cures and TerraDiscoveries' founder and CIO, claims the TerraDiscoveries approach is a refinement of well-established "computational docking" modeling systems, which calculate the bonding affinity between the "target" protein (such as a virus or mutated enzyme) associated with a disease and a candidate drug molecule. These systems commonly use electrostatic potential maps to determine how well suited a particular molecule is for latching onto the target protein's active sites.

But Husick said that older electrostatic model-based systems were created with a large number of assumptions and generalizations built in because of the limits of the computing environments they were designed for. While they can perform fast scans of target proteins through large libraries of candidate molecules, docking models aren't always accurate because of the assumptions they make, and the results they generate don't always hold up well in real-world lab testing. The Inverse Design modeling system tries to reduce inaccuracies by adding a scoring function based on ab initio quantum chemistry methods developed by Professor Weitao Yang at Duke University. Inverse Design checks the quantum mechanical model and the molecular model of the "target" protein against millions of possible molecular "candidates," measuring their bonding affinity—the likelihood that the candidate will react with the protein and block its ability to attack the body's cells.

Husick told Ars in an interview that the Inverse Design software was originally developed to run in Microsoft's Windows Azure cloud. "For commercial projects, you simply spool up as many cores you need on the Azure cloud," Husick said. Each instance of the software takes the quantum mechanical molecular model of the target protein and a candidate molecule and calculates the potential bonding energy between the two. "We use a heuristic model to find local maxima in literally millions of molecules that we try in order to come up with a short list of good candidates, which are then further screened for 'druggable' properties. Can this be synthesized? Is this the sort of molecule that can be made into a drug that can be taken orally or by injection? Is it soluble enough? And, most importantly, does it have low toxicity—does it not bind to other unintended proteins in the body?"
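The two-stage process Husick describes, scoring candidates for binding and then screening a short list for "druggable" properties, can be illustrated with a minimal Python sketch. This is not TerraDiscoveries' code: the `Candidate` fields, the `screen` function, and the sample values are all hypothetical, and a plain top-N ranking stands in for the heuristic search for local maxima, whose details the article does not give.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate molecule (not the real data model)."""
    name: str
    binding_score: float   # toy value: higher means stronger predicted binding
    synthesizable: bool    # "Can this be synthesized?"
    soluble: bool          # "Is it soluble enough?"
    off_target_hits: int   # predicted bindings to unintended proteins


def shortlist(candidates: List[Candidate], top_n: int) -> List[Candidate]:
    """Stage 1: keep the top_n candidates by predicted binding score.

    Assumption: simple ranking replaces the heuristic local-maxima search.
    """
    ranked = sorted(candidates, key=lambda c: c.binding_score, reverse=True)
    return ranked[:top_n]


def is_druggable(c: Candidate) -> bool:
    """Stage 2: apply the screening questions quoted in the article."""
    return c.synthesizable and c.soluble and c.off_target_hits == 0


def screen(candidates: List[Candidate], top_n: int) -> List[Candidate]:
    """Rank by binding score, then drop candidates that fail the druggability checks."""
    return [c for c in shortlist(candidates, top_n) if is_druggable(c)]


# Made-up illustrative values, not real screening results.
library = [
    Candidate("A", 9.1, True, True, 0),
    Candidate("B", 8.7, False, True, 0),   # cannot be synthesized
    Candidate("C", 8.2, True, True, 2),    # binds unintended proteins
    Candidate("D", 7.9, True, True, 0),
    Candidate("E", 3.0, True, True, 0),    # too weak to reach the short list
]

survivors = [c.name for c in screen(library, top_n=4)]
print(survivors)
```

Keeping the expensive scoring step separate from the cheap property filters mirrors the order in the quote: only molecules that already look promising are checked for synthesis, solubility, and toxicity.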

For commercial pharmaceutical projects, this sort of supercomputing-on-demand project requires a considerable amount of compute time. "To run one target on the Azure cloud," Husick said, "where we can spin up between 4,000 and 6,000 CPUs with multiple cores—usually 8 cores each—it takes about 3 months of elapsed time to complete. Of course, you're paying per CPU hour there."
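To put those figures in perspective, the short Python calculation below converts them into core-hours. It is a rough sketch under stated assumptions: "about 3 months" is taken as 90 days, every core is assumed busy for the whole period, and no price is applied because the article does not give one.

```python
HOURS_PER_DAY = 24
ASSUMED_DAYS = 90      # assumption: "about 3 months" treated as 90 days
CORES_PER_CPU = 8      # "usually 8 cores each", per Husick


def core_hours(cpus: int, days: int = ASSUMED_DAYS,
               cores_per_cpu: int = CORES_PER_CPU) -> int:
    """Total core-hours if every core runs for the full elapsed time."""
    return cpus * cores_per_cpu * days * HOURS_PER_DAY


low = core_hours(4_000)    # lower end of the quoted CPU range
high = core_hours(6_000)   # upper end of the quoted CPU range
print(f"{low:,} to {high:,} core-hours per target")
```

Under these assumptions a single target consumes roughly 69 million to 104 million core-hours.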

But the pharmaceutical industry isn't interested in paying for compute time—or any sort of research at all—on many rare and not-so-rare diseases, simply because the potential financial payoff for finding a drug is so low. While researchers working for universities, disease advocacy groups, and other nonprofit organizations have found thousands of target proteins for rare and "neglected" or "orphaned" diseases such as malaria, they have not had the resources to use software such as TerraDiscoveries' to look for good drug candidates.

That's where the Quantum Cures effort comes in. TerraDiscoveries is providing its software for free to the organization and is repackaging it for use on individual Windows, Mac OS, and Linux PCs as "screen saver" software. The software, which will be available for download starting in June, installs with user-level permissions and will allow individuals to set how much of their compute time is made available. Husick said that versions for Android and Apple's iOS will follow later in the year, allowing individuals to donate time on their mobile devices.

"What we send to the individual PC as a job is the computational structure of the protein target active site and the structure of a candidate molecule," said Husick. "We generate those candidates in the management system, so there is a never-ending list of potential candidates. And it's the job of that CPU to calculate the binding affinity between those two structures. So it's running the entire quantum mechanical structure simulation and returning an answer that's the energy of the bond between those."
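The shape of such a job can be sketched in Python as a pair of structures going out and a single energy value coming back. Everything here is hypothetical: `WorkUnit`, `Result`, and `run_work_unit` are illustrative names rather than BOINC or TerraDiscoveries APIs, and `toy_energy` is a stand-in that merely counts nearby atom pairs instead of running a quantum mechanical simulation.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple

# One atom as (element symbol, x, y, z); coordinates in arbitrary units.
Atom = Tuple[str, float, float, float]
Structure = Tuple[Atom, ...]


@dataclass(frozen=True)
class WorkUnit:
    """Hypothetical job sent to a volunteer machine."""
    job_id: str
    target_site: Structure   # structure of the protein target's active site
    candidate: Structure     # structure of one candidate molecule


@dataclass(frozen=True)
class Result:
    """Hypothetical answer returned to the management system."""
    job_id: str
    binding_energy: float


def toy_energy(site: Structure, molecule: Structure, cutoff: float = 4.0) -> float:
    """Placeholder scoring: minus the number of atom pairs closer than cutoff.

    Assumption: this only marks where the real QM/MM calculation would run.
    """
    close_pairs = sum(
        1
        for a in site
        for b in molecule
        if math.dist(a[1:], b[1:]) < cutoff
    )
    return -float(close_pairs)


def run_work_unit(unit: WorkUnit,
                  energy_fn: Callable[[Structure, Structure], float]) -> Result:
    """Compute the binding energy for one target/candidate pair."""
    return Result(unit.job_id, energy_fn(unit.target_site, unit.candidate))


unit = WorkUnit(
    job_id="demo-1",
    target_site=(("N", 0.0, 0.0, 0.0),),
    candidate=(("O", 1.0, 0.0, 0.0), ("C", 10.0, 0.0, 0.0)),
)
result = run_work_unit(unit, toy_energy)
print(result)
```

Because each job carries everything it needs and returns one number, the jobs are independent of one another, which is what makes the workload a natural fit for volunteer machines that come and go.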

A typical desktop PC will complete one modeling run in less than 12 hours, he said, depending on how much of a priority the software is given. "It may be you walk away from your computer at 6 at night and don't come back until 9 in the morning, and it may have completed one result," Husick said.
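A back-of-the-envelope Python estimate shows how that per-PC rate scales across a volunteer pool. Only the 12-hour upper bound per run and the 6 pm to 9 am idle window come from the article; the candidate count of one million (a floor for "literally millions") and the pool of 100,000 volunteers are purely hypothetical inputs chosen for illustration.

```python
def days_to_screen(candidates: int, hours_per_run: float,
                   volunteers: int, donated_hours_per_day: float) -> float:
    """Days needed if work is spread evenly and every donated hour is used."""
    total_pc_hours = candidates * hours_per_run
    pc_hours_per_day = volunteers * donated_hours_per_day
    return total_pc_hours / pc_hours_per_day


estimate = days_to_screen(
    candidates=1_000_000,       # assumption: lower bound for "millions"
    hours_per_run=12,           # article: "less than 12 hours" per run
    volunteers=100_000,         # assumption: hypothetical pool size
    donated_hours_per_day=15,   # article's example: 6 pm to 9 am
)
print(f"about {estimate:.0f} days")
```

The result is sensitive to every input, so it is best read as an order-of-magnitude illustration rather than a projection for the project.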

Listing image by Environmental Molecular Sciences Laboratory
