>What is worse than academic groups getting scooped by DeepMind? The fact that the collective powers of Novartis, Merck, Pfizer, etc., with their hundreds of thousands (~million?) of employees, let an industrial lab that is a complete outsider to the field, with virtually no prior molecular sciences experience, come in and thoroughly beat them on a problem that is, quite frankly, of far greater importance to pharmaceuticals than it is to Alphabet. It is an indictment of the laughable “basic research” groups of these companies, which pay lip service to fundamental science but focus myopically on target-driven research, that they managed to embarrass themselves so badly in this episode.
I wonder, is this because these methods are simply 'not good enough' to really have an application for medicine yet? I know nothing of the pharmaceutical sector, but saying they don't do basic research seems to stretch my world view given their vast profit baselines and government funding for exactly that purpose. Is there someone in the field who knows more?
For the general question of pharma investing in structure prediction, I think participants in CASP overestimate the importance of structure. It is nice to have and there certainly are structure-driven projects, but docking is so poor that often computational models of how a molecule binds, even when you have a structure of a protein, are unreliable and there are plenty of case studies of them sending teams in the wrong direction. This would only be worse in the case of AlphaFold since, as the post shows, GDT_HA is still quite poor.
From my experience in research, pharma has found that cellular models and phenotypic assays are far more meaningful for pushing projects forward. So there is far more interest in applying machine learning to that data than in building protein structures. And those same methods can be applied to target-based projects regardless of whether you have a structure, and regardless of how flexible your protein is. Huge portions of structure-based modeling have no ability to deal with protein flexibility, even if you know there are open and closed conformations of the protein, or a loop that adopts half a dozen configurations.
Basically, academics working on folding often believe far too much in the importance of structure in drug discovery. The author appears to fall into that category.
The above graph is misleading in one way though because it is dependent on a specific metric, GDT_TS, which only measures gross topology. If we care about high resolution topology, which we certainly do for most practical applications, then a more appropriate metric is GDT_HA, and using it the picture looks a bit different:
[graph]
Still a good trendline, but much further down from a “solution”.
Another caveat is that both of these metrics measure global goodness of fit, which is important in terms of the basic scientific problem, but is often not indicative of functional utility. Local accuracy, for example the coordination of atoms in an active site or the localized change of conformation due to a mutation, is what is often sought when answering broader biological questions. Global metrics hide local discrepancy by diluting it in the sea of generally good agreement between experimental and predicted structures.
and
Even for MSA / family-level predictions, there is the question of desired accuracy, which hinges on the biological application. If one is predicting protein structures to ascertain their general fold for function classification, then high accuracy is unnecessary. If on the other hand the objective is to design small molecule drugs that bind proteins, which require ~1Å accuracy in the local pocket, it is unclear if we have made any detectable progress.
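For readers unfamiliar with the GDT metrics being debated above, a minimal sketch may help: GDT_TS averages the fraction of Cα atoms within 1, 2, 4, and 8 Å of their experimental positions, while GDT_HA halves those cutoffs to 0.5, 1, 2, and 4 Å, which is why it is the stricter, "high accuracy" measure. This simplified version assumes a single fixed superposition (real GDT searches over many superpositions to maximize the score):

```python
import numpy as np

def gdt(pred, ref, cutoffs):
    """Average fraction of residues within each distance cutoff (as a percentage).

    pred, ref: (N, 3) arrays of corresponding C-alpha coordinates,
    assumed already optimally superposed (the real metric optimizes this).
    """
    d = np.linalg.norm(pred - ref, axis=1)  # per-residue deviation in Angstroms
    return 100.0 * np.mean([(d <= c).mean() for c in cutoffs])

def gdt_ts(pred, ref):
    # Total Score: generous cutoffs, rewards getting the gross topology right.
    return gdt(pred, ref, (1.0, 2.0, 4.0, 8.0))

def gdt_ha(pred, ref):
    # High Accuracy: halved cutoffs, so small local errors cost much more.
    return gdt(pred, ref, (0.5, 1.0, 2.0, 4.0))
```

With per-residue deviations of, say, 0.4, 1.5, 3.0, and 9.0 Å, the same model scores 56.25 on GDT_TS but only 43.75 on GDT_HA, which is exactly the gap between the two graphs discussed above.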
The author does make a point of discussing what business a team like DeepMind has researching the folding problem. The solution is of no apparent value to the parent company Alphabet, and yet they were still funded. Perhaps this has to do with the attitudes or values of "modern" tech companies? Historically, there seems to have been a cyclical nature to the volume of basic research in industry, peaking with Bell Labs, sinking with the rise of Welch, and now coming back with the Googs and Facebooks.
Well, one could put it cynically: DeepMind has developed a giant "cannon" that can be fired at different very hard but reasonably well-defined problems. They've hit Go, chess, and similar things. Hitting proteins keeps their activity in the limelight. Essentially, DeepMind is demonstrating the value of the money Google paid to purchase it.
You seem to be implying that the world would be better if they didn't do this research, that doing it as a tech company to just apply the tech is immoral. Is that what you really mean?
So, have you refuted the argument in any way, or just downplayed it?
I would have preferred that the text in the comment you responded to had less of a trash-talking feel to it.
Also it’s a bit overplayed because although these are special moments, there’s no shortage of examples in history where a new approach in one field is applied to get results previously unattainable in another.
However, there's no need for rationalizations. The result doesn't generalize to "anyone is smarter than anyone else," nor does it become unimportant just because it's not the single most important drug discovery tool.
It’s just a special moment where a technique is impacting another area of research. Which is part of a special period in history where AI/ML is evolving from being literally a joke that could defund your research, to finally having a broad set of useful applications.
We should be glad to see as many of these moments as possible from any discipline to any other.
> academics working on folding often believe far too much in the importance of structure in drug discovery.
It reminds me of the 90's and early 2000's when many were convinced that sequencing the human genome would suddenly make drug discovery a simple problem.
I worked a few years in pharma R&D (Roche, largest R&D budget in this industry).
In a pharma setting, the 3D structure of a protein is mostly used to perform drug design (https://en.wikipedia.org/wiki/Drug_design#Computer-aided_dru...), i.e. trying to understand how a chemical will physically interact with a protein and thereby modify its physiological function, in order to treat a disease.
The biggest problem comes from the fact that proteins (1) are non-static and very flexible and (2) don't exist in a vacuum; they interact with a myriad of other entities in a living system. In other words, knowing the structure of a protein and how to theoretically perturb it with a small molecule does not mean you have a drug. The large majority of compounds predicted to be active against a protein target turn out not to be when tested in a biological assay. The process helps, but ultimately it's a very empirical endeavor (test a ton of different chemicals in actual experiments, try to abstract some logic, and move on from there). As a result, simply knowing the structure of a protein will not get you far down the line toward finding a new drug.
On the resource topic: even in a very large pharma setting, you will find only a dozen or so scientists dedicated to the topic (out of tens of thousands of employees), supporting many projects and with very little time to perform their own research. As a result, any team fully dedicated to the problem (like AlphaFold) can easily outcompete pharma. Most of the cost in drug discovery comes from dealing with patients and clinical trials. It's only at this stage that you'll know how your drug really works, and how it fits into the existing market and society (think of neuroscience, for instance).
I don't want to undermine the protein structure field or the AlphaFold results (they're fascinating), but pharma's business model de facto relies very little on knowing protein structures. Structure is also mostly useful for designing small molecules, a class that is a bit out of fashion (biologics were the top sellers in 2018, and new modalities are coming up, like RNAs and gene editing).
This is true, and big pharma will continue to invest very heavily here, since these medications contain living organisms and are much harder/more expensive to make into "generic" versions.
Biologics do not contain living organisms. For example monoclonal antibodies are considered biologics and are not alive, they can also easily be made “generic”.
I don't know why the author is so critical of his peers. DeepMind didn't come up with a novel biological insight; they simply pointed their unparalleled AI resources towards a deep learning problem. Is it really surprising that a team of world class deep learning scientists with virtually infinite resources managed to outperform pharmaceutical companies at a deep learning problem? I don't think so.
> DeepMind didn't come up with a novel biological insight; they simply pointed their unparalleled AI resources towards a deep learning problem.
What this essentially comes down to is that a bunch of teams with high domain expertise and high technical capability were beaten by a team with higher comparative technical capability and lower comparative domain expertise. One has to wonder if one of the teams with higher domain expertise (i.e., those who work in the field every day) could achieve better results by improving their technical capabilities, or by more aggressively applying their domain knowledge.
The question really comes down to whether DeepMind can beat any hard "game" (chess, go, protein folding etc) better than humans with deep domain expertise.
> What this essentially comes down to is that a bunch of teams with high domain expertise and high technical capability were beaten by a team with higher comparative technical capability and lower comparative domain expertise.
The fact that the experts are entrenched in departmental trench wars and crippled by bureaucratic crap at every step is probably a contributing factor. If they were free to advance actual research instead of contributing to some middle manager's manager's agenda, things might look different.
That's what people outside drug development think. Meanwhile, about 70% of all drugs fail in Phase 2 testing. That's right: experienced pharmacologists at established companies go forward with compounds they believe will succeed, and still, in roughly 2 out of 3 cases, they are no better than placebo.
I'd hasten to add it's the same in software. We think the experts are right on the edge of performance, accuracy, and success while using all the right tools - meanwhile from the inside, it's a shambles in all but the highest stakes environments with the most qualified teams.
While big pharma does do internal R&D, de facto, new drugs and drug strategies largely come from letting the government fund researchers (typically professors at large research institutes), encouraging these professors to do a spin-off (I deliberately don't want to use "startup") to absorb the scientific risk, followed by an acquisition for the IP.
I think the key to the success was translating folding into a deep learning problem and solving it (maybe there is still something to be gained by optimizing the models). Academia and industry were not treating folding as a deep learning problem; they were using 'old' methods with slight improvements from year to year. The use of deep learning was a breakthrough and will speed up research in this field.
Eh... no? If you look at abstracts, Zhang group (the second place) also used "deep-learning based contact predictor". Really, the usage of deep learning was kind of standard long before CASP13.
If you read the OP, AlphaFold's main innovation is predicting distance instead of contact (regression instead of binary classification), which was also independently developed by Xu, and, further, using a probability distribution over distances instead of simply choosing the best distance.
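The distinction the parent draws can be sketched concretely: a contact predictor emits a single probability that two residues are within the 8 Å contact cutoff, while a distogram keeps a full probability distribution over distance bins, from which a contact probability (and more) can be recovered. A minimal sketch for one residue pair; the bin layout and probabilities here are made up for illustration, not AlphaFold's actual discretization:

```python
import numpy as np

# Hypothetical distogram for one residue pair: probabilities over
# distance bins. Bin edges in Angstroms; layout is illustrative only.
bin_edges = np.arange(2.0, 22.0, 2.0)  # bins starting at 2, 4, ..., 20 A
probs = np.array([0.02, 0.10, 0.30, 0.25, 0.13,
                  0.08, 0.05, 0.04, 0.02, 0.01])
assert np.isclose(probs.sum(), 1.0)

# A binary contact predictor collapses all of this to one number:
# the probability that the distance falls below the 8 A contact cutoff
# (i.e., the mass in the bins starting at 2, 4, and 6 A).
contact_prob = probs[bin_edges < 8.0].sum()

# Keeping the distribution preserves information a contact map throws away:
# for example, an expected distance (via bin midpoints) that can drive
# a smooth potential during structure realization.
midpoints = bin_edges + 1.0
expected_distance = (probs * midpoints).sum()
```

The point is that thresholding to a contact map discards the shape of the distribution, whereas the distogram lets downstream optimization use how confidently and how tightly the distance is pinned down.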
Yes, you wonder how AlphaFold compares with the combined wisdom of Foldit. You start out with a secondary structure prediction from the Zhang lab and ask a panel of experienced humans to fold the protein into a plausible tertiary structure. As a human you have a good idea of what larger pieces fit where, but AI still has problems with the big picture; cf. the leopard-sofa problem.
The Go community was rather forthcoming to what AlphaGo brought, because it provided new insights to a field that some of them study quasi-religiously.
Let's see how pharmaceutical companies are dealing with that, or the next fields DeepMind will enter and shake up like that.
The crucial observation is how their value chain works. Most of the value is in restricting the distribution of treatments. That, in turn, is achieved by getting medicines approved in the US.
There’s a lot of value near the end of the development cycle, and not near the start.
(I conclude from this that there remains a role for government in directly funding scientific research.)
You're right that the methods are simply not good enough. Further, a large fraction of pharma research has shifted to biologics, which don't depend on structure prediction for new compounds; companies only need the high-resolution structure of their target protein (which more often than not is already solved) to go on and generate the antibodies and make sure they target the desired motif.
I think you have a point. OP mentions "target-driven research". If you have a fixed target, you can do crystallography and get the structure directly, you don't need structure prediction. That is, I think pharma's core interest is closer to particular structures, not a general method to predict structures.
Elucidating a given protein's structure is still a fraught process. Maybe the technology has improved, but when I was in that field 5 years ago, the going rate was "one PhD" per structure, i.e., 2 or 3 years of tinkering, with no guarantee of success.
By referring to structures per PhD, it is clear you are influenced by academia, where that is more true. But academia is more concerned with solving novel structures than industry.
In industry, often we have structures of related proteins, giving a couple advantages:
Template-based modeling is feasible, a category that AlphaFold didn't win because they didn't use templates; using templates gives better structure prediction than their free modeling can achieve.
Crystallography conditions are often similar between related proteins and techniques such as molecular replacement make it so we can solve the phasing problem easily, which is often a roadblock as well in academia.