High-throughput experimentation has been a mainstay of pharmaceutical discovery since the mid-1990s. A 1999 C&E News article (C&EN, vol. 77, pp. 33-48, March 8, 1999) hailed this approach as the next great thing. Unfortunately, we chemists soon realized that quantity is no replacement for quality; a notable WSJ article, “Drug Industry’s Big Push into Technology Falls Short,” was critical of this approach.
At the time, I was working on a DOE-funded project (DE-FC26-02NT41218) on high-throughput catalyst discovery for NOx catalysis in lean diesel engines, together with GM and Engelhard (now BASF). In practice, our method was not to generate thousands of samples and hope for the best, but to screen fewer, carefully selected samples quickly and subject the “winners” to more sophisticated testing.
The approach employed in our NOx project was based on analysis of experimental data, design of experiments, and fitting response surfaces, and it worked. As pointed out in a recent Bio-IT World article, however, experimental data alone are usually too noisy to build reliable statistical models. What’s a researcher to do? Molecular modeling, of course. Hey, I’m a modeler: you knew I was going to suggest that.
The key to success, it seems, is to employ a plurality of methods, both experimental and computational. Given even a modest amount of experimental data, you’ll need a database with decent search and query tools and basic statistical approaches like principal component analysis. But atomistic modeling is also important. Work by a number of research groups has shown that you can generate good predictive models from quantum mechanical (QM) methods for lots of different kinds of materials. (Keep in mind that these examples barely scratch the surface of the available literature.)
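For readers who haven't used principal component analysis, here is a minimal sketch of what it does, written with NumPy's singular value decomposition. The data are synthetic and the function name `pca` is my own; nothing here comes from the projects mentioned above. The idea is simply that several correlated measurements per sample can often be summarized by one or two combined variables.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # rows are principal directions
    scores = Xc @ components.T                   # sample coordinates in PC space
    explained = (S ** 2) / np.sum(S ** 2)        # fraction of variance per component
    return scores, components, explained[:n_components]

# Synthetic stand-in for measured data: three columns that are all
# noisy multiples of a single hidden factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=(50, 1))
X = np.hstack([latent, 2 * latent, -latent]) + rng.normal(scale=0.05, size=(50, 3))

scores, components, explained = pca(X, n_components=2)
print("variance explained by PC1 and PC2:", explained)
```

Because the three columns share one underlying factor, the first component should carry nearly all of the variance. With real experimental data the split is rarely this clean.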
But how do you get to the point that anybody can make use of QM-based results? Doing these calculations typically takes a long time.
QSAR (Quantitative Structure-Activity Relationship) is a terrific way to leverage QM results for complex research topics. The research groups mentioned above all followed the same basic procedure:
- Start with some experimental data
- Generate a statistical model
- Grind through a lot of calculations
- Forward the “winners” for experimental testing
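
The four steps above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not any group's actual workflow: the "experimental" data are synthetic, the three descriptors stand in for properties that would really come from QM calculations, and the statistical model is plain least-squares regression. The helper names (`fit_linear_model`, `predict`, `pick_winners`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_linear_model(X, y):
    """Ordinary least squares with an intercept; returns [intercept, w1, w2, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Predicted activity for each row of descriptors in X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def pick_winners(scores, k):
    """Indices of the k highest predicted activities, best first."""
    return np.argsort(scores)[::-1][:k]

# Step 1: start with some experimental data (synthetic here).
true_w = np.array([2.0, -1.0, 0.5])              # assumed "true" relationship
X_train = rng.normal(size=(30, 3))               # descriptors for 30 tested samples
y_train = 1.0 + X_train @ true_w + rng.normal(scale=0.1, size=30)

# Step 2: generate a statistical model.
coef = fit_linear_model(X_train, y_train)

# Step 3: grind through a lot of calculations. In practice each row would be
# a set of QM-derived descriptors for an untested candidate.
X_candidates = rng.normal(size=(500, 3))
scores = predict(coef, X_candidates)

# Step 4: forward the "winners" for experimental testing.
winners = pick_winners(scores, k=5)
print("candidates to test next:", winners)
```

In a real project the expensive part is step 3, which is why the model from step 2 has to be good enough to make those calculations worth running.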
You can see in the examples above that the approach can actually work. But how do you figure out what QM calculations to perform, and how do you create good statistical models? Well, that’s a story for next month.