Developing a new drug currently costs several hundred million dollars, and good candidates are scarce. For example, screening tens to hundreds of thousands of natural products and other chemicals yields fewer than one new anticancer agent per year. Computation is anticipated to be a powerful aid in designing and prescreening new candidates. This approach requires fundamental structural data on potential targets and on the agents that may affect them. High-performance computer models are essential for analyzing experimental structural data to produce detailed molecular models, for molecular mechanics and dynamics exploration of target structures, for determining the chemical structure of small drugs, and for modeling the interaction between drugs and targets. All of these methods will require PetaFLOPS capability to achieve reliable prediction, design, and testing.
Structure determination requires computation both to solve the phase problem of x-ray crystallography and to perform distance geometry calculations for magnetic resonance. Computational chemistry is needed to understand the structure of drugs, how they bind to targets at the atomic level, and the details of their electronic structure.
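One step commonly associated with distance geometry is tightening a set of upper distance bounds with the triangle inequality before coordinates are generated. The following is a minimal sketch of that step only, not a full structure-determination code. The function name and the restraint values are illustrative assumptions and do not come from any particular program.

```python
INF = float("inf")


def smooth_upper_bounds(upper):
    """Tighten upper distance bounds using the triangle inequality.

    upper[i][j] is an upper bound on the distance between atoms i and j,
    with INF where no restraint is available. A new matrix is returned;
    the input is left unchanged. The cost grows as the cube of the
    number of atoms.
    """
    n = len(upper)
    u = [row[:] for row in upper]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # d(i, j) can never exceed d(i, k) + d(k, j)
                via_k = u[i][k] + u[k][j]
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


# Hypothetical restraints (angstroms) for four atoms in a chain:
# only neighbouring atoms have a measured upper bound.
bounds = [
    [0.0, 1.5, INF, INF],
    [1.5, 0.0, 1.5, INF],
    [INF, 1.5, 0.0, 1.5],
    [INF, INF, 1.5, 0.0],
]
smoothed = smooth_upper_bounds(bounds)
```

After smoothing, pairs with no direct restraint inherit a finite bound from the chain of restraints that connects them, which narrows the space a later embedding step would have to search.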
As an example, consider the protease of HIV and its inhibitors. This
enzyme is required for the production of mature, infectious virus and is
thus a candidate for drug targeting. It is being studied
extensively. At this stage, computing speeds are on the order of
fold slower than desired. Typically,
s of real time requires 100
CRAY Y-MP hours. We want to cover
of chemical time.
PetaFLOPS performance would permit electronic calculations for the entire protease (at moderate levels of theory) and very detailed calculations for drug-sized molecules.
The calculations currently scale as for number of atoms in
molecular mechanics calculations, with various heuristics used to reduce the cost
below this upper bound. Electronic structure requirements vary
depending on the level of theory, but range from
upward. Memory
requirements scale as
for molecular mechanics, with additional
offline storage for molecular dynamics trajectories. Electronic
structure codes vary in memory requirement depending on whether storage
and re-use of intermediate values or recomputation is chosen.
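To illustrate how a heuristic can bring the work in a molecular mechanics calculation below an all-pairs upper bound, the sketch below counts interacting atom pairs in two ways. The text does not name a specific heuristic; a distance cutoff combined with a cell grid is assumed here as one common choice, and all function names, the box size, and the cutoff are hypothetical.

```python
import random


def count_pairs_all(coords):
    """Number of distinct atom pairs when every pair is evaluated."""
    n = len(coords)
    return n * (n - 1) // 2


def count_pairs_brute(coords, cutoff):
    """Pairs within the cutoff, found by testing every pair."""
    c2 = cutoff * cutoff
    count = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 <= c2:
                count += 1
    return count


def count_pairs_cutoff(coords, cutoff):
    """Pairs within the cutoff, found by testing only neighbouring cells."""
    cells = {}
    for idx, (x, y, z) in enumerate(coords):
        key = (int(x // cutoff), int(y // cutoff), int(z // cutoff))
        cells.setdefault(key, []).append(idx)
    c2 = cutoff * cutoff
    offsets = [(dx, dy, dz) for dx in (-1, 0, 1)
               for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
    count = 0
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            other = cells.get((cx + dx, cy + dy, cz + dz))
            if other is None:
                continue
            for i in members:
                for j in other:
                    if i < j:  # count each pair exactly once
                        d2 = sum((a - b) ** 2
                                 for a, b in zip(coords[i], coords[j]))
                        if d2 <= c2:
                            count += 1
    return count


# Hypothetical system: 200 atoms placed at random in a 10 x 10 x 10 box.
rng = random.Random(0)
atoms = [(rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(0, 10))
         for _ in range(200)]
total_pairs = count_pairs_all(atoms)
near_pairs = count_pairs_cutoff(atoms, 1.0)
```

The cell grid restricts distance tests to atoms in adjacent cells, so for a system of roughly uniform density the number of tests grows far more slowly than the all-pairs count. The brute-force function is included only as a check that the grid finds the same pairs.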
Match of the problem to the three categories of PetaFLOPS computers:
Class I: These codes have traditionally been heavy users of Class I machines. Levels of vectorization and multiprocessing are not high. Memory use is significantly less than 1 MW/MF for molecular mechanics and can be less or more than 1 MW/MF for electronic calculations.
Class II: Current codes are being ported to predecessor machines of this class. Managing communication requires significant effort.
Class III: A few demonstrations have been done on SIMD machines that may be analogous to Class III architectures.