From today's machines to the PetaFLOPS computer, there is a
factor of in speed. How will this produce value in problems
of major importance to society?
Most important problems are already solved at some level, but most solutions are insufficient and need improvement in various respects:
For PDE-based problems, the computational effort scales inversely as the
fourth power of the grid size, $\Delta x^{-4}$, and often, especially for
implicit problems, higher powers can occur.
For field-scale oil reservoir simulation, grids with spacing on the order of 100 meters might be common. Geological variation occurs on all length scales, down to the pore size of the rock, about a micron. Fortunately, not all of this variation needs to be simulated. The interwell separation is perhaps 400 meters, and flow between wells is the important variable to be predicted. Variation on the scale of 10 to 20 meters is not well represented by averaging methods and is better computed directly, so there is value in refining grids by a factor of 5 to 10.
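To make the cost of such a refinement concrete, the short sketch below applies the fourth-power scaling quoted above to refinement factors of 5 and 10. The function name `work_multiplier` and the default exponent of 4 are illustrative choices, not part of the original argument, and implicit problems may need a larger exponent.

```python
def work_multiplier(refinement: float, exponent: int = 4) -> float:
    """Growth in computational effort when the grid spacing shrinks by `refinement`.

    Assumes effort scales as (grid spacing) ** -exponent; exponent=4 is the
    baseline case, and harder (e.g. implicit) problems may need a larger value.
    """
    return refinement ** exponent


if __name__ == "__main__":
    for r in (5, 10):
        print(f"refine by {r}: effort grows by a factor of {work_multiplier(r):,.0f}")
```

Under this assumption, refining by a factor of 5 costs 625 times more work and refining by a factor of 10 costs 10,000 times more.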
On the basis of these considerations, we propose the following simulation, for which the PetaFLOPS machine would be necessary:
At these length scales, geological data is not known, except in a statistical sense, and so statistical ensemble averages will provide average performance as well as a measure of variability associated with these averages and the possibility or probability of outlier solutions, such as early breakthroughs.
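The kind of ensemble statistics described here can be sketched as follows. The "simulation" is a deliberately trivial, hypothetical stand-in (`toy_breakthrough_time` draws a log-normal permeability multiplier and returns a time inversely proportional to it); a real study would replace it with a full reservoir simulation per geological realization. All names, distributions, and thresholds are assumptions made for illustration.

```python
import random
import statistics


def toy_breakthrough_time(rng: random.Random) -> float:
    """Hypothetical stand-in for one reservoir simulation.

    Draws a log-normal permeability multiplier and returns a breakthrough
    time (arbitrary units) inversely proportional to it.
    """
    permeability = rng.lognormvariate(0.0, 0.5)
    return 100.0 / permeability


def ensemble_summary(n_realizations: int, early_threshold: float, seed: int = 0):
    """Return (mean, standard deviation, probability of early breakthrough)."""
    rng = random.Random(seed)  # fixed seed so the ensemble is reproducible
    times = [toy_breakthrough_time(rng) for _ in range(n_realizations)]
    mean = statistics.fmean(times)
    spread = statistics.stdev(times)
    # Fraction of realizations that break through before the threshold.
    p_early = sum(t < early_threshold for t in times) / n_realizations
    return mean, spread, p_early


if __name__ == "__main__":
    mean, spread, p_early = ensemble_summary(2000, early_threshold=50.0)
    print(f"mean={mean:.1f}  stdev={spread:.1f}  P(early)={p_early:.3f}")
```

The mean gives the average performance, the standard deviation measures the variability around it, and the tail fraction estimates how likely an outlier such as an early breakthrough is.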
Similar issues apply to ground water remediation sites. Here the sites and well spacings are typically smaller, but the same arguments about scaling the grid to the well spacing apply. Commonly, narrow conduction bands or isolated time events, such as runoff during storms, dominate the total migration of contaminants, so accurate resolution in space and time is needed for reliable predictive capability.
Complicated chemistry, including the binding of contaminants to absorption sites or the trapping of contaminant-bearing water in semi-isolated micropores, gives rise to the disturbing phenomenon of sites that appear to be remediated by a pump-and-treat method, only to have the contaminant re-emerge when treatment is terminated. For this reason, physical processes and system variables often need a more accurate description, as well as finer grid resolution.
What are the architectural issues which result from this problem?
For memory, we see that memory size is determined almost entirely by the application, and is nearly independent of system architecture. The PetaFLOPS machine is mainly justified to solve large problems, rather than to solve problems of a fixed size more rapidly. For the easiest problems, we have
but for many cases, and especially the more computationally difficult ones, the exponent will be smaller, because:
These scaling laws should be developed with known proportionality coefficients coming from today's machines, which appear to be well balanced for a broad mix of problems.
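As an illustration of how such a scaling law might be calibrated, the sketch below extrapolates memory size from a reference machine to a faster one using a power law. The exponent of 3/4 is an assumption: it follows if memory is proportional to the number of grid points in three dimensions while effort grows as the fourth power of the inverse grid size. The reference figures (1 TeraFLOPS, 1 terabyte) are hypothetical placeholders, not measurements.

```python
def scaled_memory(ref_flops: float, ref_bytes: float,
                  target_flops: float, exponent: float = 0.75) -> float:
    """Extrapolate memory size with a power law, memory ~ speed ** exponent.

    The proportionality coefficient is fixed by a reference machine assumed
    to be well balanced. exponent=0.75 is an assumed value for the easiest
    problems; harder problems would use a smaller exponent.
    """
    return ref_bytes * (target_flops / ref_flops) ** exponent


if __name__ == "__main__":
    # Hypothetical reference machine: 1 TeraFLOPS with 1 terabyte of memory.
    for exponent in (0.75, 0.6, 0.5):
        needed = scaled_memory(1e12, 1e12, 1e15, exponent)
        print(f"exponent {exponent}: about {needed:.2e} bytes at 1 PetaFLOPS")
```

Lowering the exponent shows how much less memory the computationally harder cases would call for at the same speed.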
There is a similar scaling law for communication latencies in memory
hierarchies. Communication of $n$ bytes takes $L + n/B$ units of time,
where $L$ and $B$ are measured dimensionlessly in units of floating
point operations. Here $L$ is latency and $B$ is bandwidth.
The number of floating point operations which can be usefully
performed between communication steps is proportional to the local
memory size. Consider a two-level hierarchy, with bytes stored at
locations. After
floating point operations, there will be
a need to communicate
domain decomposition boundary information
messages of size
in the most favorable case, and
messages of size
in the worst case. For the more common
favorable case, the communication cost is
and the computational cost is
so we need
or
and a
Constants in these relations can be determined from current balanced machines. The same arguments can be extended to multilevel memory in a hierarchy.
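A minimal sketch of this kind of balance check follows, under assumptions that are stated here rather than taken from the text: each location holds `local_bytes` of a three-dimensional subdomain, does work proportional to that size between exchanges, and then sends a fixed number of boundary messages whose size grows as the two-thirds power of the local memory (a surface-to-volume argument). Each message costs latency plus bytes divided by bandwidth, with both measured in units of floating point operations. The parameter values in the example are hypothetical.

```python
def comm_cost(n_bytes: float, latency: float, bandwidth: float) -> float:
    """Time to send n_bytes, in units of floating point operations."""
    return latency + n_bytes / bandwidth


def favorable_ratio(local_bytes: float, latency: float, bandwidth: float,
                    n_messages: int = 6, flops_per_byte: float = 1.0) -> float:
    """Communication time divided by computation time per exchange step.

    Assumed model: boundary messages of size local_bytes ** (2/3) and
    computation proportional to local_bytes. A ratio well below 1 means
    computation dominates and the machine is balanced for this problem.
    """
    message_bytes = local_bytes ** (2.0 / 3.0)
    communication = n_messages * comm_cost(message_bytes, latency, bandwidth)
    computation = flops_per_byte * local_bytes
    return communication / computation


if __name__ == "__main__":
    # Hypothetical machine: latency of 1e4 flops, 0.1 bytes per flop.
    for local_bytes in (1e6, 1e9, 1e12):
        ratio = favorable_ratio(local_bytes, latency=1e4, bandwidth=0.1)
        print(f"local memory {local_bytes:.0e} bytes: comm/compute = {ratio:.4f}")
```

In this model the ratio falls as local memory grows, which is why larger local memory tolerates longer latency and lower bandwidth.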
In the unfavorable cases, the problem has not been parallelized successfully at a conceptual level, and the machine will fail for these problems in the absence of further algorithmic work. For example, dense matrices may achieve a more nearly sparse character if a better basis is used, such as a multipole or wavelet basis. Such algorithmic methods may move a problem from an unfavorable to a favorable case.
For porous media problems, the pressure is solved implicitly. The condition number of this problem deteriorates as the mesh is refined. Thus, hierarchical preconditioners (approximate solution methods) are needed to put these problems in the favorable scaling cases. Clearly, research issues of a numerical-analysis nature will arise.
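The growth of the condition number under mesh refinement can be seen on a standard model problem. The sketch below uses the one-dimensional discrete Laplacian with Dirichlet boundaries as a stand-in for the pressure equation (an assumption made for illustration; the real operator has variable coefficients and three dimensions). Its eigenvalues are known in closed form, so the condition number with `n` interior points is cot^2(pi / (2(n + 1))), which grows roughly as the square of the number of grid points.

```python
import math


def laplacian_condition_number(n: int) -> float:
    """Condition number of the 1-D Dirichlet Laplacian with n interior points.

    The eigenvalues are 4 * sin(k * pi / (2 * (n + 1))) ** 2 for k = 1..n
    (up to a common 1/h^2 factor), so the ratio of largest to smallest
    is cot(pi / (2 * (n + 1))) ** 2.
    """
    theta = math.pi / (2 * (n + 1))
    return (math.cos(theta) / math.sin(theta)) ** 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"n = {n:5d}: condition number about {laplacian_condition_number(n):.3g}")
```

Each doubling of resolution roughly quadruples the condition number in this model, which is the behavior that hierarchical preconditioners are meant to counteract.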