In this section, some basic notions that dictate various characteristics of architectures are examined. The section opens with a list of key metrics and some brief explanations, and closes with a statement of three important laws.
The key metrics used to characterize high-performance computers are
Basic Laws
Assume that a processor is fed operands at a bandwidth of $b$
operations per cycle, each of which requires an elapsed time of $l$
cycles. That is, each cycle the processor accepts $b$ sets of input
operands and produces $b$ outputs, and the time elapsed between a
specific input and its corresponding output is $l$ cycles. Then the
internal concurrency of the processor is $c = b \times l$. In other
words, at any given cycle, there are $b \times l$ distinct computations
proceeding concurrently within the processor. This formula specifies
how much concurrency must exist within a processor for a specified
bandwidth and latency.
The formula can also be applied to the latency and bandwidth of a
memory system. To sustain operations at the maximum rate from a memory
system whose bandwidth is $b$ and whose latency is $l$ requires
$b \times l$ concurrent access streams from the memory system to the
processors, and thus the memory must be able to support $b \times l$
concurrent operations internally.
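
The bandwidth-latency product described above can be checked with a short sketch. The function names, the queue-based model, and the example figures (4 operations per cycle, 12-cycle latency) are illustrative assumptions, not taken from the text; the model assumes a fully pipelined unit with an integer latency of at least one cycle.

```python
from collections import deque


def required_concurrency(bandwidth, latency):
    """Operations in flight = bandwidth (ops/cycle) x latency (cycles)."""
    if bandwidth < 0 or latency < 0:
        raise ValueError("bandwidth and latency must be non-negative")
    return bandwidth * latency


def simulate_in_flight(bandwidth, latency, cycles):
    """Count in-flight operations at the end of each cycle.

    Each cycle the unit retires every operation whose latency has
    elapsed, then accepts `bandwidth` new operations.
    """
    pipeline = deque()  # completion cycle of each in-flight operation
    occupancy = []
    for cycle in range(cycles):
        # Retire operations that complete at or before this cycle.
        while pipeline and pipeline[0] <= cycle:
            pipeline.popleft()
        # Accept this cycle's new operations.
        for _ in range(bandwidth):
            pipeline.append(cycle + latency)
        occupancy.append(len(pipeline))
    return occupancy


# Hypothetical figures: 4 operations per cycle, 12-cycle latency.
print(required_concurrency(4, 12))        # 48
print(simulate_in_flight(4, 12, 30)[-1])  # 48 once the pipeline is full
```

The simulated occupancy ramps up for the first 12 cycles and then holds steady at the product of bandwidth and latency. The same arithmetic applies to a memory system if the bandwidth is read as accesses per cycle.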
Corollary: If operation latency is on the order of 1 ns
($10^{-9}$ s), the concurrency required to achieve a
processing bandwidth of 1 PetaFLOPS ($10^{15}$ floating-point
operations per second) must be on the order of $10^{6}$.
To reduce concurrency below 10,000, the operation latency must
be less than 10 ps ($10^{-11}$ s), and to reduce concurrency
below 10, the operation latency must be less than 10 femtoseconds
($10^{-14}$ s).
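
The corollary's figures follow from the same product, with bandwidth in operations per second and latency in seconds. The sketch below reproduces them with exact rational arithmetic to avoid floating-point rounding; the function and constant names are illustrative assumptions.

```python
from fractions import Fraction

PETAFLOPS = Fraction(10) ** 15  # target bandwidth, operations per second


def concurrency(bandwidth_ops_per_s, latency_s):
    """Operations in flight for a given bandwidth and latency."""
    return bandwidth_ops_per_s * latency_s


def max_latency(bandwidth_ops_per_s, concurrency_limit):
    """Latency in seconds at which concurrency equals the given limit."""
    return Fraction(concurrency_limit) / bandwidth_ops_per_s


NANOSECOND = Fraction(1, 10**9)

print(concurrency(PETAFLOPS, NANOSECOND))                        # 1000000
print(max_latency(PETAFLOPS, 10_000) == Fraction(1, 10**11))     # True (10 ps)
print(max_latency(PETAFLOPS, 10) == Fraction(1, 10**14))         # True (10 fs)
```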
In a machine not limited by the speed of light, that is, in a machine
whose diameter is a small number of cycles close to unity, or in a
machine with a very small bisection bandwidth, the dependence latency
can grow at a rate proportional to