
Alternatives

The Architecture Working Group considered three classes of architecture, each a lineal descendant of one of the most promising approaches being pursued today, and assessed the viability of each. If, after close analysis, none was found to offer a likely path to PetaFLOPS performance, this would expose the need for more avant-garde approaches, perhaps reflecting a new architecture paradigm. The three architecture models considered were:

  1. Coarse Grain: A low-latency, shared-memory computer employing hundreds of heavily pipelined processors, each capable of TeraFLOPS performance.

  2. Medium Grain: A multiprocessor with tens of thousands of workstation-derived microprocessors, each capable of between 10 and 100 GigaFLOPS performance. This system would probably include a common global name space so that any processor could address any part of main memory directly. However, because of the anticipated large network diameter and memory access time, it would require advanced latency management strategies.

  3. Fine Grain: A distributed multiprocessor with CPUs and memory co-resident on the same chip to expose high levels of memory bandwidth. Hundreds of thousands of these Processor-In-Memory (PIM) chips would be required because the performance of each would be between 1 and 10 GigaFLOPS. The cost, however, would be much lower because less memory, by an order of magnitude or more, would be installed than in the other two system types. Undoubtedly, this architecture would have a fragmented address space, and off-chip transactions would be expensive.
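As an order-of-magnitude check on the processor counts quoted in the list above, the following minimal Python sketch divides a PetaFLOPS peak target by the per-processor rates given for each class. The names `CLASSES`, `processors_needed`, and `count_range` are illustrative and do not come from the report, and the calculation assumes peak rates add linearly with no allowance for efficiency losses.

```python
PETAFLOPS = 10**15  # peak target, in FLOPS

# Per-processor peak ranges (low, high) in FLOPS, as quoted in the list above.
CLASSES = {
    "coarse": (10**12, 10**12),         # one TeraFLOPS each
    "medium": (10 * 10**9, 100 * 10**9),  # 10 to 100 GigaFLOPS each
    "fine": (1 * 10**9, 10 * 10**9),      # 1 to 10 GigaFLOPS each
}


def processors_needed(target_flops, per_processor_flops):
    """Smallest processor count whose summed peak reaches the target."""
    return -(-target_flops // per_processor_flops)  # integer ceiling division


def count_range(name, target=PETAFLOPS):
    """Return (fewest, most) processors needed for the named class."""
    low_rate, high_rate = CLASSES[name]
    # The fastest processors give the fewest units; the slowest give the most.
    return (processors_needed(target, high_rate),
            processors_needed(target, low_rate))


if __name__ == "__main__":
    for name in CLASSES:
        fewest, most = count_range(name)
        print(f"{name:>6}: {fewest:,} to {most:,} processors")
```

Under these assumptions the coarse-grain count comes out at about a thousand, at the upper edge of the "hundreds" quoted, while the medium-grain and fine-grain counts span 10,000 to 100,000 and 100,000 to 1,000,000 respectively.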

These three system types place distinct demands on resources and design, and they behave differently; for example, the same application probably would not perform optimally on all three. The Architecture Working Group nevertheless considered a heterogeneous system made up of one of each type, with each providing a large fraction of a PetaFLOPS so that the aggregate peak performance would equal a PetaFLOPS. Such a heterogeneous system is expected to offer a better performance-to-cost ratio than any one of the system types scaled up to a full PetaFLOPS.
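The arithmetic of such a heterogeneous configuration can be sketched as follows. The report does not say how the PetaFLOPS would be divided, so the equal one-third split below is purely an assumption, as are the single per-processor rates in `RATE` (each chosen from within the range quoted for its class) and the function name `heterogeneous_plan`.

```python
from fractions import Fraction

PETAFLOPS = 10**15  # aggregate peak target, in FLOPS

# Hypothetical per-processor peak rates (FLOPS), each picked from within
# the range quoted for its class; these are assumptions, not report figures.
RATE = {"coarse": 10**12, "medium": 10**11, "fine": 10**10}


def heterogeneous_plan(shares, total=PETAFLOPS):
    """Map each subsystem to (peak FLOPS assigned, processors needed)."""
    if sum(shares.values()) != 1:
        raise ValueError("shares must sum to 1")
    plan = {}
    for name, share in shares.items():
        target = share * total            # exact peak share as a Fraction
        count = -(-target // RATE[name])  # ceiling division, yields an int
        plan[name] = (target, count)
    return plan


# Assumed split: each subsystem supplies one third of the peak.
equal_shares = {name: Fraction(1, 3) for name in RATE}
plan = heterogeneous_plan(equal_shares)

if __name__ == "__main__":
    for name, (target, count) in plan.items():
        print(f"{name:>6}: {float(target):.3e} FLOPS, {count:,} processors")
```

Exact fractions are used so that the three shares sum to precisely one PetaFLOPS; any other split that sums to one can be passed in the same way.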





gcf@npac.syr.edu