By the late 1980s, several highly parallel systems were able to achieve high levels of performance: the Connection Machine Model CM-2, the Intel iPSC/860, the nCUBE-2, and, early in the 1990s, the Intel Touchstone Delta System. The peak speeds of these systems are quite high and, at least for some applications, the speeds achieved are also high, exceeding those attained on vector supercomputers. The fastest CRAY system until 1992 was a CRAY Y-MP with eight processors, a peak speed of , and a maximum observed application speed of . In contrast, the Connection Machine Model CM-2 and the Intel Delta have achieved over for some real applications [Brickner:89b], [Messina:92a], [Mihaly:92a;92b]. Some new Japanese vector supercomputers with a small number of processors (but a large number of instruction pipelines) have peak speeds of over .
Meanwhile, the vector computers continued to become faster and to gain more processors. For example, the CRAY Y-MP C-90, introduced in 1992, has sixteen processors and a peak speed of .
By 1992, parallel computers were substantially faster. As noted above, the Intel Paragon has a peak speed of . The CM-5, an MIMD computer introduced by Thinking Machines Corporation in 1992, has a maximum configuration of 16K processors, each with a peak speed of . The largest system at this writing is a 1024-node configuration in use at Los Alamos National Laboratory.
New introductions continue: in Fall 1992, Fujitsu (Japan) and Meiko (U.K.) introduced distributed-memory parallel machines with a high-performance node featuring a vector unit, each using a different VLSI implementation of the node of Fujitsu's high-end vector supercomputer. In 1993, Cray and Convex introduced major systems built around Digital and HP RISC microprocessor nodes, respectively.
Recently, an interesting new architecture has emerged: a distributed-memory design supported by special hardware that presents the appearance of shared memory to the user. The goal is to combine the cost-effectiveness of distributed memory with the programmability of a shared-memory architecture. There are two major university projects: DASH at Stanford [Hennessy:93a], [Lenoski:89a] and Alewife [Agarwal:91a] at MIT. The first commercial machine, the Kendall Square KSR-1, was delivered to customers in Fall 1991. A high-performance ring supports the apparent shared memory, which is essentially a distributed dynamic cache. The ring can be scaled up to 32 nodes, and rings can be joined hierarchically into a full-size, 1024-node system that could have a performance of approximately . Burton Smith, the architect of the pioneering Denelcor HEP-1, has formed Tera Computer, whose machine has a virtual shared memory and other innovative features. The direction of parallel computing research could be profoundly affected if this architecture proves successful.
In summary, the 1980s saw an incredible level of activity in parallel computing, much greater than most people would have predicted. Even those projects that in a sense failed (that is, were not commercially successful or, in the case of research projects, did not produce an interesting prototype in a timely fashion) were nonetheless useful in that they exposed many people to parallel computing at universities, at computer vendors, and (as outlined in Chapter 19) at commercial companies such as Xerox, DuPont, General Motors, United Technologies, and aerospace and oil companies.