Alliant's first parallel machines were announced in 1985. A number of subsequent models were based on custom-designed processors, including the FX/40 and FX/80 shared-memory multiprocessors. These were superseded at the end of the 1980s by the FX/2800, the so-called "standards based supercomputer", which was based on Intel i860 microprocessors.
More than 650 Alliant systems were known to be installed in over 400 customer sites worldwide in 1992.
Typical applications included image/signal/seismic processing, computational fluid dynamics, computational chemistry, visualisation and structural analysis. The FX/2800 quickly took over and expanded the large application base of its predecessors.
The CAMPUS/800 product was announced at the end of 1991. It consisted of a number of FX/2800 shared-memory nodes together with unified memory, that is, physically distributed memory with a shared-memory view, controlled by sophisticated compilers and normal imperative language constructs rather than by explicit message passing. Much of the development work, particularly on the parallelizing compiler, was done in collaboration with CSRD (Illinois) and was based on the Cedar project. Alliant had several installed CAMPUS/800 machines (which were probably still prototypes), the largest known installation having 75 processors.
Alliant Computer Systems Corp., 1 Monarch Drive, Littleton, MA 01460, 508-486-4950
Alliant Computer Systems UK Ltd, 10 Heatherley Road, Camberley, Surrey GU15 3LW, UK, 0276-682765, FAX 0276-65235
Compute Hardware: Computational elements (CEs) execute applications code using vector instructions. The CEs transparently execute the code of an application in parallel.
Weitek 1064/1065 plus ten different gate array types with 2600 to 8000 gates. First-generation computational elements (FX1, FX4, FX8) may be added in the field, increasing performance without recompilation or relinking. Advanced Computational Elements (ACEs) for second generation (FX40, FX80, VFX) are based on the BIT floating point chips. Each CE has 8 vector registers, each with 32 64-bit elements, and 8 64-bit scalar floating point, 8 32-bit integer, and 8 32-bit address registers.
Interactive Processors (IP) were Motorola 68020. 4 Mbyte local memory in each IP. ACE 64-bit processor 20,000 gate CMOS VLSI gate array, with BIT floating-point processors. 64 Kbyte instruction cache.
The cycle time is 170 nsec. Only six different PC boards are used.
Floating-point unit: IEEE 32- and 64-bit formats including hardware divide and square root and microcoded elementary functions.
Interconnect / Communications System: CEs are cross-bar connected on the backplane to a 512 Kbyte write-back computational processor (CP) cache (FX/80). Bandwidth is 376 Mbyte/sec.
Connectivity: crossbar (CE to cache), bus (cache to memory, cache to cache)
Memory System: Each 32-Kbyte IP cache is connected to 1-3 IPs (FX/80) or 1-2 IPs and a CE (FX/1). The FX/80 has 1-4 IP caches; the FX/4 and FX/40 have 2 IP caches; the FX/1 has one IP cache.
The CP and IP caches are attached by two 72-bit buses to the main memory. Memory bus bandwidth is 188 Mbyte/sec, and memory cycle time is 85 nsec.
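The quoted memory-bus figure can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it assumes that each 72-bit bus carries 64 data bits plus 8 ECC bits and completes one transfer per 85 nsec memory cycle. Neither assumption is stated in the description above, and the function name `bus_bandwidth_mbytes` is hypothetical.

```python
def bus_bandwidth_mbytes(data_bits, cycle_ns, n_buses):
    """Aggregate bus bandwidth in Mbyte/sec (1 Mbyte = 10**6 bytes)."""
    bytes_per_transfer = data_bits / 8      # payload bytes moved per bus cycle
    transfers_per_sec = 1e9 / cycle_ns      # one transfer per memory cycle (assumed)
    return n_buses * bytes_per_transfer * transfers_per_sec / 1e6

# Assumption: 64 of the 72 bits are data, the remaining 8 are ECC.
memory_bw = bus_bandwidth_mbytes(data_bits=64, cycle_ns=85, n_buses=2)
print(round(memory_bw, 1))  # about 188 Mbyte/sec under these assumptions
```

Under these assumptions the result agrees with the quoted 188 Mbyte/sec, and the 376 Mbyte/sec crossbar figure is twice that value.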
Range of memory sizes available: 32-64 Mbytes (FX/1), 32-160 Mbytes (FX/4 and FX/40), and 32-256 Mbytes (FX/80), using 1 Mbit chips with ECC.
Virtual memory: 2 Gbytes per process.
Advanced CEs (ACEs):
Scalar 32-bit: 7.2 mips / CE (14700 Kwhetstones).
Scalar 64-bit: 6.2 mips / CE (13700 Kwhetstones).
Vector 32-bit and 64-bit: 23.5 Mflops / CE.
FX/80 on 1,000 x 1,000 LINPACK benchmark: 69.3 Mflops. Peak performance 188.8 Mflops.
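The performance figures above can be related to one another with simple ratios. The following sketch uses only the numbers quoted here; the reading that the peak figure corresponds to about eight ACEs at full vector rate is an inference from the ratio, not a statement from the source, and the variable names are illustrative.

```python
PEAK_MFLOPS = 188.8      # quoted FX/80 peak performance
LINPACK_MFLOPS = 69.3    # quoted 1,000 x 1,000 LINPACK result
PER_CE_MFLOPS = 23.5     # quoted vector rate per CE

# Number of CEs implied by the peak figure (an inference, not a source fact).
implied_ces = PEAK_MFLOPS / PER_CE_MFLOPS

# Fraction of peak achieved on the LINPACK benchmark.
efficiency = LINPACK_MFLOPS / PEAK_MFLOPS

print(round(implied_ces))      # 8
print(f"{efficiency:.1%}")     # 36.7%
```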
Languages supported included: Fortran, C, Pascal, Ada, Lisp, STSC APL, 68020 Assembler.
The Fortran implementation was known to have the following characteristics: F77 - Conformed to 1978 ANSI standard; Extensions - Most of VAX/VMS extensions and Fortran 8x array extensions; Debugging facilities existed.
Vectorizing/parallelizing capabilities - Automatic detection of vectors and feedback to user via diagnostic messages.
Can employ COVI (concurrent outer, vector inner) on nested loops. User control of transformations via directives in the form of Fortran comments. Interprocedural dependency analysis for automatic determination of parallel subroutine calls in loops.
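The COVI idea can be illustrated by analogy. The sketch below is in Python rather than Alliant Fortran and does not reproduce any Alliant directive or compiler output; it only shows the loop structure, with independent outer iterations handed to separate workers and each inner loop treated as a single whole-row operation. The function names are hypothetical, and CPython threads are used purely to show the shape of the decomposition, not to demonstrate a speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def scale_and_add_row(a_row, b_row, alpha):
    # "Vector inner": the whole row is processed as one unit,
    # standing in for a vector instruction over the inner loop.
    return [alpha * a + b for a, b in zip(a_row, b_row)]

def covi_scale_and_add(a, b, alpha, workers=4):
    # "Concurrent outer": rows are independent, so each outer
    # iteration can be given to a different worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda rows: scale_and_add_row(rows[0], rows[1], alpha),
            zip(a, b),
        ))

result = covi_scale_and_add([[1, 2], [3, 4]], [[10, 20], [30, 40]], alpha=2)
print(result)  # [[12, 24], [36, 48]]
```

The decomposition is only valid when the outer iterations carry no dependencies on one another, which is the property the dependency analysis described above has to establish.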
Stand-alone (hostless) configuration. TCP/IP network support.
Size: FX/1 system - 28" x 13" x 25" (the FX/1 I/O expansion cabinet is the same size); FX/4, FX/40, and FX/80 systems - 43.5" x 29.5" x 33.8" (the I/O expansion cabinet is 24.5" and same height and depth, while the tape cabinet is 61" in height).
Cooling: All systems are air-cooled with power consumption of FX/1 -- 1155 Watts (max. configuration), 725 Watts (I/O Expansion); FX/4 -- 4500 Watts, 2100 Watts (I/O Expansion); FX/40 -- 4200 Watts, 2100 Watts (I/O Expansion); FX/80 -- 5100 Watts, 2100 Watts (I/O Expansion).
Peripherals available included:
800/1600/6250 BPI start-stop tape drive; 550 Mbyte (formatted) Winchester disk drives; 45 Mbyte cartridge tape drive; Floppy disk drive; 8/16 line multichannel communications controllers; 600 lpm printer; Ethernet controller.
First beta delivery May 1985; first production shipment September 1985. Alliant's customers included Asahi Chemical Corp, AT&T, Boeing Airplane Co., Ford Motor Co., Hughes Aircraft Corp., Motorola Inc., Siemens, The Whittle Laboratory at the University of Cambridge, CERFACS at Toulouse, and the Jodrell Bank Observatory at the University of Manchester.
Compute Hardware: Intel i860.
Interconnect / Communications System: Shared memory on bus system.
Memory System:
Data transfer rates of 376MB/s maximum bandwidth to (up to) one 512kB cache per module were reported.
Languages --- Compilers available from the vendor for Fortran (including Fortran-90 extensions), C and Ada. Sophisticated parallelizing Fortran compiler for scalar, vector, scalar concurrent, vector concurrent and concurrent-outer-vector-inner operations.
Programming Environment --- Parallel libraries and kernels developed for shared memory were provided, including optimised parallel mathematics libraries for EISPACK, LINPACK, NAG and IMSL. Sophisticated programming, profiling, debugging and visualisation tools. Also graphics support for PHIGS+, X, OSF/Motif, in addition to NFS/NQS and TCP/IP networking.
The CAMPUS/800 architecture consists of clusters of processors in ClusterNodes sharing up to 4GB of memory, with intra-ClusterNode (1st level switch) latency of 1us and 1.12GB/s bandwidth. ClusterNodes closely resemble (and can be upgraded from) FX/2800 machines. A High-speed Memory Interconnect (HMI) connects the ClusterNodes, which may be distributed, and even remotely sited, with inter-ClusterNode (2nd level switch) latency of typically 30us and 2.56GB/s bandwidth.
Data Transfer --- Two-level switch: intra-ClusterNode, 1 us latency, 1.12 GB/s bandwidth; inter-ClusterNode, 30 us latency, 2.56 GB/s bandwidth. HMI based on HiPPI channel standard, allows ClusterNodes to be physically separated (up to 30km) at different locations without loss of bandwidth.
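To get a feel for what the two switch levels mean for a single transfer, the quoted figures can be put into the usual first-order model, time = latency + size / bandwidth. This is a rough sketch only: it assumes the quoted bandwidths are available to one transfer (they may well be aggregate figures) and ignores contention and software overhead. The function names are hypothetical.

```python
def transfer_time_us(n_bytes, latency_us, bandwidth_gb_s):
    """First-order transfer time in microseconds: latency + size / bandwidth."""
    return latency_us + n_bytes / (bandwidth_gb_s * 1e9) * 1e6

def intra_node_us(n_bytes):
    # 1st level switch: 1 us latency, 1.12 GB/s (figures quoted above)
    return transfer_time_us(n_bytes, latency_us=1.0, bandwidth_gb_s=1.12)

def inter_node_us(n_bytes):
    # 2nd level switch: 30 us latency, 2.56 GB/s (figures quoted above)
    return transfer_time_us(n_bytes, latency_us=30.0, bandwidth_gb_s=2.56)

for size in (1_000, 1_000_000):
    print(size, round(intra_node_us(size), 2), round(inter_node_us(size), 2))
```

Taken at face value, the model says small transfers are dominated by latency and so strongly favour the intra-ClusterNode path, while for large transfers the bandwidth term dominates.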
Fault Tolerance --- There is considerable apparent flexibility in reconfiguring around faulty nodes, and since the technology is reasonably mature and standard, robustness should not be a problem. Remotely controlled diagnostic software is provided for pinpointing failed and/or weak components.