Given by Chuck Koelbel (Rice University) at DoD Training and other venues, 1995-98. Foils prepared August 7, 1998.
Outside Index
Summary of Material
| First in Chuck Koelbel's series on HPF |
| HPF and its performance |
| Types of Parallel Computers |
| Types of Applications |
| Data Parallelism and Message Passing |
| Why Use Compilers |
| Charles Koelbel |
Supported in part by
|
As always
|
| The High Performance Fortran Forum |
| The D System Group (Rice University) |
| The Fortran 90D Group (Syracuse University) |
| Ken Kennedy (Rice) |
| David Loveman (DEC) |
| Piyush Mehrotra (ICASE) |
| Rob Schreiber (HP Labs) |
| Guy Steele (Sun) |
| Mary Zosel (Livermore) |
| Defined by the High Performance Fortran Forum (HPFF) as a portable language for data-parallel computation |
History:
|
Influences:
|
| All of Fortran 90 |
| FORALL and INDEPENDENT |
| Data Alignment and Distribution |
| Miscellaneous Support Operations |
But:
|
Performance is highly dependent on the compiler and on the nature of the code
|
Commercial compilers are now competitive with MPI for regular problems
|
Research continues on irregular problems and task parallelism
|
A full application for ocean modeling
|
Well-known compact application benchmarks from NASA
|
| World Wide Web |
| These slides |
Mailing Lists:
|
Anonymous FTP:
|
| 1. Introduction to Data-Parallelism |
| 2. Fortran 90/95 Features |
| 3. HPF Parallel Features |
| 4. HPF Data Mapping Features |
| 5. Parallel Programming in HPF |
| 6. HPF Version 2.0 |
| Parallel computers allow several CPUs to contribute to a computation simultaneously. |
For our purposes, a parallel computer has three types of parts:
|
Key points:
|
| Every processor has a memory others can't access. |
Advantages:
|
Disadvantages:
|
| All processors access the same memory. |
Advantages:
|
Disadvantages:
|
| Combining the advantages of shared and distributed memory |
Lots of hierarchical designs are appearing.
|
| A parallel algorithm is a collection of tasks and a partial ordering between them. |
Design goals:
|
Sources of parallelism:
|
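As a minimal sketch of the definition above, a parallel algorithm can be written down as a set of tasks plus "must finish before" pairs. The task names and the `parallel_levels` helper are hypothetical, chosen only for illustration. The sketch groups tasks into steps whose members have no ordering between them and so could run simultaneously.

```python
def parallel_levels(tasks, before):
    """Group tasks into steps; tasks in one step have no ordering between them.

    tasks  -- iterable of task names
    before -- set of (a, b) pairs meaning task a must finish before task b starts
    """
    remaining = set(tasks)
    done = set()
    levels = []
    while remaining:
        # A task is ready once every task ordered before it has completed.
        ready = sorted(t for t in remaining
                       if all(a in done for (a, b) in before if b == t))
        if not ready:
            raise ValueError("ordering contains a cycle, so it is not a partial order")
        levels.append(ready)
        done.update(ready)
        remaining.difference_update(ready)
    return levels


# Hypothetical example: two independent reads, two independent computations, one combine.
tasks = ["read_a", "read_b", "scale_a", "scale_b", "combine"]
before = {("read_a", "scale_a"), ("read_b", "scale_b"),
          ("scale_a", "combine"), ("scale_b", "combine")}
levels = parallel_levels(tasks, before)
```

Here five tasks complete in three steps: the partial ordering, not the task count, limits how much can run at once.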
Data-parallel algorithms exploit the parallelism inherent in many large data structures.
|
Analysis:
|
Functional parallelism exploits the parallelism between the parts of many systems.
|
Analysis:
|
| A parallel language provides an executable notation for implementing a parallel algorithm. |
Design criteria:
|
| Usually a language reflects a particular type of parallelism. |
Data-parallel languages provide an abstract, machine-independent model of parallelism.
|
Advantages:
|
Disadvantages:
|
| Abstractions like data parallelism split the work between the programmer and the compiler. |
Programmer's task: Solve the problem in this model.
|
Compiler's task: Map conceptual (massive) parallelism to physical (finite) machine.
|
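One way to picture the compiler's half of this split is a block mapping of N conceptual parallel elements onto P physical processors. The sketch below is an assumption made for illustration: the function names are hypothetical, and a real HPF compiler may choose other mappings. It uses the ceiling-division block size that is typical of block distributions.

```python
def block_size(n, p):
    """Elements per block: ceiling(n / p), computed with integer arithmetic."""
    return -(-n // p)


def block_bounds(n, p, rank):
    """Half-open index range [lo, hi) owned by processor `rank`."""
    size = block_size(n, p)
    lo = min(rank * size, n)
    hi = min(lo + size, n)
    return lo, hi


def owner(n, p, i):
    """Processor that owns element i under the same block mapping."""
    return i // block_size(n, p)


n, p = 10, 4                      # 10 conceptual elements, 4 physical processors
ranges = [block_bounds(n, p, r) for r in range(p)]
```

With 10 elements on 4 processors the blocks hold 3, 3, 3, and 1 elements, so the last processor is underloaded; the programmer never writes this arithmetic in the data-parallel model.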
| Program is based on relatively coarse-grain tasks |
| Separate address space and a processor number for each task |
Data shared by explicit messages
|
| Examples: MPI, PVM, Occam |
Advantages:
|
Disadvantages:
|
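The message-passing model above can be sketched without an MPI or PVM installation. The following is an illustration only: Python threads and queues stand in for tasks with separate address spaces, none of the MPI or PVM APIs are used, and the `worker` function and mailbox layout are hypothetical. Each task touches only its own slice, knows its processor number, and shares data solely through explicit send (`put`) and receive (`get`) operations.

```python
import queue
import threading


def worker(rank, local_data, inboxes, results):
    """One coarse-grain task: private data, a rank, and explicit messages only."""
    partial = sum(local_data)                  # purely local computation
    if rank != 0:
        inboxes[0].put((rank, partial))        # explicit send to task 0
    else:
        total = partial
        for _ in range(len(inboxes) - 1):
            _sender, value = inboxes[0].get()  # explicit (blocking) receive
            total += value
        results.put(total)


def message_passing_sum(data, ntasks):
    """Sum `data` by splitting it across `ntasks` tasks that communicate by message."""
    inboxes = [queue.Queue() for _ in range(ntasks)]
    results = queue.Queue()
    chunk = -(-len(data) // ntasks)            # ceiling division
    threads = [
        threading.Thread(target=worker,
                         args=(r, data[r * chunk:(r + 1) * chunk], inboxes, results))
        for r in range(ntasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.get()


total = message_passing_sum(list(range(100)), 4)
```

Even in this small example the programmer writes the data split, the sends, and the matching receives by hand, which is the bookkeeping a data-parallel compiler is meant to generate.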
Analysis
|
Computation Partitioning
|
Communication Introduction
|
Code Generation
|
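The phases listed above can be illustrated on a one-dimensional stencil, b(i) = a(i-1) + a(i+1). This sketch assumes a block distribution and the owner-computes rule (each processor computes the elements it owns). That is a common strategy, but the slides do not state it, and the helper names are hypothetical. Computation partitioning assigns each iteration to an owner, and communication introduction finds the off-processor elements that owner must receive.

```python
def block_size(n, p):
    """Elements per block: ceiling(n / p)."""
    return -(-n // p)


def owner(n, p, i):
    """Processor that owns element i under a block distribution."""
    return i // block_size(n, p)


def partition_and_communicate(n, p):
    """For b[i] = a[i-1] + a[i+1], i = 1..n-2, return the iterations each
    processor runs and the (element index, owning processor) pairs it must receive."""
    iterations = {r: [] for r in range(p)}
    receives = {r: set() for r in range(p)}
    for i in range(1, n - 1):
        me = owner(n, p, i)                 # owner of b[i] runs iteration i
        iterations[me].append(i)
        for j in (i - 1, i + 1):            # elements of a read by iteration i
            src = owner(n, p, j)
            if src != me:
                receives[me].add((j, src))  # value must arrive in a message from src
    return iterations, receives


iterations, receives = partition_and_communicate(8, 2)
```

For 8 elements on 2 processors, only the two elements at the block boundary cross processors; every other read is local.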
Help analysis with assertions
|
Distribute array dimensions that exhibit parallelism
|
Consider communications patterns
|
| Don't hide what you are doing |