Programming Massively Parallel Processors

A Hands-on Approach

Description

Programming Massively Parallel Processors: A Hands-on Approach shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Concise, intuitive, and practical, it is based on years of road-testing in the authors' own parallel computing courses. Various techniques for constructing and optimizing parallel programs are explored in detail, while case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. The new edition includes updated coverage of CUDA, including newer libraries such as cuDNN. New chapters on frequently used parallel patterns have been added, and case studies have been updated to reflect current industry practices.

Key features

  • Parallel Patterns: Introduces new chapters on frequently used parallel patterns (stencil, reduction, sorting) and major improvements to previous chapters (convolution, histogram, sparse matrices, graph traversal, deep learning)
  • Ampere: Includes a new chapter focused on GPU architecture and draws examples from recent architecture generations, including Ampere
  • Systematic Approach: Incorporates major improvements to abstract discussions of problem decomposition strategies and performance considerations, with a new optimization checklist

Readership

Upper-level undergraduate through graduate-level students studying parallel computing within computer science or engineering

Table of contents

1. Introduction

Part I Fundamental Concepts

2. Heterogeneous data parallel computing

3. Multidimensional grids and data

4. Compute architecture and scheduling

5. Memory architecture and data locality

6. Performance considerations

Part II Parallel Patterns

7. Convolution: An introduction to constant memory and caching

8. Stencil

9. Parallel histogram

10. Reduction and minimizing divergence

11. Prefix sum (scan)

12. Merge: An introduction to dynamic input data identification

Part III Advanced patterns and applications

13. Sorting

14. Sparse matrix computation

15. Graph traversal

16. Deep learning

17. Iterative magnetic resonance imaging reconstruction

18. Electrostatic potential map

19. Parallel programming and computational thinking

Part IV Advanced Practices

20. Programming a heterogeneous computing cluster: An introduction to CUDA streams

21. CUDA dynamic parallelism

22. Advanced practices and future evolution

23. Conclusion and outlook

Appendix A: Numerical considerations

About the authors

Wen-mei W. Hwu

Wen-mei W. Hwu is a Senior Director of Research at NVIDIA and the Sanders-AMD Endowed Chair Professor Emeritus of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. His work focuses on parallel computing, covering architecture, implementation, compilers, and algorithms. Dr. Hwu has received numerous honors, including the ACM/IEEE Eckert-Mauchly Award, the ACM Grace Murray Hopper Award, and the IEEE B.R. Rau Award. He is an IEEE and ACM Fellow. He earned his Ph.D. in Computer Science from UC Berkeley.
Affiliations and expertise
CTO, MulticoreWare, and Professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign, USA

David B. Kirk

David B. Kirk is known for major contributions to graphics, hardware, and algorithms. Before pursuing his Ph.D. at Caltech, he earned B.S. and M.S. degrees in mechanical engineering from MIT and worked at Raster Technologies and Hewlett-Packard’s Apollo Systems Division. After completing his doctorate, he served as chief scientist and head of technology at Crystal Dynamics. In 1997, he became Chief Scientist at NVIDIA. Dr. Kirk has received numerous honors including the IEEE Seymour Cray Computer Engineering Award and ACM SIGGRAPH Computer Graphics Achievement Award. He is a member of the U.S. National Academy of Engineering.
Affiliations and expertise
NVIDIA Fellow

Izzat El Hajj

Izzat El Hajj is an Assistant Professor of Computer Science at the American University of Beirut. His research focuses on leveraging accelerator architectures to tackle challenging computations, with a focus on GPU computing, processing-in-memory, and performance modeling. He earned his Ph.D. in Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. He has received the Dan Vivoli Endowed Fellowship (UIUC) and the Distinguished Graduate Award from the American University of Beirut.
Affiliations and expertise
Assistant Professor, Department of Computer Science, American University of Beirut, Lebanon

Read Programming Massively Parallel Processors on ScienceDirect