
Programming Massively Parallel Processors

A Hands-on Approach

  • 5th Edition - February 27, 2026
  • Latest edition
  • Authors: Wen-mei W. Hwu, David B. Kirk, Izzat El Hajj
  • Language: English

Description

Programming Massively Parallel Processors: A Hands-on Approach, Fifth Edition shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Concise, intuitive, and practical, it is based on years of road-testing in the authors' own parallel computing courses. Various techniques for constructing and optimizing parallel programs are explored in detail, while case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. This new edition has been updated with an expanded repertoire of optimizations, new patterns and applications, and more coverage of important CUDA features.

Key features

  • Expanded optimization checklist with a more comprehensive demonstration of essential optimizations across patterns
  • New pattern and application chapters including: filtering, wavefront parallelism, advanced optimizations for matrix multiplication, and large language models (LLMs)
  • More coverage of important CUDA features including warp-level programming, cooperative groups, CUDA C++ atomics, and multi-GPU programming with NCCL and NVSHMEM

Readership

Upper-level undergraduate and graduate students studying parallel computing in computer science or engineering

Table of contents

1. Introduction

Part I. Fundamental Concepts

2. Heterogeneous data parallel computing

3. Multidimensional grids and data

4. Compute architecture and scheduling

5. Memory architecture and data locality

6. Performance considerations

Part II. Parallel Patterns

7. Convolution

8. Stencil

9. Parallel histogram

10. Reduction

11. Prefix sum (scan)

12. Merge

Part III. Advanced Patterns and Applications

13. Sorting

14. Filtering (new)

15. Sparse matrix computation

16. Wavefront algorithms (new)

17. Graph traversal

18. Deep learning

19. Multi-GPU API (new)

20. Electrostatic potential map

21. Parallel programming and computational thinking

Part IV. Advanced Practices

22. Programming a heterogeneous computing cluster

23. Advanced optimizations for matrix multiplication (new)

24. Advanced practices and future evolution

25. Conclusion and outlook

Product details

  • Edition: 5
  • Latest edition
  • Published: February 27, 2026
  • Language: English

About the authors

Wen-mei W. Hwu

Wen-mei W. Hwu is a Senior Director of Research at NVIDIA and the Sanders-AMD Endowed Chair Professor Emeritus of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. His work focuses on parallel computing, covering architecture, implementation, compilers, and algorithms. Dr. Hwu has received numerous honors, including the ACM/IEEE Eckert-Mauchly Award, the ACM Grace Murray Hopper Award, and the IEEE B.R. Rau Award. He is an IEEE and ACM Fellow. He earned his Ph.D. in Computer Science from UC Berkeley.
Affiliations and expertise
CTO, MulticoreWare, and Professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign, USA

David B. Kirk

David B. Kirk is known for major contributions to graphics, hardware, and algorithms. Before pursuing his Ph.D. at Caltech, he earned B.S. and M.S. degrees in mechanical engineering from MIT and worked at Raster Technologies and Hewlett-Packard’s Apollo Systems Division. After completing his doctorate, he served as chief scientist and head of technology at Crystal Dynamics. In 1997, he became Chief Scientist at NVIDIA. Dr. Kirk has received numerous honors including the IEEE Seymour Cray Computer Engineering Award and ACM SIGGRAPH Computer Graphics Achievement Award. He is a member of the U.S. National Academy of Engineering.
Affiliations and expertise
NVIDIA Fellow

Izzat El Hajj

Izzat El Hajj is an Assistant Professor of Computer Science at the American University of Beirut. His research centers on leveraging accelerator architectures to tackle challenging computations, with particular interest in GPU computing, processing-in-memory, and performance modeling. He earned his Ph.D. in Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. He has received the Dan Vivoli Endowed Fellowship (UIUC) and the Distinguished Graduate Award from the American University of Beirut.
Affiliations and expertise
Assistant Professor, Department of Computer Science, American University of Beirut, Lebanon