
Programming Massively Parallel Processors

A Hands-on Approach

  • 5th Edition - February 27, 2026
  • Latest edition
  • Authors: Wen-mei W. Hwu, David B. Kirk, Izzat El Hajj
  • Language: English

Description

Programming Massively Parallel Processors: A Hands-on Approach, Fifth Edition shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Concise, intuitive, and practical, it is based on years of road-testing in the authors' own parallel computing courses. Various techniques for constructing and optimizing parallel programs are explored in detail, while case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. This new edition has been updated with an expanded repertoire of optimizations, new patterns and applications, and more coverage of important CUDA features.

Key features

  • Expanded optimization checklist with a more comprehensive demonstration of essential optimizations across patterns
  • New pattern and application chapters including: filtering, wavefront parallelism, advanced optimizations for matrix multiplication, and large language models (LLMs)
  • More coverage of important CUDA features including warp-level programming, cooperative groups, CUDA C++ atomics, and multi-GPU programming with NCCL and NVSHMEM

Readership

Upper-level undergraduate and graduate students studying parallel computing in computer science or engineering

Table of contents

1. Introduction

Part I. Fundamental Concepts

2. Heterogeneous data parallel computing

3. Multidimensional grids and data

4. Compute architecture and scheduling

5. Memory architecture and data locality

6. Performance considerations

Part II. Parallel Patterns

7. Convolution

8. Stencil

9. Parallel histogram

10. Reduction

11. Prefix sum (scan)

12. Merge

Part III. Advanced Patterns and Applications

13. Sorting

14. Filtering (new)

15. Sparse matrix computation

16. Wavefront algorithms (new)

17. Graph traversal

18. Deep learning

19. Multi-GPU API (new)

20. Electrostatic potential map

21. Parallel programming and computational thinking

Part IV. Advanced Practices

22. Programming a heterogeneous computing cluster

23. Advanced optimizations for matrix multiplication (new)

24. Advanced practices and future evolution

25. Conclusion and outlook

Product details

  • Edition: 5
  • Latest edition
  • Published: February 27, 2026
  • Language: English

About the authors

Wen-mei W. Hwu

Wen-mei W. Hwu is a Senior Director of Research at NVIDIA and the Sanders-AMD Endowed Chair Professor Emeritus of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. His work focuses on parallel computing, covering architecture, implementation, compilers, and algorithms. Dr. Hwu has received numerous honors, including the ACM/IEEE Eckert-Mauchly Award, the ACM Grace Murray Hopper Award, and the IEEE B.R. Rau Award. He is an IEEE and ACM Fellow. He earned his Ph.D. in Computer Science from UC Berkeley.
Affiliations and expertise
CTO, MulticoreWare, and Professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign, USA

David B. Kirk

David B. Kirk is known for major contributions to graphics, hardware, and algorithms. Before pursuing his Ph.D. at Caltech, he earned B.S. and M.S. degrees in mechanical engineering from MIT and worked at Raster Technologies and Hewlett-Packard’s Apollo Systems Division. After completing his doctorate, he served as chief scientist and head of technology at Crystal Dynamics. In 1997, he became Chief Scientist at NVIDIA. Dr. Kirk has received numerous honors including the IEEE Seymour Cray Computer Engineering Award and ACM SIGGRAPH Computer Graphics Achievement Award. He is a member of the U.S. National Academy of Engineering.
Affiliations and expertise
NVIDIA Fellow

Izzat El Hajj

Izzat El Hajj is an Assistant Professor of Computer Science at the American University of Beirut. His research centers on leveraging accelerator architectures to tackle challenging computations, with particular interest in GPU computing, processing-in-memory, and performance modeling. He earned his Ph.D. in Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. He has received the Dan Vivoli Endowed Fellowship (UIUC) and the Distinguished Graduate Award from the American University of Beirut.
Affiliations and expertise
Assistant Professor, Department of Computer Science, American University of Beirut, Lebanon