Scaling GPU-Accelerated Applications with the C++ Standard Library | NVIDIA


Learn how to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques. You’ll also learn an iterative style of CUDA development that will allow you to ship accelerated applications fast.

About This Course

Harnessing the acceleration of NVIDIA GPUs is easier than ever. For over a decade, NVIDIA has been collaborating in the C++ standard language committees on the adoption of features that enable parallel programming without additional extensions or APIs. As a result of this work, developers can now write GPU-accelerated C++ code using only standard language features: no language extensions, pragmas, directives, or non-standard libraries.

Standard language parallelism is the simplest, most productive, and most portable approach to accelerated computing. It requires nothing more than ISO standard C++ and lets developers write applications that are parallel from the start, so they never need to be ported in order to run on new platforms or on GPU accelerators.
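As a rough illustration of what "nothing more than ISO standard C++" can look like, the sketch below uses two C++17 parallel algorithms with the `std::execution::par_unseq` policy. The helper names `saxpy` and `dot` are hypothetical and chosen only for this example; they are not taken from the course material.

```cpp
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

// Hypothetical helper (illustration only): y[i] = a * x[i] + y[i].
// Assumes x.size() == y.size().
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    // The lambda captures `a` by value so it carries no reference to
    // host stack memory when the algorithm is offloaded.
    std::transform(std::execution::par_unseq,
                   x.begin(), x.end(),   // first input range
                   y.begin(),            // second input range
                   y.begin(),            // output, written in place
                   [a](float xi, float yi) { return a * xi + yi; });
}

// Hypothetical helper (illustration only): dot product as a parallel
// multiply-then-sum reduction. Assumes x.size() == y.size().
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    return std::transform_reduce(std::execution::par_unseq,
                                 x.begin(), x.end(), y.begin(), 0.0);
}
```

The same source should build with any C++17 compiler that implements the parallel algorithms. With the NVIDIA HPC SDK, a command along the lines of `nvc++ -stdpar=gpu` is what asks the compiler to offload these algorithm calls to a GPU. With GCC, the parallel policies may additionally require linking against Intel TBB.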

Learning Objectives

In this interactive, hands-on workshop, which is intended as a follow-up to GPU Acceleration with the C++ Standard Library, we present how to write scalable, GPU-accelerated hybrid applications using C++ standard language features alongside MPI. By the time you complete this workshop, you will be able to:

Rewrite serial C++ / MPI hybrid applications to use C++ standard template library parallel algorithms that can leverage GPU accelerators
Use the NVIDIA HPC C++ compiler (NVC++) to compile standard C++ / MPI hybrid applications for execution on NVIDIA GPUs and/or multiple nodes with GPUs
Utilize C++ standard library features to support effective inter-rank communication alongside the use of C++ STL parallel algorithms
Use NVIDIA's reference implementation of Senders, a proposed standard model for asynchronous execution in C++