Kernel Creation from MATLAB Code
GPU Coder™ generates and executes optimized CUDA kernels for specific algorithm structures and patterns in your MATLAB® code. The generated code calls optimized NVIDIA® CUDA libraries, including cuFFT, cuSOLVER, cuBLAS, cuDNN, and TensorRT. You can integrate the generated code into your project as source code, static libraries, or dynamic libraries, and compile it for desktops, servers, and GPUs embedded on NVIDIA Jetson, DRIVE, and other platforms. GPU Coder also lets you incorporate handwritten CUDA code into your algorithms and into the generated code.
Topics
- Kernels from Element-Wise Loops
Create kernels from MATLAB functions containing scalarized, element-wise math operations.
- Kernels from Scatter-Gather Type Operations
Create kernels from MATLAB functions containing reduction operations.
- Kernels from Library Calls
Target GPU optimized math libraries such as cuBLAS, cuSOLVER, cuFFT, and Thrust.
- Support for GPU Arrays
Generate CUDA code that uses GPU arrays.
- Use Dynamically Allocated C++ Arrays in Generated Function Interfaces
Understand and use dynamically allocated arrays from the generated CUDA C++ function interfaces.
- Call Custom CUDA Kernels from the Generated Code
Integrate custom CUDA kernels with MATLAB code intended for code generation.
- Call Custom CUDA Device Function from the Generated Code
Integrate custom GPU device functions with MATLAB code intended for code generation.
- Design Patterns
Create kernels for MATLAB functions containing computational design patterns.
- GPU Memory Allocation and Minimization
Memory allocation options and optimizations for GPU Coder.
- How Shared GPU Memory Manager Improves Performance of Generated MEX
GPU Coder creates a single universal memory manager that handles the memory management for all running CUDA MEX functions.
- What is Half Precision?
Introduction to the half-precision data type in MATLAB and Simulink®.
- Half Precision Code Generation Support
C/C++ and GPU code generation support for functions that support half-precision inputs.
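The first two topics in the list above distinguish element-wise loops from reduction operations. As a rough illustration of why the two are treated differently, the following is a minimal CPU-side C++ sketch of each loop shape. It is not GPU Coder output, and the function names `scaled_add` and `tree_sum` are hypothetical, chosen only for this example.

```cpp
#include <cstddef>
#include <vector>

// Element-wise pattern: iteration i reads and writes only index i,
// so each iteration could in principle run as an independent GPU thread.
// Assumes x and y have the same length.
std::vector<float> scaled_add(const std::vector<float>& x,
                              const std::vector<float>& y, float a) {
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = a * x[i] + y[i];
    }
    return out;
}

// Reduction pattern: all elements collapse into a single value, so the
// iterations are not independent. A parallel reduction typically combines
// pairs of elements in roughly log2(n) passes; this function walks the
// same tree order sequentially on a local copy of the input.
float tree_sum(std::vector<float> v) {
    if (v.empty()) return 0.0f;
    for (std::size_t stride = 1; stride < v.size(); stride *= 2) {
        for (std::size_t i = 0; i + stride < v.size(); i += 2 * stride) {
            v[i] += v[i + stride];  // combine one pair at this tree level
        }
    }
    return v[0];
}
```

In the element-wise case, no iteration depends on another, which is what makes a one-thread-per-element mapping possible. In the reduction case, the inner loop at each stride is independent across `i`, but successive strides must run in order, which is why reductions need a different kernel structure.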