Numba 0.48
5. Numba for AMD ROC GPUs
5.1. Overview
5.1.1. Terminology
5.1.2. Requirements
5.1.3. Installation
5.2. Writing HSA Kernels
5.2.1. Introduction
5.2.2. Introduction for CUDA Programmers
5.2.3. Kernel declaration
5.2.4. Kernel invocation
5.2.4.1. Choosing the workgroup size
5.2.4.2. Multi-dimensional workgroup and grid
5.2.5. WorkItem positioning
5.3. Memory management
5.3.1. Data transfer
5.3.1.1. Device arrays
5.3.1.2. Data Registration
5.3.2. Streams
5.3.3. Shared memory and thread synchronization
5.4. Writing Device Functions
5.5. Supported Atomic Operations
5.5.1. Example
5.6. The Agents
5.7. ROC Ufuncs and Generalized Ufuncs
5.7.1. Basic ROC UFunc Example
5.7.2. Calling Device Functions from ROC UFuncs
5.7.3. Generalized ROC ufuncs
5.7.4. Async execution: A Chunk at a Time
5.8. Examples
5.8.1. Matrix multiplication