BarraCUDA Compiles CUDA Code for AMD GPUs Without LLVM
BarraCUDA is an open-source CUDA compiler designed to target AMD GPUs, specifically compiling CUDA C source code directly into AMD RDNA 3 binaries. This project, written in C99, bypasses the need for LLVM and other translation layers, offering a direct path from CUDA files to executable machine code for AMD hardware.

BarraCUDA compiles .cu files straight into GFX11 machine code and packages the result as ELF .hsaco binaries that run on AMD GPUs. Unlike approaches that depend on LLVM or a translation layer, it needs no intermediate conversion step, which lets developers use CUDA code on AMD hardware through a shorter path from source to binary.
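To make "ELF .hsaco binary" a little more concrete, the C sketch below checks whether a buffer begins with a 64-bit little-endian ELF header whose machine field carries the AMDGPU value. The helper name `looks_like_hsaco` is hypothetical and the check is only an illustration of the container format, not something taken from BarraCUDA; the value 224 is the `EM_AMDGPU` machine number used for AMD GPU code objects.

```c
#include <stddef.h>
#include <stdint.h>

#define EM_AMDGPU 224u  /* ELF e_machine value for AMD GPU code objects */

/* Hypothetical helper: returns 1 if buf looks like a little-endian ELF64
   image targeting AMDGPU, 0 otherwise. Only the fixed header fields are
   inspected; sections, notes, and kernel descriptors are not validated. */
int looks_like_hsaco(const uint8_t *buf, size_t len)
{
    uint16_t machine;

    if (buf == NULL || len < 20) return 0;  /* need e_ident, e_type, e_machine */
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F') return 0;
    if (buf[4] != 2) return 0;              /* EI_CLASS: 2 = ELFCLASS64 */
    if (buf[5] != 1) return 0;              /* EI_DATA: 1 = little-endian */

    /* e_machine is a 16-bit field at byte offset 18 of the ELF header */
    machine = (uint16_t)(buf[18] | (buf[19] << 8));
    return machine == EM_AMDGPU;
}
```

A caller would read the first 20 bytes of a compiler output file into a buffer and pass it to `looks_like_hsaco`; a return value of 1 says only that the header is plausible, not that the kernels inside are valid.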
The compiler is written in 15,000 lines of C99 and covers the CUDA features it can currently turn into working GFX11 machine code. Its pipeline consists of a lexer, a parser, an intermediate representation (IR), and hand-written instruction selection. The compiler itself does not depend on LLVM; LLVM tools are used only to validate the generated code. The implementation follows a defensive coding style: pre-allocated fixed-size arrays, no recursion, and bounded loops.
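The coding style described above (fixed-size storage, bounded loops, no recursion) can be sketched in a few lines of C99. This is a minimal illustration under assumptions, not BarraCUDA's lexer: `MAX_TOKENS`, `token_t`, `token_pool_t`, and `split_tokens` are hypothetical names, and the "lexer" only splits on whitespace.

```c
#include <stddef.h>

#define MAX_TOKENS 64   /* hypothetical capacity, fixed at compile time */

typedef struct {
    size_t start;   /* offset of the token in the source buffer */
    size_t len;     /* token length in bytes */
} token_t;

typedef struct {
    token_t items[MAX_TOKENS];  /* storage lives inside the struct; no malloc */
    size_t  count;
} token_pool_t;

static int is_space(char c) { return c == ' ' || c == '\n' || c == '\t'; }

/* Splits src[0..n) into whitespace-separated tokens.
   Returns 0 on success, -1 if the fixed pool would overflow. */
int split_tokens(const char *src, size_t n, token_pool_t *pool)
{
    size_t i = 0;
    pool->count = 0;

    /* Every outer iteration consumes at least one byte, so the loop is
       bounded by n. */
    while (i < n) {
        while (i < n && is_space(src[i])) i++;      /* skip whitespace */
        if (i == n) break;
        if (pool->count == MAX_TOKENS) return -1;   /* report overflow, never grow */

        size_t start = i;
        while (i < n && !is_space(src[i])) i++;     /* scan one token */
        pool->items[pool->count].start = start;
        pool->items[pool->count].len = i - start;
        pool->count++;
    }
    return 0;
}
```

The design choice is that running out of space is an explicit error return rather than a reallocation, which keeps memory use known in advance and makes every loop's upper bound easy to state.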
Technical Details
BarraCUDA compiles CUDA C source code directly into AMD RDNA 3 (gfx1100) binaries. The backend is cleanly separated from the rest of the compiler, so a new target can be added by writing a new pair of instruction selection and emission stages. The test suite currently comprises 14 test files with more than 35 kernels, which compile to approximately 1,700 BIR instructions and around 27,000 bytes of machine code. Every instruction encoding is validated against llvm-objdump, with zero decode failures.
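One common way to get the kind of backend separation described above is a small table of function pointers, one entry per target, each holding a selection stage and an emission stage. The C sketch below shows that pattern under stated assumptions: every name in it (`backend_t`, `ir_inst_t`, `find_backend`, `compile_ir`, the "toy" target) is hypothetical, and the toy "encoding" simply copies the opcode into a 32-bit word. It is not BarraCUDA's interface and not a real GFX11 encoding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_MACH 32  /* hypothetical fixed capacity for selected instructions */

typedef struct { int opcode; int dst, a, b; } ir_inst_t;  /* stand-in IR instruction */
typedef struct { uint32_t word; } mach_inst_t;            /* stand-in machine instruction */

typedef struct {
    const char *name;
    /* Instruction selection: IR to machine instructions; returns count or -1. */
    int (*select)(const ir_inst_t *ir, size_t n, mach_inst_t *out, size_t cap);
    /* Emission: machine instructions to bytes; returns byte count or -1. */
    int (*emit)(const mach_inst_t *mi, size_t n, uint8_t *out, size_t cap);
} backend_t;

static int toy_select(const ir_inst_t *ir, size_t n, mach_inst_t *out, size_t cap)
{
    if (n > cap) return -1;
    for (size_t i = 0; i < n; i++)
        out[i].word = (uint32_t)ir[i].opcode;  /* placeholder, not a real encoding */
    return (int)n;
}

static int toy_emit(const mach_inst_t *mi, size_t n, uint8_t *out, size_t cap)
{
    if (n * 4 > cap) return -1;
    for (size_t i = 0; i < n; i++) {           /* write each word little-endian */
        out[4 * i + 0] = (uint8_t)(mi[i].word);
        out[4 * i + 1] = (uint8_t)(mi[i].word >> 8);
        out[4 * i + 2] = (uint8_t)(mi[i].word >> 16);
        out[4 * i + 3] = (uint8_t)(mi[i].word >> 24);
    }
    return (int)(n * 4);
}

/* Adding a target means adding one row with its own select/emit pair. */
static const backend_t backends[] = {
    { "toy", toy_select, toy_emit },
};

const backend_t *find_backend(const char *name)
{
    for (size_t i = 0; i < sizeof backends / sizeof backends[0]; i++)
        if (strcmp(backends[i].name, name) == 0) return &backends[i];
    return NULL;
}

/* Target-independent driver: it only talks to the backend through the pair. */
int compile_ir(const backend_t *be, const ir_inst_t *ir, size_t n,
               uint8_t *out, size_t cap)
{
    mach_inst_t mi[MAX_MACH];
    int count = be->select(ir, n, mi, MAX_MACH);
    if (count < 0) return -1;
    return be->emit(mi, (size_t)count, out, cap);
}
```

Because `compile_ir` never names a specific target, the front end and IR stay untouched when another row is added to the table.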
BarraCUDA still has known gaps, such as missing support for compound assignment operators and bare unsigned integers, along with other features that need only minor parser changes. These are not architectural blockers but areas for future development. The goal is to compile real-world .cu files without requiring modifications; the generated code is not yet optimised for benchmark performance.
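To illustrate why gaps like these are surface-level rather than architectural, the C sketch below writes a small reduction without compound assignment and with the type spelled out as `unsigned int`. Two assumptions are made here: that "bare unsigned" refers to `unsigned` written without `int`, and that the long-hand forms are what a parser lacking those features would need. Whether BarraCUDA accepts this exact function has not been checked; `sum_u32` is a hypothetical name.

```c
/* Short-hand forms the article lists as not yet supported (assumed spelling):
       unsigned acc = 0;
       acc += data[i];
   The function below expresses the same computation in long-hand form. */
unsigned int sum_u32(const unsigned int *data, int n)
{
    unsigned int acc = 0;                /* "unsigned int" instead of bare "unsigned" */
    for (int i = 0; i < n; i = i + 1)    /* plain assignment instead of i += 1 */
        acc = acc + data[i];             /* plain assignment instead of acc += data[i] */
    return acc;
}
```

Both spellings mean the same thing to the language, which is why closing such gaps is a parser task and leaves the IR and backend unchanged.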
Getting Started
BarraCUDA is available under the Apache 2.0 license, allowing users to freely use and modify the software. Developers interested in testing the current implementation or contributing to the project can access the source code on GitHub. The project encourages community involvement, inviting users to open issues or discuss potential improvements and bug fixes. The developer is open to feedback and collaboration, particularly from those interested in the intricacies of AMDGPU instruction encoding.
Story based on discussion on Hacker News.