    Repositories list

    • A system validation and diagnostics tool for monitoring, stress testing, detecting, and troubleshooting issues impacting AMD GPUs in high-performance computing environments
      C++
      MIT License
      406608 Updated Feb 3, 2025
    • aiter

      Public
      AI Tensor Engine for ROCm
      Cuda
      MIT License
      01221 Updated Feb 3, 2025
    • Fast and memory-efficient exact attention
      Python
      BSD 3-Clause "New" or "Revised" License
      1.4k152236 Updated Feb 3, 2025
    • pytorch

      Public
      Tensors and Dynamic neural networks in Python with strong GPU acceleration
      Python
      Other
      23k2207741 Updated Feb 3, 2025
    • aotriton

      Public
      Ahead of Time (AOT) Triton Math Library
      Python
      MIT License
      1750111 Updated Feb 3, 2025
    • hipBLASLt

      Public
      hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library
      Assembly
      MIT License
      9974972 Updated Feb 3, 2025
    • triton

      Public
      Development repository for the Triton language and compiler
      C++
      MIT License
      1.8k1051050 Updated Feb 2, 2025
    • xla

      Public
      A machine learning compiler for GPUs, CPUs, and ML accelerators
      C++
      Apache License 2.0
      4933014 Updated Feb 2, 2025
    • xformers

      Public
      Hackable and optimized Transformers building blocks, supporting a composable construction.
      Python
      Other
      6372281 Updated Feb 2, 2025
    • Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
      C++
      Other
      1433382251 Updated Feb 2, 2025
    • LLVM
      Other
      0200 Updated Feb 2, 2025
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      5.4k58524 Updated Feb 2, 2025
    • AMD's graph optimization engine.
      C++
      MIT License
      9019635251 Updated Feb 2, 2025
    • ROCm Systems Profiler
      C++
      MIT License
      615010 Updated Feb 2, 2025
    • Tensile

      Public
      Stretching GPU performance for GEMMs and tensor contractions.
      Python
      MIT License
      15423143 Updated Feb 2, 2025
    • amdsmi

      Public
      AMD SMI
      C++
      MIT License
      315168 Updated Feb 2, 2025
    • clr

      Public
      C++
      MIT License
      551161521 Updated Feb 1, 2025
    • ROCgdb

      Public
      This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger.
      C
      GNU General Public License v2.0
      105351 Updated Feb 1, 2025
    • rdc

      Public
      RDC
      C++
      MIT License
      112612 Updated Feb 1, 2025
    • ROCm

      Public
      AMD ROCm™ Software - GitHub Home
      Shell
      MIT License
      4024.9k9913 Updated Feb 1, 2025
    • Jupyter Notebook
      104711 Updated Feb 1, 2025
    • Advanced Profiling and Analytics for AMD Hardware
      Python
      MIT License
      511394912 Updated Feb 1, 2025
    • 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
      Python
      Apache License 2.0
      28k405 Updated Feb 1, 2025
    • Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster
      Go
      Apache License 2.0
      55301132 Updated Feb 1, 2025
    • ROCm Platform Runtime: ROCr, an HPC-market-enhanced, HSA-based runtime
      C++
      Other
      1132342120 Updated Feb 1, 2025
    • rocminfo

      Public
      ROCm Application for Reporting System Info
      C++
      Other
      3235010 Updated Feb 1, 2025
    • HIPIFY

      Public
      HIPIFY: Convert CUDA to Portable C++ Code
      C++
      MIT License
      79544202 Updated Jan 31, 2025
    • A collection of examples for the ROCm software stack
      C++
      MIT License
      4718122 Updated Jan 31, 2025
    • This is a collection of CMake modules that are useful for all ROCm-DS projects. By sharing the code in a single place it makes rolling out CMake fixes easier.
      CMake
      Apache License 2.0
      0300 Updated Jan 31, 2025
    • Ongoing research training transformer models at scale
      Python
      Other
      2.5k1408 Updated Jan 31, 2025