@neuralmagic

Neural Magic

Neural Magic helps developers accelerate machine learning performance using automated model sparsification techniques and inference technologies.

Pinned

  1. nm-vllm-certs Public

    General Information, model certifications, and benchmarks for nm-vllm enterprise distributions

    10 1

  2. deepsparse Public

    Sparsity-aware deep learning inference runtime for CPUs

    Python 3.1k 180

  3. sparseml Public

    Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

    Python 2.1k 149

  4. docs Public

    Top-level directory for documentation and general content

    MDX 120 7

  5. sparsezoo Public

    Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

    Python 377 26

  6. guidellm Public

    Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

    Python 185 16

Repositories

Showing 10 of 61 repositories
  • compressed-tensors Public

    A safetensors extension to efficiently store sparse quantized tensors on disk

    Python 67 Apache-2.0 6 2 6 Updated Jan 30, 2025
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 8 Apache-2.0 5,481 0 22 Updated Jan 30, 2025
  • nm-actions Public

    Neural Magic GHA

    Python 0 Apache-2.0 0 0 3 Updated Jan 30, 2025
  • flash-attention Public Forked from vllm-project/flash-attention

    Fast and memory-efficient exact attention

    C++ 0 BSD-3-Clause 1,447 0 0 Updated Jan 29, 2025
  • mistral-evals
    Python 0 4 0 1 Updated Jan 27, 2025
  • nm-vllm-certs Public

    General Information, model certifications, and benchmarks for nm-vllm enterprise distributions

    10 1 1 0 Updated Jan 27, 2025
  • evalplus Public Forked from evalplus/evalplus

    NeuralMagic fork of EvalPlus (Rigorous evaluation of LLM-synthesized code - NeurIPS 2023)

    Python 0 Apache-2.0 120 0 1 Updated Jan 24, 2025
  • vllm-flash-attention Public Forked from vllm-project/flash-attention

    Fast and memory-efficient exact attention

    C++ 1 BSD-3-Clause 1,447 0 0 Updated Jan 23, 2025
  • lm-evaluation-harness Public Forked from EleutherAI/lm-evaluation-harness

    A framework for few-shot evaluation of language models.

    Python 3 MIT 2,061 0 1 Updated Jan 22, 2025
  • yolov5 Public Forked from ultralytics/yolov5

    YOLOv5 in PyTorch > ONNX > CoreML > TFLite

    Python 20 GPL-3.0 16,802 0 3 Updated Jan 20, 2025