    Repositories list

    • guidellm

      Public
      Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
      Python
      Apache License 2.0
      212082114 Updated Mar 6, 2025
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      6.1k10018 Updated Mar 6, 2025
    • research

      Public
      Repository to enable research flows
      Python
      0001 Updated Mar 6, 2025
    • A safetensors extension to efficiently store sparse quantized tensors on disk
      Python
      Apache License 2.0
      97947 Updated Mar 6, 2025
    • Neural Magic GHA
      Python
      Apache License 2.0
      0003 Updated Mar 5, 2025
    • axolotl

      Public
      Go ahead and axolotl questions
      Python
      Apache License 2.0
      969002 Updated Mar 4, 2025
    • yolov5

      Public
      YOLOv5 in PyTorch > ONNX > CoreML > TFLite
      Python
      GNU General Public License v3.0
      17k2003 Updated Mar 3, 2025
    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      28k100 Updated Feb 24, 2025
    • Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      1.5k000 Updated Feb 20, 2025
    • Pytest plugin used by the Release Engineering team
      Python
      Apache License 2.0
      0000 Updated Feb 17, 2025
    • General Information, model certifications, and benchmarks for nm-vllm enterprise distributions
      11110 Updated Feb 15, 2025
    • Python
      7001 Updated Feb 8, 2025
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      2.2k301 Updated Feb 3, 2025
    • evalplus

      Public
      NeuralMagic fork of EvalPlus (Rigorous evaluation of LLM-synthesized code - NeurIPS 2023)
      Python
      Apache License 2.0
      137001 Updated Jan 24, 2025
    • Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      1.5k100 Updated Jan 23, 2025
    • Benchmarking code for running quantized kernels from vLLM and other libraries
      Python
      0510 Updated Dec 3, 2024
    • docs

      Public
      Top-level directory for documentation and general content
      MDX
      712104 Updated Nov 25, 2024
    • graphs

      Public
      Apache License 2.0
      0000 Updated Nov 15, 2024
    • An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
      Jupyter Notebook
      Apache License 2.0
      259000 Updated Nov 12, 2024
    • LLM training code for MosaicML foundation models
      Python
      Apache License 2.0
      548000 Updated Oct 24, 2024
    • nm-vllm

      Public archive
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Other
      6.1k26000 Updated Oct 11, 2024
    • mteb

      Public
      MTEB: Massive Text Embedding Benchmark
      Jupyter Notebook
      Apache License 2.0
      337001 Updated Oct 2, 2024
    • 🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
      Python
      Apache License 2.0
      28k9013 Updated Oct 1, 2024
    • AutoFP8

      Public
      Python
      Apache License 2.0
      24178113 Updated Oct 1, 2024
    • OmniQuant

      Public
      [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
      Python
      MIT License
      59001 Updated Sep 27, 2024
    • An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
      Python
      MIT License
      508000 Updated Sep 16, 2024
    • Supercharge Your Model Training
      Python
      Apache License 2.0
      435000 Updated Aug 27, 2024
    • MixEval

      Public
      NM fork of MixEval compatible with SparseAutoModel.
      Python
      41001 Updated Aug 20, 2024
    • mamba

      Public
      Mamba SSM architecture
      Python
      Apache License 2.0
      1.2k000 Updated Aug 12, 2024
    • Causal depthwise conv1d in CUDA, with a PyTorch interface
      Cuda
      BSD 3-Clause "New" or "Revised" License
      81000 Updated Aug 8, 2024