- NVIDIA
- Beijing
Starred repositories
Intelligent Router for Mixture-of-Models
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.
An implementation of the 5th Edition game system for Foundry Virtual Tabletop (http://foundryvtt.com).
Diplomacy: DATC-Compliant Game Engine with Web Interface
Frontier Models playing the board game Diplomacy.
Implementation for FP8/INT8 Rollout for RL training without performance drop.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
slime is an LLM post-training framework for RL Scaling.
The AI coding agent built for the terminal.
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team, Alibaba Cloud.
MAGI-1: Autoregressive Video Generation at Scale
SkyRL: A Modular Full-stack RL Library for LLMs
[NeurIPS 2025] Radial Attention: O(nlogn) Sparse Attention with Energy Decay for Long Video Generation
An open-source AI agent that brings the power of Gemini directly into your terminal.
SWE-bench: Can Language Models Resolve Real-world Github Issues?
Production-ready platform for agentic workflow development.
A safetensors extension to efficiently store sparse quantized tensors on disk
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
Lightweight coding agent that runs in your terminal
Awesome things towards foundation agents. Papers / Repos / Blogs / ...