abacaj

💭

Writing more code

Anton Bacaj abacaj

Software engineer. Hacking on large language models

390 followers · 14 following

Highlights

Developer Program Member

Stars

pytorch / torchtitan

A PyTorch native platform for training generative AI models

Python · 4,579 stars · 569 forks · Updated Oct 22, 2025

Quentin-Anthony / torch-profiling-tutorial

Python · 511 stars · 27 forks · Updated Aug 6, 2025

huggingface / gpt-oss-recipes

Collection of scripts and notebooks for OpenAI's latest GPT OSS models

Jupyter Notebook · 463 stars · 49 forks · Updated Aug 25, 2025

xingchensong / TouchNet

A native-PyTorch library for large scale M-LLM (text/audio) training with tp/cp/dp.

Python · 216 stars · 22 forks · Updated Aug 6, 2025

OpenPipe / ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python · 7,652 stars · 588 forks · Updated Oct 22, 2025

docling-project / docling

Get your documents ready for gen AI

Python · 42,047 stars · 3,002 forks · Updated Oct 22, 2025

facebookresearch / swe-rl

[NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"

Python · 609 stars · 49 forks · Updated Mar 16, 2025

microsoft / Magma

[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents

Python · 1,828 stars · 144 forks · Updated Oct 4, 2025

SmallDoges / small-doge

Doge Family of Small Language Models

Python · 181 stars · 13 forks · Updated Aug 13, 2025

ahxt / mini-r1-zero

Python · 20 stars · Updated Feb 2, 2025

xlang-ai / Spider2

[ICLR 2025 Oral] Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

HTML · 613 stars · 93 forks · Updated Aug 6, 2025

Photoroom / datago

A natively parallel dataloader for Python, written in Rust. Serving data at GB/s speeds, while covering aspect ratio bucketing, crop and resize for image ML workloads.

Rust · 116 stars · 6 forks · Updated Oct 22, 2025

MadcowD / ell

A language model programming library.

Python · 5,838 stars · 353 forks · Updated Jun 5, 2025

THUDM / LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

Python · 507 stars · 32 forks · Updated Dec 31, 2024

zhangfaen / finetune-Qwen2-VL

Python · 375 stars · 42 forks · Updated Feb 8, 2025

BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

Python · 30,222 stars · 4,447 forks · Updated Oct 22, 2025