- Seoul, Korea
- https://bittersweet.ai
- in/codertimo
- @codertimo
Stars
Unofficial Node.js API client for the Caret HTTP API
Turn any computer or edge device into a command center for your computer vision projects.
An open-source AI agent that brings the power of Gemini directly into your terminal.
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
[ICLR 2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
LLM training code for Databricks foundation models
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Implementation of Flash Attention in Jax
Making large AI models cheaper, faster and more accessible
Exercise code for Zero To Production In Rust
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
A latent text-to-image diffusion model
min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
Everything you want to know about Google Cloud TPU
DALL·E Mini - Generate images from a text prompt
Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)
PyTorch Implementation of the Paper "Efficient Training of Retrieval Models using Negative Cache"
Training and serving large-scale neural networks with auto parallelization.
Go bindings to the TensorRT C API for running inference with pre-trained models in Go