Starred repositories
Automate uploading videos to social media: Douyin, Xiaohongshu, WeChat Channels, TikTok, YouTube, Bilibili
Cappuccino is a GUI Agent based on the desktop screen. It is a Manus-like AI Agent that can be deployed locally.
Fully open reproduction of DeepSeek-R1
This repository is maintained to release datasets and models for multimodal puzzle reasoning.
This repository automatically updates a list of the top 100 repositories related to ComfyUI based on the number of stars on GitHub.
[ICLR 2024] DiffTactile: A Physics-based Differentiable Tactile Simulator for Contact-rich Robotic Manipulation
BizyAir: Comfy Nodes that can run in any environment.
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
Tracking medical datasets, with a focus on medical imaging
YouTube video to chords, lyrics, beat and melody.
A command line interface to download PDF files from https://arxiv.org.
Train high-quality text-to-image diffusion models in a data & compute efficient manner
MINT-1T: A one trillion token multimodal interleaved dataset.
WildEval / ZeroEval
Forked from allenai/WildBench
A simple unified framework for evaluating LLMs
Universal memory layer for AI Agents; Announcing OpenMemory MCP - local and secure memory management.
A collection of benchmarks and datasets for evaluating LLMs.
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
Repository for collecting and categorizing papers outlined in our survey paper: "Large Language Models on Tabular Data -- A Survey".
The latest version of the abcmidi package is found on https://ifdo.ca/~seymour/runabc/top.html
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.