Stars
A Python module to repair invalid JSON from LLMs
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
[PR 2025] DocAligner: Automating the Annotation of Photographed Documents Through Real-virtual Alignment
[CVPR 2023] DKM: Dense Kernelized Feature Matching for Geometry Estimation
Document Dewarping with Control Points
Code for the paper "UVDoc: Neural Grid-based Document Unwarping"
[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
The official repo for “DocScanner: Robust Document Image Rectification with Progressive Learning”, IJCV, 2025.
✨✨ Latest Advances on Multimodal Large Language Models
A lightweight LMM-based Document Parsing Model
React + Vue Search UI for Elasticsearch & OpenSearch. Compatible with Algolia's InstantSearch and Autocomplete components.
A site to instantly search 32M songs from the MusicBrainz songs database, using Typesense Search (an open source alternative to Algolia / ElasticSearch) ⚡ 🎵 🔍
A demo app that shows you how to use Vue & the Typesense InstantSearch adapter, to build rich search interfaces.
Search UI components for React and Vue
Handwritten Text Recognition and Character Detection
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]
DeepSparkInference has selected 216 inference models of both small and large sizes. The small models cover fields such as computer vision, natural language processing, and speech recognition; the L…
Transformer-based OCR, extended to official seal (stamp) recognition
Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences
An open-source AI content search engine designed specifically for content creators. Supports extraction of text, images, and short videos. Allows full local deployment (web app, RAG server, LLM ser…
Latest Advances on System-2 Reasoning
Solve Visual Understanding with Reinforced VLMs
[ICLR'24] Recursive Generalization Transformer for Image Super-Resolution
[FG 2025] Official implementation of the paper 'Representation Learning and Identity Adversarial Training for Facial Behavior Understanding'