- VSLab@NTHU; MediaTek; AILabs
- Taipei, Taiwan
- https://albert100121.github.io/
- @Albert_NH_Wang
- in/ning-hsu-albert-wang
Stars
[NeurIPS 2024] NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
High-Resolution Image Synthesis with Latent Diffusion Models
Command-line program to download videos from YouTube.com and other video sites
Character Animation (AnimateAnyone, Face Reenactment)
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
Official implementation of "DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion"
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Code Repository for Machine Learning with PyTorch and Scikit-Learn
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
Stable Diffusion web UI
Hackable and optimized Transformers building blocks, supporting a composable construction.
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV2023).
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Official toolkit for Multi-View Layout Estimation Challenge in OmniCV workshop at CVPR'23.
Instant neural graphics primitives: lightning fast NeRF and more
Python optical flow visualization following Baker et al. (ICCV 2007) as used by the MPI-Sintel challenge
Datasets, Transforms and Models specific to Computer Vision
TensorFlow implementation of our end-to-end model to recover 3D layouts. Also with equirectangular convolutions!