DOI:
10.1007/978-3-031-73383-3
Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XXVII
2024 Proceeding
  • Editors:
  • Aleš Leonardis,
  • Elisa Ricci,
  • Stefan Roth,
  • Olga Russakovsky,
  • Torsten Sattler,
  • Gül Varol
Publisher:
  • Springer-Verlag
  • Berlin, Heidelberg
Conference:
European Conference on Computer Vision, Milan, Italy, 29 September 2024
ISBN:
978-3-031-73382-6
Published:
14 November 2024

Front Matter
Pages i–lxxxv
Back Matter
Article
SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking
Abstract

Open-vocabulary Multiple Object Tracking (MOT) aims to generalize trackers to novel categories not in the training set. Currently, the best-performing methods are mainly based on pure appearance matching. Due to the complexity of motion patterns ...

Article
Tensorial Template Matching for Fast Cross-Correlation with Rotations and Its Application for Tomography
Abstract

Object detection is a main task in computer vision. Template matching is the reference method for detecting objects with arbitrary templates. However, template matching computational complexity depends on the rotation accuracy, being a limiting ...

Article
FreeAugment: Data Augmentation Search Across All Degrees of Freedom
Abstract

Data augmentation has become an integral part of deep learning, as it is known to improve the generalization capabilities of neural networks. Since the most effective set of image transformations differs between tasks and domains, automatic data ...

Article
Learning Representations of Satellite Images From Metadata Supervision
Abstract

Self-supervised learning is increasingly applied to Earth observation problems that leverage satellite and other remotely sensed data. Within satellite imagery, metadata such as time and location often hold significant semantic information that ...

Article
I²-SLAM: Inverting Imaging Process for Robust Photorealistic Dense SLAM
Abstract

We present an inverse image-formation module that can enhance the robustness of existing visual SLAM pipelines for casually captured scenarios. Casual video captures often suffer from motion blur and varying appearances, which degrade the final ...

Article
FlashTex: Fast Relightable Mesh Texturing with LightControlNet
Abstract

Manually creating textures for 3D meshes is time-consuming, even for expert visual content creators. We propose a fast approach for automatically texturing an input 3D mesh based on a user-provided text prompt. Importantly, our approach ...

Article
GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence
Abstract

Category-level pose estimation is a challenging task with many potential applications in computer vision and robotics. Recently, deep-learning-based approaches have made great progress, but are typically hindered by the need for large datasets of ...

Article
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling
Abstract

Recognizing and disentangling visual attributes from objects is a foundation to many computer vision applications. While large vision-language representations like CLIP had largely resolved the task of zero-shot object recognition, zero-shot ...

Article
PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-View Self-guidance
Abstract

Immersive scene generation, notably panorama creation, benefits significantly from the adaptation of large pre-trained text-to-image (T2I) models for multi-view image generation. Due to the high cost of acquiring multi-view images, tuning-free ...

Article
SOS: Segment Object System for Open-World Instance Segmentation with Object Priors
Abstract

We propose an approach for Open-World Instance Segmentation (OWIS), a task that aims to segment arbitrary unknown objects in images by generalizing from a limited set of annotated object classes during training. Our Segment Object System (SOS) ...

Article
Lagrangian Hashing for Compressed Neural Field Representations
Abstract

We present Lagrangian Hashing, a representation for neural fields combining the characteristics of fast training NeRF methods that rely on Eulerian grids (i.e. InstantNGP), with those that employ points equipped with features as a way to ...

Article
EDformer: Transformer-Based Event Denoising Across Varied Noise Levels
Abstract

Currently, there is relatively limited research on the background activity noise of event cameras in different brightness conditions, and the relevant real-world datasets are extremely scarce. This limitation contributes to the lack of robustness ...

Article
Foster Adaptivity and Balance in Learning with Noisy Labels
Abstract

Label noise is ubiquitous in real-world scenarios, posing a practical challenge to supervised models due to its effect in hurting the generalization performance of deep neural networks. Existing methods primarily employ the sample selection ...

Article
MetaAug: Meta-data Augmentation for Post-training Quantization
Abstract

Post-Training Quantization (PTQ) has received significant attention because it requires only a small set of calibration data to quantize a full-precision model, which is more practical in real-world applications in which full access to a large ...

Article
Thermal3D-GS: Physics-Induced 3D Gaussians for Thermal Infrared Novel-View Synthesis
Abstract

Novel-view synthesis based on visible light has been extensively studied. In comparison to visible light imaging, thermal infrared imaging offers the advantage of all-weather imaging and strong penetration, providing increased possibilities for ...

Article
Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach
Abstract

In this paper, we construct a large-scale benchmark dataset for Ground-to-Aerial Video-based person Re-Identification, named G2A-VReID, which comprises 185,907 images and 5,576 tracklets, featuring 2,788 distinct identities. To our knowledge, this ...

Article
Unleashing the Power of Prompt-Driven Nucleus Instance Segmentation
Abstract

Nucleus instance segmentation in histology images is crucial for a broad spectrum of clinical applications. Current dominant algorithms rely on regression of nuclear proxy maps. Distinguishing nucleus instances from the estimated maps requires ...

Article
Gaze Target Detection Based on Head-Local-Global Coordination
Abstract

This paper introduces a novel approach to gaze target detection leveraging a head-local-global coordination framework. Unlike traditional methods that rely heavily on estimating gaze direction and identifying salient objects in global view images, ...

Article
3DSA: Multi-view 3D Human Pose Estimation With 3D Space Attention Mechanisms
Abstract

In this study, we introduce the 3D space attention module (3DSA) as a novel approach to address the drawback of multi-view 3D human pose estimation methods, which fail to recognize the object’s significance from diverse viewpoints. Specifically, ...

Article
Toward Tiny and High-Quality Facial Makeup with Data Amplify Learning
Abstract

Contemporary makeup approaches primarily hinge on unpaired learning paradigms, yet they grapple with the challenges of inaccurate supervision (e.g., face misalignment) and sophisticated facial prompts (including face parsing, and landmark ...

Article
An Economic Framework for 6-DoF Grasp Detection
Abstract

Robotic grasping in clutters is a fundamental task in robotic manipulation. In this work, we propose an economic framework for 6-DoF grasp detection, aiming to economize the resource cost in training and meanwhile maintain effective grasp ...

Article
GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
Abstract

3D semantic occupancy prediction aims to obtain 3D fine-grained geometry and semantics of the surrounding scene and is an important task for the robustness of vision-centric autonomous driving. Most existing methods employ dense grids such as ...

Article
Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
Abstract

Personalized text-to-image models allow users to generate varied styles of images (specified with a sentence) for an object (specified with a set of reference images). While remarkable results have been achieved using diffusion-based generation ...

Article
AdaLog: Post-training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
Abstract

Vision Transformer (ViT) has become one of the most prevailing fundamental backbone networks in the computer vision community. Despite the high accuracy, deploying it in real applications raises critical challenges including the high computational ...

Article
Multi-label Cluster Discrimination for Visual Representation Learning
Abstract

Contrastive Language Image Pre-training (CLIP) has recently demonstrated success across various tasks due to superior feature representation empowered by image-text contrastive learning. However, the instance discrimination method used by CLIP can ...

Article
Plan, Posture and Go: Towards Open-Vocabulary Text-to-Motion Generation
Abstract

Conventional text-to-motion generation methods are usually trained on limited text-motion pairs, making them hard to generalize to open-vocabulary scenarios. Some works use the CLIP model to align the motion space and the text space, aiming to ...

Article
DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion
Abstract

Infrared-visible object detection aims to achieve robust even full-day object detection by fusing the complementary information of infrared and visible images. However, highly dynamically variable complementary characteristics and commonly ...

Contributors
  • University of Birmingham
  • Bruno Kessler Foundation
  • Technical University of Darmstadt
  • Princeton University
  • Czech Technical University in Prague
  • National School of Bridges and Roads