Author: Wu, Enhua

research-article

Learning self-target knowledge for few-shot segmentation

Pattern Recognition (PATT), Volume 149, Issue C, https://doi.org/10.1016/j.patcog.2024.110266

Abstract

Few-shot semantic segmentation uses a few annotated data of a specific class in the support set to segment the target of the same class in the query set. Most existing approaches fail to perform well when there are significant intra-class ...

Highlights

We propose a Query Prototype Generation Module to alleviate appearance discrepancy.
We propose a Support Auxiliary Refinement Module to mine more class information.
Extensive experiments prove the proposed model outperforms ...

editorial

Preface

Journal of Computer Science and Technology (JCST), Volume 39, Issue 2, Pages 267–268, https://doi.org/10.1007/s11390-024-0002-1

research-article

Dual Branch Multi-Level Semantic Learning for Few-Shot Segmentation

IEEE Transactions on Image Processing (TIP), Volume 33, Pages 1432–1447, https://doi.org/10.1109/TIP.2024.3364056

Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples in support images. Although progress has been made recently by combining prototype-based metric learning, existing methods still face ...

research-article

Spatial constraint for efficient semi-supervised video object segmentation

Computer Vision and Image Understanding (CVIU), Volume 237, Issue C, https://doi.org/10.1016/j.cviu.2023.103843

Abstract

Semi-supervised video object segmentation is the process of tracking and segmenting objects in a video sequence based on annotated masks for one or more frames. Recently, memory-based methods have attracted a significant amount of attention due ...

Highlights

Time-varying sensor and dynamic feature memory reduce redundancy but retain key data.
Efficient memory reader has smaller footprint and reduces computational overhead.
Spatial constraint module maintains response map to filter ...

research-article

Efficient Binocular Rendering of Volumetric Density Fields With Coupled Adaptive Cube-Map Ray Marching for Virtual Reality

IEEE Transactions on Visualization and Computer Graphics (ITVC), Volume 30, Issue 10, Pages 6625–6638, https://doi.org/10.1109/TVCG.2023.3322416

Creating visualizations of multiple volumetric density fields is demanding in virtual reality (VR) applications, which often include divergent volumetric density distributions mixed with geometric models and physics-based simulations. Real-time rendering ...

research-article

Boosting Video Object Segmentation via Robust and Efficient Memory Network

IEEE Transactions on Circuits and Systems for Video Technology (IEEETCSVT), Volume 34, Issue 5, Pages 3340–3352, https://doi.org/10.1109/TCSVT.2023.3321977

Recently, memory-based methods have exhibited remarkable performance in Video Object Segmentation (VOS) by employing non-local pixel-wise matching between the query and memory. Nevertheless, these methods suffer from two limitations: 1) Non-local pixel-...

research-article

Easy recognition of artistic Chinese calligraphic characters

The Visual Computer: International Journal of Computer Graphics (VISC), Volume 39, Issue 8, Pages 3755–3766, https://doi.org/10.1007/s00371-023-03026-2

Abstract

Chinese calligraphy is one of the excellent expressions of Chinese traditional art. But people without domain knowledge of calligraphy can hardly read, appreciate, or learn this art form, because it contains many brush strokes with unique shapes ...

research-article

3D human pose lifting with grid convolution

AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, Article No.: 123, Pages 1105–1113, https://doi.org/10.1609/aaai.v37i1.25192

Existing lifting networks for regressing 3D human poses from 2D single-view poses are typically constructed with linear layers based on graph-structured representation learning. In sharp contrast to them, this paper presents Grid Convolution (GridConv), ...

research-article

Video object segmentation through semantic visual words matching

Multimedia Tools and Applications (MTAA), Volume 82, Issue 13, Pages 19591–19605, https://doi.org/10.1007/s11042-023-14361-w

Abstract

Video object segmentation (VOS) has been widely used in the fields of computer vision. However, existing VOS algorithms have drawbacks, such as difficulty with object deformation, occlusion, and fast motion. We therefore propose an effective VOS ...

proceeding

VRCAI '22: Proceedings of the 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry

research-article

Redistribution of weights and activations for AdderNet quantization

NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 1652, Pages 22739–22751

Adder Neural Network (AdderNet) provides a new way for developing energy-efficient neural networks by replacing the expensive multiplications in convolution with cheaper additions (i.e., ℓ₁-norm). To achieve higher hardware efficiency, it is necessary to ...

research-article

Vision GNN: an image is worth graph of nodes

NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 603, Pages 8291–8303

Network architecture plays a key role in the deep learning-based computer vision system. The widely-used convolutional neural network and transformer treat the image as a grid or sequence structure, which is not flexible to capture irregular and complex ...

research-article

Meta-transfer-adjustment learning for few-shot learning

Journal of Visual Communication and Image Representation (JVCIR), Volume 89, Issue C, https://doi.org/10.1016/j.jvcir.2022.103678

Abstract

Deep neural network models with strong feature extraction capacity are prone to overfitting and fail to adapt quickly to new tasks with few samples. Gradient-based meta-learning approaches can minimize overfitting and adapt to new tasks fast, but ...

research-article

A Boundary-Aware Network for Shadow Removal

IEEE Transactions on Multimedia (TOM), Volume 25, Pages 6782–6793, https://doi.org/10.1109/TMM.2022.3214422

Shadow removal is a challenging computer vision and multimedia task that aims to restore image content in shadow regions. The state-of-the-art shadow removal methods introduce artifacts near shadow boundaries or inconsistencies between shadow and ...

Article

SlimFliud-Net: Fast Fluid Simulation Using Admm Pruning

Advances in Computer Graphics, Pages 582–593, https://doi.org/10.1007/978-3-031-23473-6_45

Abstract

While data-driven fluid simulation methods largely replace physics-based fluid solvers and achieve high-quality results, it remains a challenge to obtain sufficiently realistic effects in less time. The huge neural network models brought by the complexity ...

research-article

Spatio-temporal compression for semi-supervised video object segmentation

The Visual Computer: International Journal of Computer Graphics (VISC), Volume 39, Issue 10, Pages 4929–4942, https://doi.org/10.1007/s00371-022-02638-4

Abstract

In this paper, we explore the spatial–temporal redundancy in video object segmentation (VOS) under semi-supervised context with the purpose to improve the computational efficiency. Recently, memory-based methods have attracted great attention for ...

research-article

GhostNets on Heterogeneous Devices via Cheap Operations

International Journal of Computer Vision (IJCV), Volume 130, Issue 4, Pages 1050–1069, https://doi.org/10.1007/s11263-022-01575-y

Abstract

Deploying convolutional neural networks (CNNs) on mobile devices is difficult due to the limited memory and computation resources. We aim to design efficient neural networks for heterogeneous devices including CPU and GPU, by exploiting the ...

research-article

Viewport-Resolution Independent Anti-Aliased Ray Marching on Interior Faces in Cube-Map Space

SA '21: SIGGRAPH Asia 2021 Technical Communications, Article No.: 21, Pages 1–4, https://doi.org/10.1145/3478512.3488598

This paper presents a novel approach to anti-aliased ray marching by indirect shading in cube-map space. Our volume renderer firstly performs ray marching on each visible interior pixel of a maximum-resolution-limited cube map, and then resamples (...

research-article

Dynamic resolution network

NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems, Article No.: 2092, Pages 27319–27330

Deep convolutional neural networks (CNNs) are often of sophisticated design with numerous learnable parameters for the sake of accuracy. To alleviate the expensive costs of deploying them on mobile devices, recent works have made huge efforts for ...

research-article

Transformer in transformer

NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems, Article No.: 1217, Pages 15908–15919

Transformer is a new kind of neural architecture which encodes the input data as powerful features via the attention mechanism. Basically, the visual transformers first divide the input images into several local patches and then calculate both ...
