- research-article, February 2024
Pyramid Deep Fusion Network for Two-Hand Reconstruction From RGB-D Images
IEEE Transactions on Circuits and Systems for Video Technology (IEEETCSVT), Volume 34, Issue 7, Pages 5843–5855, https://doi.org/10.1109/TCSVT.2024.3369646
Accurately recovering the dense 3D mesh of both hands from monocular images poses considerable challenges due to occlusions and projection ambiguity. Most of the existing methods extract features from color images to estimate the root-aligned hand meshes, ...
- research-article, February 2024
Box2Mask: Box-Supervised Instance Segmentation via Level-Set Evolution
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 7, Pages 5157–5173, https://doi.org/10.1109/TPAMI.2024.3363054
In contrast to fully supervised methods using pixel-wise mask labels, box-supervised instance segmentation takes advantage of simple box annotations, which has recently attracted increasing research attention. This paper presents a novel single-shot ...
- research-article, May 2024
Label-efficient segmentation via affinity propagation
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 1302, Pages 29901–29913
Weakly-supervised segmentation with label-efficient sparse annotations has attracted increasing research attention as a way to reduce the cost of the laborious pixel-wise labeling process, while pairwise affinity modeling techniques play an essential role in this ...
- research-article, November 2023
SLED: Structure Learning based Denoising for Recommendation
- Shengyu Zhang,
- Tan Jiang,
- Kun Kuang,
- Fuli Feng,
- Jin Yu,
- Jianxin Ma,
- Zhou Zhao,
- Jianke Zhu,
- Hongxia Yang,
- Tat-Seng Chua,
- Fei Wu
ACM Transactions on Information Systems (TOIS), Volume 42, Issue 2, Article No.: 43, Pages 1–31, https://doi.org/10.1145/3611385
In recommender systems, click behaviors play a fundamental role in mining users’ interests and training models (clicked items as positive samples). Such signals are implicit feedback and are arguably less representative of users’ inherent interests. Most ...
- research-article, October 2023
Moiré Backdoor Attack (MBA): A Novel Trigger for Pedestrian Detectors in the Physical World
MM '23: Proceedings of the 31st ACM International Conference on Multimedia, Pages 8828–8838, https://doi.org/10.1145/3581783.3611910
A backdoor attack is executed by injecting a few poisoned samples into the training dataset of Deep Neural Networks (DNNs), enabling attackers to implant a hidden manipulation. This manipulation can be triggered during inference to exhibit controlled ...
- research-article, July 2023
Topology-preserved human reconstruction with details
The Visual Computer: International Journal of Computer Graphics (VISC), Volume 39, Issue 8, Pages 3609–3619, https://doi.org/10.1007/s00371-023-02957-0
Due to the high diversity and complexity of body shapes, it is challenging to directly estimate human geometry from a single image across various clothing styles. Most of the model-based approaches are limited to predicting the shape and pose ...
- research-article, July 2023
End-to-end weakly-supervised single-stage multiple 3D hand mesh reconstruction from a single RGB image
Computer Vision and Image Understanding (CVIU), Volume 232, Issue C, https://doi.org/10.1016/j.cviu.2023.103706
In this paper, we consider the challenging task of simultaneously locating and recovering multiple hands from a single 2D image. Previous studies either focus on single hand reconstruction or solve this problem in a multi-stage way. Moreover, the ...
Highlights:
  - A single-stage framework for multi-hand 3D reconstruction.
  - End-to-end training and requires no additional third-party detectors.
  - Multi-hand data generation under weak supervision.
  - Advantages in prediction accuracy and inference ...
- research-article, April 2023
Multiview Textured Mesh Recovery by Differentiable Rendering
IEEE Transactions on Circuits and Systems for Video Technology (IEEETCSVT), Volume 33, Issue 4, Pages 1684–1696, https://doi.org/10.1109/TCSVT.2022.3213543
Although they have achieved promising results on shape and color recovery through self-supervision, multi-layer perceptron-based methods usually suffer from heavy computational cost in learning the deep implicit surface representation. Since ...
- research-article, March 2023
Improving Nighttime Driving-Scene Segmentation via Dual Image-Adaptive Learnable Filters
IEEE Transactions on Circuits and Systems for Video Technology (IEEETCSVT), Volume 33, Issue 10, Pages 5855–5867, https://doi.org/10.1109/TCSVT.2023.3260240
Semantic segmentation on driving-scene images is vital for autonomous driving. Although encouraging performance has been achieved on daytime images, the performance on nighttime images is less satisfactory due to the insufficient exposure and the lack of ...
- research-article, February 2023
READ: large-scale neural scene rendering for autonomous driving
AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, Article No.: 169, Pages 1522–1529, https://doi.org/10.1609/aaai.v37i2.25238
With the development of advanced driver assistance systems (ADAS) and autonomous vehicles, conducting experiments in various scenarios becomes an urgent need. Although having been capable of synthesizing photo-realistic street scenes, conventional image-...
- Article, October 2022
Box-Supervised Instance Segmentation with Level Set Evolution
In contrast to the fully supervised methods using pixel-wise mask labels, box-supervised instance segmentation takes advantage of the simple box annotations, which has recently attracted a lot of research attention. In this paper, we propose a ...
- research-article, January 2022
Horizontal-to-Vertical Video Conversion
IEEE Transactions on Multimedia (TOM), Volume 24, Pages 3036–3048, https://doi.org/10.1109/TMM.2021.3092202
In this booming age of social media and mobile platforms, mass consumers are migrating from horizontal video to vertical content delivered on hand-held devices. Accordingly, revitalizing the exposure of horizontal video becomes vital and urgent, which is ...
- research-article, December 2020
DeepFacade: A Deep Learning Approach to Facade Parsing With Symmetric Loss
IEEE Transactions on Multimedia (TOM), Volume 22, Issue 12, Pages 3153–3165, https://doi.org/10.1109/TMM.2020.2971431
Parsing building facades into procedural grammars plays an important role in 3D building model generation tasks, which have long been desired in computer vision. Deep learning is a promising approach to facade parsing; however, a straightforward solution ...
- research-article, October 2020
DeVLBert: Learning Deconfounded Visio-Linguistic Representations
MM '20: Proceedings of the 28th ACM International Conference on Multimedia, Pages 4373–4382, https://doi.org/10.1145/3394171.3413518
In this paper, we propose to investigate the problem of out-of-domain visio-linguistic pretraining, where the pretraining data distribution differs from that of downstream data on which the pretrained model will be fine-tuned. Existing methods for this ...
- research-article, March 2020
Feature agglomeration networks for single stage face detection
Neurocomputing (NEUROC), Volume 380, Issue C, Pages 180–189, https://doi.org/10.1016/j.neucom.2019.10.087
Recent years have witnessed promising results from exploring deep convolutional neural networks for face detection. Despite remarkable progress, face detection in the wild remains challenging, especially when detecting faces at ...
- research-article, October 2019
AI Coach: Deep Human Pose Estimation and Analysis for Personalized Athletic Training Assistance
MM '19: Proceedings of the 27th ACM International Conference on Multimedia, Pages 374–382, https://doi.org/10.1145/3343031.3350910
Recent years have witnessed unprecedented growth in sport videos, as different types of sports activities can be widely observed (i.e., from professional athletics to personal fitness). Existing approaches by computer vision have predominantly ...
- demonstration, October 2019
AI Coach: Deep Human Pose Estimation and Analysis for Personalized Athletic Training Assistance
MM '19: Proceedings of the 27th ACM International Conference on Multimedia, Pages 2228–2230, https://doi.org/10.1145/3343031.3350609
Accurate pose analysis in sport videos helps users improve their skills. In this paper, we propose an AI coach system to provide personalized athletic training experiences for posture-wise sports activities, in which the training quality largely ...
- research-article, January 2019
Robust estimation of similarity transformation for visual object tracking
AAAI'19/IAAI'19/EAAI'19: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, Article No.: 1063, Pages 8666–8673, https://doi.org/10.1609/aaai.v33i01.33018666
Most existing correlation filter-based tracking approaches only estimate simple axis-aligned bounding boxes, and very few of them are capable of recovering the underlying similarity transformation. To tackle this challenging problem, in this paper, we ...
- research-article, April 2018
Noise-aware co-segmentation with local and global priors
Neurocomputing (NEUROC), Volume 287, Issue C, Pages 221–231, https://doi.org/10.1016/j.neucom.2018.02.018
Image segmentation is a long-standing challenge in image and video processing. Co-segmentation aims at discovering the common foreground object shared across an image set. Traditional co-segmentation methods usually assume that all images should ...
- research-article, April 2018
Temporally-adjusted correlation filter-based tracking
Neurocomputing (NEUROC), Volume 286, Issue C, Pages 121–129, https://doi.org/10.1016/j.neucom.2018.01.067
Recently, the discriminative correlation filter (DCF) has been widely studied and adopted in visual object tracking tasks. Since the convolution operation can be efficiently computed through the fast Fourier transform (FFT), DCF trackers achieve the outstanding ...