- research-article, November 2024
QR-CLIP: Introducing Explicit Knowledge for Location and Time Reasoning
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 20, Issue 11, Article No.: 358, Pages 1–22
https://doi.org/10.1145/3689638
This article focuses on reasoning about the location and time behind images. Given that pre-trained vision-language models (VLMs) exhibit excellent image and text understanding capabilities, most existing methods leverage them to match visual cues with ...
- research-article, November 2024
Structure recovery from single omnidirectional image with distortion-aware learning
Journal of King Saud University - Computer and Information Sciences (JKSUCIS), Volume 36, Issue 7
https://doi.org/10.1016/j.jksuci.2024.102151
Abstract: Recovering structures from images with 180° or 360° FoV is pivotal in computer vision and computational photography, particularly for VR/AR/MR and autonomous robotics applications. Due to varying distortions and the complexity of indoor scenes, ...
- research-article, February 2024
Toward Generalized Few-Shot Open-Set Object Detection
IEEE Transactions on Image Processing (TIP), Volume 33, Pages 1389–1402
https://doi.org/10.1109/TIP.2024.3364495
Open-set object detection (OSOD) aims to detect the known categories and reject unknown objects in a dynamic world, which has achieved significant attention. However, previous approaches only consider this problem in data-abundant conditions, while ...
- research-article, December 2023
Mirror world: creating digital twins of the space and persons from video streamings
The Visual Computer: International Journal of Computer Graphics (VISC), Volume 40, Issue 9, Pages 6689–6704
https://doi.org/10.1007/s00371-023-03193-2
Abstract: Creating digital twins of scenes has been widely studied in smart city applications. The development of 3D virtual games attracts many researchers. However, most products have high costs, neglect 3D recovery of the moving people, and lack ...
- research-article, November 2023
Recommending Analogical APIs via Knowledge Graph Embedding
ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Pages 1496–1508
https://doi.org/10.1145/3611643.3616305
Library migration, which replaces the current library with a different one to retain the same software behavior, is common in software evolution. An essential part of this is finding an analogous API for the desired functionality. However, due to the ...
- research-article, November 2023
Exploring Scale-Aware Features for Real-Time Semantic Segmentation of Street Scenes
IEEE Transactions on Intelligent Transportation Systems (ITS-TRANSACTIONS), Volume 25, Issue 5, Pages 3575–3587
https://doi.org/10.1109/TITS.2023.3330498
Real-time semantic segmentation of street scenes is an essential and challenging task for autonomous driving systems, which needs to achieve both high accuracy and efficiency. Moreover, numerous objects and stuff at different scales in street scenes ...
- research-article, November 2023
Hybrid semantic segmentation for tunnel lining cracks based on Swin Transformer and convolutional neural network
Computer-Aided Civil and Infrastructure Engineering (MICE), Volume 38, Issue 17, Pages 2491–2510
https://doi.org/10.1111/mice.13003
Abstract: In the field of tunnel lining crack identification, the semantic segmentation algorithms based on convolution neural network (CNN) are extensively used. Owing to the inherent locality of CNN, these algorithms cannot make full use of context ...
- research-article, October 2023
HSIC-based Moving Weight Averaging for Few-Shot Open-Set Object Detection
MM '23: Proceedings of the 31st ACM International Conference on Multimedia, Pages 5358–5369
https://doi.org/10.1145/3581783.3611850
We study the problem of few-shot open-set object detection (FOOD), whose goal is to quickly adapt a model to a small set of labeled samples and reject unknown class samples. Recent works usually use the weight sparsification for unknown rejection, but ...
- research-article, October 2023
VirtualLoc: Large-scale Visual Localization Using Virtual Images
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 20, Issue 3, Article No.: 66, Pages 1–19
https://doi.org/10.1145/3622788
Robust and accurate camera pose estimation is fundamental in computer vision. Learning-based regression approaches acquire six-degree-of-freedom camera parameters accurately from visual cues of an input image. However, most are trained on street-view and ...
- research-article, October 2023
Geometric-driven structure recovery from a single omnidirectional image based on planar depth map learning
Neural Computing and Applications (NCAA), Volume 35, Issue 34, Pages 24407–24433
https://doi.org/10.1007/s00521-023-09025-7
Abstract: Scene structure recovery is a crucial process for assisting scene reconstruction and understanding by extracting vital scene structure information and has been widely used in smart city, VR/AR and intelligent robot navigation. Omnidirectional ...
- research-article, March 2023
Center-point-pair detection and context-aware re-identification for end-to-end multi-object tracking
Abstract: Online multi-object tracking aims at generating the trajectories for multiple objects in the surveillance scene. It remains a challenging problem in crowded scenes because objects often gather together and occlude in tracking frames. ...
- research-article, September 2023
Context and Spatial Feature Calibration for Real-Time Semantic Segmentation
IEEE Transactions on Image Processing (TIP), Volume 32, Pages 5465–5477
https://doi.org/10.1109/TIP.2023.3318967
Context modeling or multi-level feature fusion methods have been proved to be effective in improving semantic segmentation performance. However, they are not specialized to deal with the problems of pixel-context mismatch and spatial feature misalignment, ...
- research-article, October 2022
Mutual purification for unsupervised domain adaptation in person re-identification
Neural Computing and Applications (NCAA), Volume 34, Issue 19, Pages 16929–16944
https://doi.org/10.1007/s00521-022-07340-z
Abstract: Unsupervised domain adaptation person re-identification aims to adapt the model learned on a labeled source domain to an unlabeled target domain. It has attracted extensively attention in the computer vision community due to its important ...
- research-article, August 2022
Part-Level Car Parsing and Reconstruction in Single Street View Images
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 44, Issue 8, Pages 4291–4305
https://doi.org/10.1109/TPAMI.2021.3064837
Part information has been proven to be resistant to occlusions and viewpoint changes, which are main difficulties in car parsing and reconstruction. However, in the absence of datasets and approaches incorporating car parts, there are limited works that ...
- research-article, April 2022
Automatic detection method of tunnel lining multi‐defects via an enhanced You Only Look Once network
Computer-Aided Civil and Infrastructure Engineering (MICE), Volume 37, Issue 6, Pages 762–780
https://doi.org/10.1111/mice.12836
Abstract: Aiming to solve the challenges of low detection accuracy, poor anti‐interference ability, and slow detection speed in the traditional tunnel lining defect detection methods, a novel deep learning‐based model, named You Only Look Once network v4 ...
- Article, November 2021
Diabetic Retinopathy Grading Base on Contrastive Learning and Semi-supervised Learning
Abstract: The diabetic retinopathy (DR) detection based on deep learning is a powerful tool for early screening of DR. Although several automatic DR grading algorithms have been proposed, their performance is still limited by the characteristics of DR ...
- Article, November 2021
Online Multi-Object Tracking with Pose-Guided Object Location and Dual Self-Attention Network
PRICAI 2021: Trends in Artificial Intelligence, Pages 223–235
https://doi.org/10.1007/978-3-030-89370-5_17
Abstract: The recent trend in Multi-Object Tracking (MOT) is heading towards using deep learning to detect objects and extract features. Although tracking frameworks using detection network have achieved outstanding performance in object locating on MOT, it ...
- Article, November 2021
Asymmetric Mutual Learning for Unsupervised Cross-Domain Person Re-identification
PRICAI 2021: Trends in Artificial Intelligence, Pages 124–137
https://doi.org/10.1007/978-3-030-89370-5_10
Abstract: Unsupervised domain adaptation in person re-identification is a challenging task. The performance of models trained on a specific domain generally degrades significantly on other domains due to the domain gaps. State-of-the-art clustering-based ...