research-article
Causal Inference with Knowledge Distilling and Curriculum Learning for Unbiased VQA
Article No.: 67, Pages 1–23, https://doi.org/10.1145/3487042

Recently, many Visual Question Answering (VQA) models rely on the correlations between questions and answers yet neglect those between the visual information and the textual information. They would perform badly if the handled data distribute differently ...

research-article
Interactive Re-ranking via Object Entropy-Guided Question Answering for Cross-Modal Image Retrieval
Article No.: 68, Pages 1–17, https://doi.org/10.1145/3485042

Cross-modal image-retrieval methods retrieve desired images from a query text by learning relationships between texts and images. Such a retrieval approach is one of the most effective ways of making query preparation easy. Recent cross-...

research-article
Shuffle-invariant Network for Action Recognition in Videos
Article No.: 69, Pages 1–18, https://doi.org/10.1145/3485665

The local key features in video are important for improving the accuracy of human action recognition. However, most end-to-end methods focus on global feature learning from videos, while few works consider the enhancement of the local information in a ...

research-article
Learning Adaptive Spatial-Temporal Context-Aware Correlation Filters for UAV Tracking
Article No.: 70, Pages 1–18, https://doi.org/10.1145/3486678

Tracking in the unmanned aerial vehicle (UAV) scenarios is one of the main components of target-tracking tasks. Different from the target-tracking task in the general scenarios, the target-tracking task in the UAV scenarios is very challenging because of ...

research-article
Enhanced 3D Shape Reconstruction With Knowledge Graph of Category Concept
Article No.: 71, Pages 1–20, https://doi.org/10.1145/3491224

Reconstructing three-dimensional (3D) objects from images has attracted increasing attention due to its wide applications in computer vision and robotic tasks. Despite the promising progress of recent deep learning–based approaches, which directly ...

research-article
Domain-invariant Graph for Adaptive Semi-supervised Domain Adaptation
Article No.: 72, Pages 1–18, https://doi.org/10.1145/3487194

Domain adaptation aims to generalize a model from a source domain to tackle tasks in a related but different target domain. Traditional domain adaptation algorithms assume that enough labeled data, which are treated as the prior knowledge, are available in ...

research-article
Objective Object Segmentation Visual Quality Evaluation: Quality Measure and Pooling Method
Article No.: 73, Pages 1–19, https://doi.org/10.1145/3491229

Objective object segmentation visual quality evaluation is an emergent member of the visual quality assessment family. It aims to develop an objective measure instead of a subjective survey to evaluate the object segmentation quality in agreement with ...

research-article
CRAR: Accelerating Stereo Matching with Cascaded Residual Regression and Adaptive Refinement
Article No.: 74, Pages 1–19, https://doi.org/10.1145/3488719

Dense stereo matching estimates the depth for each pixel of the referenced images. Recently, deep learning algorithms have dramatically promoted the development of stereo matching. The state-of-the-art result is achieved by models adopting deep ...

research-article
Recognizing Gaits Across Walking and Running Speeds
Article No.: 75, Pages 1–22, https://doi.org/10.1145/3488715

For decades, very few methods were proposed for cross-mode (i.e., walking vs. running) gait recognition. Thus, how to recognize persons by the way they walk and run remains largely unexplored. Existing cross-mode methods handle the walking-...

research-article
Inner Knowledge-based Img2Doc Scheme for Visual Question Answering
Article No.: 76, Pages 1–21, https://doi.org/10.1145/3489142

Visual Question Answering (VQA) is a research topic of significant interest at the intersection of computer vision and natural language understanding. Recent research indicates that attributes and knowledge can effectively improve performance for both ...

research-article
Matching Faces and Attributes Between the Artistic and the Real Domain: the PersonArt Approach
Article No.: 77, Pages 1–23, https://doi.org/10.1145/3490033

In this article, we present an approach for retrieving similar faces between the artistic and the real domain. The application we refer to is an interactive exhibition inside a museum, in which a visitor can take a photo of himself and search for a ...

research-article
A Multimodal Framework for Large-Scale Emotion Recognition by Fusing Music and Electrodermal Activity Signals
Article No.: 78, Pages 1–23, https://doi.org/10.1145/3490686

Considerable attention has been paid to physiological signal-based emotion recognition in the field of affective computing. For reliability and user-friendly acquisition, electrodermal activity (EDA) has a great advantage in practical applications. ...

research-article
GraSP: Local Grassmannian Spatio-Temporal Patterns for Unsupervised Pose Sequence Recognition
Article No.: 79, Pages 1–23, https://doi.org/10.1145/3491227

Many applications of action recognition, especially in broad domains like surveillance or anomaly detection, favor unsupervised methods because exhaustive labeling of actions is not possible. However, very little work has been done in this domain. ...

research-article
Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition
Article No.: 80, Pages 1–24, https://doi.org/10.1145/3491228

Action recognition has been a hot topic in computer vision because of its wide application in vision systems. Previous approaches achieve improvement by fusing the modalities of the skeleton sequence and RGB video. However, such methods pose a dilemma between ...

research-article
Distributed Gateway Selection for Video Streaming in VANET Using IP Multicast
Article No.: 81, Pages 1–24, https://doi.org/10.1145/3491388

The volume of video traffic delivered as an infotainment service over vehicular ad hoc networks (VANETs) has rapidly increased over the past few years. Providing video streaming as a VANET infotainment service is very challenging because of high mobility and heterogeneity of ...

research-article
Multilayer Video Encoding for QoS Managing of Video Streaming in VANET Environment
Article No.: 82, Pages 1–19, https://doi.org/10.1145/3491433

Efficient delivery and maintenance of the quality of service (QoS) of audio/video streams transmitted over VANETs for mobile and heterogeneous nodes is one of the major challenges in the convergence of this network type and these services. In this ...

research-article
When Pairs Meet Triplets: Improving Low-Resource Captioning via Multi-Objective Optimization
Article No.: 83, Pages 1–20, https://doi.org/10.1145/3492325

Image captioning for low-resource languages has attracted much attention recently. Researchers propose to augment the low-resource caption dataset into (image, rich-resource language, and low-resource language) triplets and develop the dual attention ...

research-article
Improving Crowd Density Estimation by Fusing Aerial Images and Radio Signals
Article No.: 84, Pages 1–23, https://doi.org/10.1145/3492346

A recent line of research focuses on crowd density estimation from RGB images for a variety of applications, for example, surveillance and traffic flow control. The performance drops dramatically for low-quality images, such as occlusion, or poor light ...

research-article
A Format-compatible Searchable Encryption Scheme for JPEG Images Using Bag-of-words
Article No.: 85, Pages 1–18, https://doi.org/10.1145/3492705

The development of cloud computing attracts enterprises and individuals to outsource their data, such as images, to the cloud server. However, direct outsourcing raises widespread concern about privacy leakage, as images often contain rich sensitive ...

research-article
Blockchain-Based Audio Watermarking Technique for Multimedia Copyright Protection in Distribution Networks
Article No.: 86, Pages 1–23, https://doi.org/10.1145/3492803

Copyright protection in multimedia protection distribution is a challenging problem. To protect multimedia data, many watermarking methods have been proposed in the literature. However, most of them cannot be used effectively in a multimedia distribution ...

research-article
Deep Illumination-Enhanced Face Super-Resolution Network for Low-Light Images
Article No.: 87, Pages 1–19, https://doi.org/10.1145/3495258

Face images are typically a key component in the fields of security and criminal investigation. However, due to lighting and shooting angles, faces taken under low-light conditions are often difficult to recognize. Face super-resolution (FSR) technology ...

research-article
Scribble-Supervised Meibomian Glands Segmentation in Infrared Images
Article No.: 88, Pages 1–23, https://doi.org/10.1145/3497747

Infrared imaging is currently the most effective clinical method to evaluate the morphology of the meibomian glands (MGs) in patients. As an important indicator for monitoring the development of MG dysfunction, it is necessary to accurately measure gland-...

survey
Towards Integrating Image Encryption with Compression: A Survey
Article No.: 89, Pages 1–21, https://doi.org/10.1145/3498342

As digital images are consistently generated and transmitted online, the unauthorized utilization of these images is an increasing concern that has a significant impact on both security and privacy issues; additionally, the representation of digital ...
