Author: Li, Wanqing

Article

Frequency-Domain Transformation-Based Dynamic Gesture Recognition with Skeleton

Pattern Recognition and Computer Vision, Pages 173–185, https://doi.org/10.1007/978-981-97-8502-5_13

Abstract

Graph convolutional networks (GCNs) have been widely used in skeleton-based hand gesture recognition due to strong ability in mining non-Euclidean features. However, GCNs cannot effectively extract long temporal information. To address this issue, ...

research-article

Improving smartphone GNSS positioning in challenging urban environments using GA-BPNN

GPS Solutions (SPGPS), Volume 29, Issue 1, https://doi.org/10.1007/s10291-024-01756-x

Abstract

Smartphones have become the mainstream terminals in the field of location services due to their low cost, portability, and ubiquity. In highly dynamic situations, the challenging urban environment causes the received Global Navigation Satellite ...

Article

Extractive Question Answering with Contrastive Puzzles and Reweighted Clues

Document Analysis and Recognition - ICDAR 2024, Pages 97–112, https://doi.org/10.1007/978-3-031-70552-6_6

Abstract

The task of Extractive Question Answering (EQA) involves identifying correct answer spans in response to provided questions and passages. The emergence of Pretrained Language Models (PLMs) has sparked increased interest in leveraging these models ...

Article

ConClue: Conditional Clue Extraction for Multiple Choice Question Answering

Document Analysis and Recognition - ICDAR 2024, Pages 183–198, https://doi.org/10.1007/978-3-031-70552-6_11

Abstract

The task of Multiple Choice Question Answering (MCQA) aims to identify the correct answer from a set of candidates, given a background passage and an associated question. Considerable research efforts have been dedicated to addressing this task, ...

research-article

Static graph convolution with learned temporal and channel-wise graph topology generation for skeleton-based action recognition

Computer Vision and Image Understanding (CVIU), Volume 244, Issue C, https://doi.org/10.1016/j.cviu.2024.104012

Abstract

Graph convolutional networks (GCNs) are widely used in skeleton-based action recognition. It is known that the graph topology is a vital part in GCNs, and different kinds of graph topologies have been proposed for skeleton-based action ...

Highlights

Temporal frame-wise and channel-wise topology based GCNs (TC-GCNs) are developed instead of using a predefined topology.
The proposed TC-GCNs can be integrated with the conventional dynamic graph to improve performance.
Extensive ...

research-article

DFN: A deep fusion network for flexible single and multi-modal action recognition

Expert Systems with Applications: An International Journal (EXWA), Volume 245, Issue C, https://doi.org/10.1016/j.eswa.2024.123145

Abstract

Multi-modal action recognition methods can be generally classified into two categories: (1) fusing multi-modal features with simple concatenation or fusing the classification scores of individual modalities without considering the interaction ...

Highlights

End-to-end trainable deep fusion network (DFN) for action recognition.
DFN outperforms the commonly used fusion methods.
DFN performs better than single modality cases when one modality is missing.
Competitive performance ...

research-article

Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 20, Issue 8, Article No.: 250, Pages 1–19, https://doi.org/10.1145/3663570

Monocular depth estimation aims to infer a depth map from a single image. Although supervised learning-based methods have achieved remarkable performance, they generally rely on a large amount of labor-intensively annotated data. Self-supervised methods, ...

research-article

An attention-based CNN for automatic whole-body postural assessment

Expert Systems with Applications: An International Journal (EXWA), Volume 238, Issue PF, https://doi.org/10.1016/j.eswa.2023.122391

Abstract

Fully automatic postural assessment is highly useful, but has been challenging. Conventional methods either require manual assessment by ergonomists or depend on special devices that are intrusive, thus being hardly feasible in daily activities ...

Highlights

A novel attention-based CNN for automatic whole-body postural assessment.
The network works directly on single color images rather than 3D skeletons.
A new multi-view and multi-modality dataset is created for postural assessment ...

research-article

mmHSV: In-Air Handwritten Signature Verification via Millimeter-Wave Radar

ACM Transactions on Internet of Things (TIOT), Volume 4, Issue 4, Article No.: 27, Pages 1–22, https://doi.org/10.1145/3614443

Electronic signatures are widely used in financial business, telecommuting, and identity authentication. Offline electronic signatures are vulnerable to copy or replay attacks. Contact-based online electronic signatures are limited by indirect contact ...

research-article

A consensus model under framework of prospect theory with acceptable adjustment and endo-confidence

Information Fusion (INFU), Volume 97, Issue C, https://doi.org/10.1016/j.inffus.2023.101808

Highlights

Considering the dynamic reference in prospect theory to reflect the idea of DMs.

Abstract

Consensus is an important issue in group decision making to make a reliable and scientific decision, and it has become a hot topic recently. Due to the complexity and uncertainty of decision-making problems, several aspects of ...

research-article

Modeling Long-range Dependencies and Epipolar Geometry for Multi-view Stereo

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 19, Issue 6, Article No.: 200, Pages 1–17, https://doi.org/10.1145/3596445

This article proposes a network, referred to as Multi-View Stereo TRansformer (MVSTR) for depth estimation from multi-view images. By modeling long-range dependencies and epipolar geometry, the proposed MVSTR is capable of extracting dense features with ...

research-article

Neural network model based on global and local features for multi-view mammogram classification

Neurocomputing (NEUROC), Volume 536, Issue C, Pages 21–29, https://doi.org/10.1016/j.neucom.2023.03.028

Abstract

Mammography is an important screening criterion for breast cancer, one of the major diseases causing numerous deaths among female patients. Meanwhile, manual diagnosis of mammography is a time-consuming and labor-consuming job. ...

research-article

Novel View Synthesis from a Single Unposed Image via Unsupervised Learning

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 19, Issue 6, Article No.: 186, Pages 1–23, https://doi.org/10.1145/3587467

Novel view synthesis aims to generate novel views from one or more given source views. Although existing methods have achieved promising performance, they usually require paired views with different poses to learn a pixel transformation. This article ...

research-article

Sign language recognition via dimensional global–local shift and cross-scale aggregation

Neural Computing and Applications (NCAA), Volume 35, Issue 17, Pages 12481–12493, https://doi.org/10.1007/s00521-023-08380-9

Abstract

Sign languages generally consist of a sequence of upper body gestures and are cooperative processes among various parts such as the hands, arms, and face. Therefore, the dynamics of the parts as well as the holistic appearance of the upper body ...

Article

Learning Using Privileged Information for Zero-Shot Action Recognition

Computer Vision – ACCV 2022, Pages 347–362, https://doi.org/10.1007/978-3-031-26316-3_21

Abstract

Zero-Shot Action Recognition (ZSAR) aims to recognize video actions that have never been seen during training. Most existing methods assume a shared semantic space between seen and unseen actions and intend to directly learn a mapping from a ...

Article

Focal and Global Spatial-Temporal Transformer for Skeleton-Based Action Recognition

Computer Vision – ACCV 2022, Pages 155–171, https://doi.org/10.1007/978-3-031-26316-3_10

Abstract

Despite great progress achieved by transformer in various vision tasks, it is still underexplored for skeleton-based action recognition with only a few attempts. Besides, these methods directly calculate the pair-wise global self-attention equally ...

research-article

Towards video text visual question answering: benchmark and baseline

NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 2576, Pages 35549–35562

There are already some text-based visual question answering (TextVQA) benchmarks for developing machine's ability to answer questions based on texts in images in recent years. However, models developed on these benchmarks cannot work effectively in many ...

Article

Contrastive Positive Mining for Unsupervised 3D Action Representation Learning

Computer Vision – ECCV 2022, Pages 36–51, https://doi.org/10.1007/978-3-031-19772-7_3

Abstract

Recent contrastive based 3D action representation learning has made great progress. However, the strict positive/negative constraint is yet to be relaxed and the use of non-self positive is yet to be explored. In this paper, a Contrastive Positive ...

research-article

FT-HID: a large-scale RGB-D dataset for first- and third-person human interaction analysis

Neural Computing and Applications (NCAA), Volume 35, Issue 2, Pages 2007–2024, https://doi.org/10.1007/s00521-022-07826-w

Abstract

Analysis of human interaction is one important research topic of human motion analysis. It has been studied either using first-person vision (FPV) or third-person vision (TPV). However, the joint learning of both types of vision has so far ...

research-article

An endo-confidence-based consensus with hierarchical clustering and automatic feedback in multi-attribute large-scale group decision-making

Information Sciences: an International Journal (ISCI), Volume 608, Issue C, Pages 1702–1730, https://doi.org/10.1016/j.ins.2022.07.042

Highlights

A consensus model based on endo-confidence is constructed.
Double hierarchical clustering is introduced in the consensus reaching process. Firstly, the experts are classified according to their evaluations (the numerical evaluations ...

Abstract

With the development of novel technological and societal paradigms, consensus in multi-attribute large-scale group decision making is of great significance. The confidence derived by evaluation is considered in this paper and it is named as endo-...
