7th PRCV 2024: Urumqi, China - Part VII
- Zhouchen Lin, Ming-Ming Cheng, Ran He, Kurban Ubul, Wushouer Silamu, Hongbin Zha, Jie Zhou, Cheng-Lin Liu:
Pattern Recognition and Computer Vision - 7th Chinese Conference, PRCV 2024, Urumqi, China, October 18-20, 2024, Proceedings, Part VII. Lecture Notes in Computer Science 15037, Springer 2025, ISBN 978-981-97-8510-0
Character Recognition
- Hongxia Zhang, Minqiang Xu, Liang He:
Scene Text Recognition Via k-NN Attention-Based Decoder and Margin-Based Softmax Loss. 3-15
- Lu Xu, Zhufeng Jiang, Xingyu Han, Hui Wang, Zizhu Fan:
Real-Time Text Detection with Multi-level Feature Fusion and Pixel Clustering. 16-29
- Liu Yu, Xiangcheng Du, Ziang Liu, Daoguo Dong, Liang He:
Refined and Locality-Enhanced Feature for Handwritten Mathematical Expression Recognition. 30-43
- Hao Sun, Jie Cao, Zhida Zhang, Tao Wu, Kai Zhou, Huaibo Huang:
Learning Fine-Grained and Semantically Aware Mamba Representations for Tampered Text Detection in Images. 44-57
- Miaomiao Xu, Jiang Zhang, Lianghui Xu, Yanbing Li, Wushour Silamu:
Dual Feature Enhanced Scene Text Recognition Method for Low-Resource Uyghur. 58-71
- Weiqi Wang, Feilong Bao, Hui Zhang:
Segmentation-Free Todo Mongolian OCR and its Public Dataset. 72-85
- Miaomiao Xu, Jiang Zhang, Lianghui Xu, Yanbing Li, Wushour Silamu:
Hybrid Encoding Method for Scene Text Recognition in Low-Resource Uyghur. 86-99
- Zhengchen Li, Xintong Li, Kaiwen Qian, Yuchun Fang:
ROBC: A Radical-Level Oracle Bone Character Dataset. 100-113
- Zhongjie Hu, Qi Liu, Song-Lu Chen, Yan Liu, Feng Chen, Xu-Cheng Yin:
Integrated Recognition of Arbitrary-Oriented Multi-line Billet Number. 114-128
- JunJie Yang, Bo Zhou, Anna Zhu:
Improving Scene Text Recognition with Counting-Aware Contrastive Learning and Attention Alignment. 129-142
- Zhonghong Ou, Yiqun Zhang, Siyuan Yao, Meina Song:
GridMask: An Efficient Scheme for Real Time Curved Scene Text Detection. 143-155
- Binglin Li, Jie Zhu, Dongcai Zhao:
Tibetan Handwriting Recognition Method Based on Structural Re-Parameterization ViT and Vertical Attention. 156-169
Document Analysis and Recognition
- Huanxin Yang, Qiwen Wang:
MFH: Marrying Frequency Domain with Handwritten Mathematical Expression Recognition. 173-186
- Zi-Rui Wang:
Leveraging Structure Knowledge and Deep Models for the Detection of Abnormal Handwritten Text. 187-200
- Xinyu Zhou, Zihan Ji, Anna Zhu:
OCR-Aware Scene Graph Generation Via Multi-modal Object Representation Enhancement and Logical Bias Learning. 201-215
- Ziyi Zhu, Wenqi Zhao, Liangcai Gao:
Enhancing Transformer-Based Table Structure Recognition for Long Tables. 216-230
- Yan Zhang, Gangyan Zeng, Huawen Shen, Can Ma, Yu Zhou:
Show Exemplars and Tell Me What You See: In-Context Learning with Frozen Large Language Models for TextVQA. 231-245
- Peisen Wang, Bo Wang, Xixi Nie, Chunyi Guo, Kaijiang Li:
MLR-NET: An Arbitrary Skew Angle Detection Algorithm for Complex Layout Document Images. 246-260
- Elham Eli, Wenting Xu, Alimjan Aysa, Hornisa Mamat, Kurban Ubul:
TextViTCNN: Enhancing Natural Scene Text Recognition with Hybrid Transformer and Convolutional Networks. 261-275
- Teng Li, Jiapeng Wang, Lianwen Jin:
Enhancing Visual Information Extraction with Large Language Models Through Layout-Aware Instruction Tuning. 276-289
- Hongwei Chen, Mengxi Cheng, Tianshun Cheng, Yun Xiao:
SFENet: Arbitrary Shapes Scene Text Detection with Semantic Feature Extractor. 290-304
- Dehu Du, Yujia Wu:
Improving Zero-Shot Image Captioning Efficiency with Metropolis-Hastings. 305-318
- Yujia Wu, Xuan Zhang, Hong Ren:
Improving Text Classification Performance Through Multimodal Representation. 319-333
- Shiwen Sun, Hongxi Wei, Yiming Wang, Chao He:
A Multi-feature Fusion Approach for Words Recognition of Ancient Mongolian Documents. 334-347
- Liucheng Pang, Yaping Zhang, Cong Ma, Yang Zhao, Yu Zhou, Chengqing Zong:
TableRocket: An Efficient and Effective Framework for Table Reconstruction. 348-362
- Linjie Tang, Pengfei Yi, Mingrui Chen, Mingkun Yang, Dingkang Liang:
Not All Texts Are the Same: Dynamically Querying Texts for Scene Text Detection. 363-377
- Yiming Zhang, Yaping Zhang, Lu Xiang, Yu Zhou:
Multi-Modal Attention Based on 2D Structured Sequence for Table Recognition. 378-391
Action Recognition
- Ruoqi Yin, Jianqin Yin:
A Two-Stream Hybrid CNN-Transformer Network for Skeleton-Based Human Interaction Recognition. 395-408
- Yi Liu, Ruyi Liu, Wentian Xin, Qiguang Miao, Yuzhi Hu, Jiahao Qi:
Language-Skeleton Pre-training to Collaborate with Self-Supervised Human Action Recognition. 409-423
- Yezi Gong, Mingtao Pei:
Spatio-Temporal Contrastive Learning for Compositional Action Recognition. 424-438
- Zongyun Li, Yang Yang, Xuehao Gao:
Path-Guided Motion Prediction with Multi-view Scene Perception. 439-453
- Xiao Li, Yukun Qiu, Yi-Xing Peng, Ling-An Zeng, Wei-Shi Zheng:
Privacy-Preserving Action Recognition: A Survey. 454-468
- Yutong Hu:
Attention-Based Spatio-Temporal Modeling with 3D Convolutional Neural Networks for Dynamic Gesture Recognition. 469-480
- Weilong Peng, Qingfeng Chen, Keke Tang, Zhihao Yang, Meng Xing, Meie Fang:
MIT: Multi-cue Injected Transformer for Two-Stage HOI Detection. 481-495
- Haobo Huang, Jianan Li, Hongbin Fan, Zhifu Zhao, Yangtao Zhou:
DIDA: Dynamic Individual-to-integrateD Augmentation for Self-supervised Skeleton-Based Action Recognition. 496-510
- Yifei Du, Mingliang Zhang, Bin Li:
Multi-scale Spatial and Temporal Feature Aggregation Graph Convolutional Network for Skeleton-Based Action Recognition. 511-524
- Yuxi Liu, Wenyu Zhang, Sihong Chen, Xinming Zhang:
Improving Video Representation of Vision-Language Model with Decoupled Explicit Temporal Modeling. 525-539
- Keming Mao, Yilong Xiao, Xin Jing, Zepeng Hu, Yi Ping:
KS-FuseNet: An Efficient Action Recognition Method Based on Keyframe Selection and Feature Fusion. 540-553
- Zixian Liu, Longfei Zhang, Xiaokun Zhao, Yixuan Wang:
Dynamic Skeleton Association Transformer for Dyadic Interaction Action Recognition. 554-569
- Zhen Zhai, Hailun Zhang, Qijun Zhao, Keren Fu:
Species-Aware Guidance for Animal Action Recognition with Vision-Language Knowledge. 570-583