ICMR 2021: Taipei, Taiwan
- Wen-Huang Cheng, Mohan S. Kankanhalli, Meng Wang, Wei-Ta Chu, Jiaying Liu, Marcel Worring:
ICMR '21: International Conference on Multimedia Retrieval, Taipei, Taiwan, August 21-24, 2021. ACM 2021, ISBN 978-1-4503-8463-6
Full Research Papers
- Evlampios Apostolidis, Eleni Adamantidou, Vasileios Mezaris, Ioannis Patras:
Combining Adversarial and Reinforcement Learning for Video Thumbnail Selection. 1-9
- Petra Budíková, Jan Sedmidubský, Pavel Zezula:
Efficient Indexing of 3D Human Motions. 10-18
- Jie Cao, Shengsheng Qian, Huaiwen Zhang, Quan Fang, Changsheng Xu:
Global Relation-Aware Attention Network for Image-Text Retrieval. 19-28
- Pei-Chun Chang, Yong-Sheng Chen, Chang-Hsing Lee:
MS-SincResNet: Joint Learning of 1D and 2D Kernels Using Multi-scale SincNet and ResNet for Music Genre Classification. 29-36
- Xu Chen, Lei Wu, Minggang He, Lei Meng, Xiangxu Meng:
MLFont: Few-Shot Chinese Font Generation via Deep Meta-Learning. 37-45
- Yiu-Ming Cheung, Mengke Li, Rong Zou:
Facial Structure Guided GAN for Identity-preserved Face Image De-occlusion. 46-54
- Feifei Dai, Xiaoyan Gu, Zhuo Wang, Mingda Qian, Bo Li, Weiping Wang:
Heterogeneous Side Information-based Iterative Guidance Model for Recommendation. 55-63
- Feng Dai, Hao Liu, Yike Ma, Xi Zhang, Qiang Zhao:
Dense Scale Network for Crowd Counting. 64-72
- Yujuan Ding, Yunshan Ma, Wai Keung Wong, Tat-Seng Chua:
Leveraging Two Types of Global Graph for Sequential Fashion Recommendation. 73-81
- Yu Duan, Yun Xiong, Yao Zhang, Yuwei Fu, Yangyong Zhu:
HSGMP: Heterogeneous Scene Graph Message Passing for Cross-modal Retrieval. 82-91
- Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara:
GCNBoost: Artwork Classification by Label Propagation through a Knowledge Graph. 92-100
- Yuqian Fu, Yanwei Fu, Yu-Gang Jiang:
Can Action be Imitated? Learn to Reconstruct and Transfer Human Dynamics from Videos. 101-109
- Ziwang Fu, Feng Liu, Jiahao Zhang, Hanyang Wang, Chengyi Yang, Qing Xu, Jiayin Qi, Xiangling Fu, Aimin Zhou:
SAGN: Semantic Adaptive Graph Network for Skeleton-Based Human Action Recognition. 110-117
- Liying Gao, Kai Niu, Zehong Ma, Bingliang Jiao, Tonghao Tan, Peng Wang:
Text-Guided Visual Feature Refinement for Text-Based Person Search. 118-126
- Yuhui Guo, Xun Liang:
RGB-D Scene Recognition based on Object-Scene Relation and Semantics-Preserving Attention. 127-134
- Xiaoshuai Hao, Yucan Zhou, Dayan Wu, Wanqian Zhang, Bo Li, Weiping Wang:
Multi-Feature Graph Attention Network for Cross-Modal Video-Text Retrieval. 135-143
- Bin Ji, Chen Yang, Shunyu Yao, Ye Pan:
HPOF: 3D Human Pose Recovery from Monocular Video with Optical Flow. 144-154
- Giorgos Kordopatis-Zilos, Panagiotis Galopoulos, Symeon Papadopoulos, Ioannis Kompatsiaris:
Leveraging EfficientNet and Contrastive Learning for Accurate Global-scale Location Estimation. 155-163
- Fangtao Li, Ting Bai, Chenyu Cao, Zihe Liu, Chenghao Yan, Bin Wu:
Relation-aware Hierarchical Attention Framework for Video Question Answering. 164-172
- Jiao Li, Jialiang Sun, Xing Xu, Wei Yu, Fumin Shen:
Cross-Modal Image-Recipe Retrieval via Intra- and Inter-Modality Hybrid Fusion. 173-182
- Mingyong Li, Hongya Wang:
Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval. 183-191
- Qinghua Li, Xue Zhang, Cuiping Li, Hong Chen:
A Unified-Model via Block Coordinate Descent for Learning the Importance of Filter. 192-200
- Guoqiang Liang, Shiyu Ji, Yanning Zhang:
Local-enhanced Interaction for Temporal Moment Localization. 201-209
- Zhiguang Liu, Liangwei Wang, Jian Qiao:
Reading Scene Text by Fusing Visual Attention with Semantic Representations. 210-218
- Jia Long, Hongtao Lu:
Generative Adversarial Networks with Bi-directional Normalization for Semantic Image Synthesis. 219-226
- Junda Lu, Mingyang Chen, Yifang Sun, Wei Wang, Yi Wang, Xiaochun Yang:
A Smart Adversarial Attack on Deep Hashing Based Image Retrieval. 227-235
- Sanbi Luo, Tao Guo:
Image-to-Image Transfer Makes Chaos to Order. 236-243
- Yu-Shu Ni, Chia-Chi Tsai, Jiun-In Guo, Jenq-Neng Hwang, Bo-Xun Wu, Po-Chi Hu, Ted T. Kuo, Po-Yu Chen, Hsien-Kai Kuo:
Summary of the 2021 Embedded Deep Learning Object Detection Model Compression Competition for Traffic in Asian Countries. 244-249
- Cheng Qiu, Yirong Yao, Yuntao Du:
Nested Dense Attention Network for Single Image Super-Resolution. 250-258
- Yifan Ren, Xing Xu, Fumin Shen, Zheng Wang, Yang Yang, Heng Tao Shen:
Multi-scale Dynamic Network for Temporal Action Detection. 267-275
- Zikai Song, Zhiwen Wan, Wei Yuan, Ying Tang, Junqing Yu, Yi-Ping Phoebe Chen:
Distractor-Aware Tracker with a Domain-Special Optimized Benchmark for Soccer Player Tracking. 276-284
- Kimihiro Tanaka, Yusuke Matsui, Shin'ichi Satoh:
Efficient Nearest Neighbor Search by Removing Anti-hub. 285-293
- Lucas Pascotti Valem, Daniel Carlos Guimarães Pedronette:
A Denoising Convolutional Neural Network for Self-Supervised Rank Effectiveness Estimation on Image Retrieval. 294-302
- Shaoying Wang, Hanjiang Lai, Zhenyu Shi:
Know Yourself and Know Others: Efficient Common Representation Learning for Few-shot Cross-modal Retrieval. 303-311
- Xiaomei Wang, Lin Ma, Yanwei Fu, Xiangyang Xue:
Neural Symbolic Representation Learning for Image Captioning. 312-321
- Yangtao Wang, Yanzhao Xie, Yu Liu, Lisheng Fan:
G-CAM: Graph Convolution Network Based Class Activation Mapping for Multi-label Image Recognition. 322-330
- Lei Wu, Xueliang Liu, Yanbin Hao, Yunjie Ma, Richang Hong:
NASTER: Non-local Attentional Scene Text Recognizer. 331-338
- Ting-Ting Xie, Christos Tzelepis, Fan Fu, Ioannis Patras:
Few-Shot Action Localization without Knowing Boundaries. 339-348
- Baoming Yan, Qingheng Zhang, Liyu Chen, Lin Wang, Leihao Pei, Jiang Yang, Enyun Yu, Xiaobo Li, Binqiang Zhao:
Learning Hierarchical Visual-Semantic Representation with Phrase Alignment. 349-357
- Chenghao Yan, Zihe Liu, Fangtao Li, Chenyu Cao, Zheng Wang, Bin Wu:
Social Relation Analysis from Videos via Multi-entity Reasoning. 358-366
- Kun Yan, Zied Bouraoui, Ping Wang, Shoaib Jameel, Steven Schockaert:
Aligning Visual Prototypes with BERT Embeddings for Few-Shot Learning. 367-375
- Hong-Lei Yao, Yu-Wei Zhan, Zhen-Duo Chen, Xin Luo, Xin-Shun Xu:
TEACH: Attention-Aware Deep Cross-Modal Hashing. 376-384
- Min Zhang, Meng Ma, Ping Wang:
Scene Text Recognition with Cascade Attention Network. 385-393
- Wen Zhang, Jie Shao:
Multi-Attention Audio-Visual Fusion Network for Audio Spatialization. 394-401
- Feng Zhao, Donglin Wang, Xintao Xiang:
Multi-Initialization Graph Meta-Learning for Node Classification. 402-410
- Xinzhe Zhou, Yadong Mu:
Question-Guided Semantic Dual-Graph Visual Reasoning with Novel Answers. 411-419
- Nan Zhuang, Yadong Mu:
Joint Hand-Object Pose Estimation with Differentiably-Learned Physical Contact Point Analysis. 420-428
- Zifeng Zhuang, Xintao Xiang, Siteng Huang, Donglin Wang:
HINFShot: A Challenge Dataset for Few-Shot Node Classification in Heterogeneous Information Network. 429-436
Short Research Papers
- Marco Cagrandi, Marcella Cornia, Matteo Stefanini, Lorenzo Baraldi, Rita Cucchiara:
Learning to Select: A Fully Attentive Approach for Novel Object Captioning. 437-441
- Yu-Chen Chang, Wen-Cheng Chen, Min-Chun Hu:
Semi-supervised Many-to-many Music Timbre Transfer. 442-446
- Yan-He Chen, Mei-Chen Yeh:
Text-Enhanced Attribute-Based Attention for Generalized Zero-Shot Fine-Grained Image Classification. 447-450
- Konstantinos Gkountakos, Despoina Touska, Konstantinos Ioannidis, Theodora Tsikrika, Stefanos Vrochidis, Ioannis Kompatsiaris:
Spatio-Temporal Activity Detection and Recognition in Untrimmed Surveillance Videos. 451-455
- Haifan Gong, Guanqi Chen, Sishuo Liu, Yizhou Yu, Guanbin Li:
Cross-Modal Self-Attention with Multi-Task Pre-Training for Medical Visual Question Answering. 456-460
- Shintami Chusnul Hidayati, Yeni Anistyasari:
Body Shape Calculator: Understanding the Type of Body Shapes from Anthropometric Measurements. 461-465
- Hussain Kanafani, Junaid Ahmed Ghauri, Sherzod Hakimov, Ralph Ewerth:
Unsupervised Video Summarization via Multi-source Features. 466-470
- Tarun Krishna, Kevin McGuinness, Noel E. O'Connor:
Evaluating Contrastive Models for Instance-based Image Retrieval. 471-475
- Xiaocheng Lu, Yuan Yuan, Qi Wang:
AWFA-LPD: Adaptive Weight Feature Aggregation for Multi-frame License Plate Detection. 476-480
- Zekun Luo, Zheng Fang, Sixiao Zheng, Yabiao Wang, Yanwei Fu:
NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection. 481-485
- Bowen Wang, Liangzhi Li, Yuta Nakashima, Takehiro Yamamoto, Hiroaki Ohshima, Yoshiyuki Shoji, Kenro Aihara, Noriko Kando:
Image Retrieval by Hierarchy-aware Deep Hashing Based on Multi-task Learning. 486-490
- Lan Yan, Wenbo Zheng, Fei-Yue Wang, Chao Gou:
Weakly Supervised Sketch Based Person Search. 491-495
- An-Zi Yen, Chia-Chung Chang, Hen-Hsen Huang, Hsin-Hsi Chen:
Personal Knowledge Base Construction from Multimodal Data. 496-500
- Kang Yuan, Sheng Li:
2.5D Pose Guided Human Image Generation. 501-505
- Min Zhu, Weifeng Liu, Kai Zhang, Ye Li, Peng Liu, Baodi Liu:
Collaborative Representation for Deep Meta Metric Learning. 506-510
Brave New Idea
- An-Zi Yen, Hen-Hsen Huang, Hsin-Hsi Chen:
Ten Questions in Lifelog Mining and Information Recall. 511-518
Challenge Papers
- Yongkun Du, Zhineng Chen, Caiyan Jia, Xuanya Li, Yu-Gang Jiang:
Bag of Tricks for Building an Accurate and Slim Object Detector for Embedded Applications. 519-525
- Chih-Chung Hsu, Chieh Lee, Lin Chen, Min-Kai Hung, Andy Yu-Lun Lin, Xian-Yu Wang:
Efficient-ROD: Efficient Radar Object Detection based on Densely Connected Residual Network. 526-532
- Bo Ju, Wei Yang, Jinrang Jia, Xiaoqing Ye, Qu Chen, Xiao Tan, Hao Sun, Yifeng Shi, Errui Ding:
DANet: Dimension Apart Network for Radar Object Detection. 533-539
- Bao-Hong Lai, Hsun-Ping Hsieh:
Object Detection on Embedded Systems for Traffic in Asian Countries. 540-544
- Pengliang Sun, Xuetong Niu, Pengfei Sun, Kele Xu:
Squeeze-and-Excitation network-Based Radar Object Detection With Weighted Location Fusion. 545-552
- Yizhou Wang, Jenq-Neng Hwang, Gaoang Wang, Hui Liu, Kwang-Ju Kim, Hung-Min Hsu, Jiarui Cai, Haotian Zhang, Zhongyu Jiang, Renshu Gu:
ROD2021 Challenge: A Summary for Radar Object Detection Challenge for Autonomous Driving Applications. 553-559
- Wen-Kai Wu, Chien-Yu Chen, Jiann-Shu Lee:
Embedded YOLO: Faster and Lighter Object Detection. 560-565
- Jun Yu, Xinlong Hao, Xinjian Gao, Qiang Sun, Yuyu Liu, Peng Chang, Zhong Zhang, Fang Gao, Feng Shuang:
Radar Object Detection Using Data Merging, Enhancement and Fusion. 566-572
- Zangwei Zheng, Xiangyu Yue, Kurt Keutzer, Alberto L. Sangiovanni-Vincentelli:
Scene-aware Learning Network for Radar Object Detection. 573-579
Conflict of Interest Papers
- Jia-Hong Huang, Luka Murn, Marta Mrak, Marcel Worring:
GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization. 580-589
- Omar Shahbaz Khan, Björn Þór Jónsson, Jan Zahálka, Stevan Rudinac, Marcel Worring:
Impact of Interaction Strategies on User Relevance Feedback. 590-598
Demonstrations
- Ting-Hsuan Chou, Wei-Ta Chu:
Automatic Baseball Pitch Overlay. 599-602
- Yuko Iinuma, Shin'ichi Satoh:
Video Action Retrieval Using Action Recognition Model. 603-606
- Mitchell Lee, Praveena Avula, Min Chen:
MeTILDA: Platform for Melodic Transcription in Language Documentation and Application. 607-610
- Rintaro Yanagi, Ren Togo, Takahiro Ogawa, Miki Haseyama:
IR Questioner: QA-based Interactive Retrieval System. 611-614
Reproducibility Paper
- Yunshan Ma, Yujuan Ding, Xun Yang, Lizi Liao, Wai Keung Wong, Tat-Seng Chua, Jinyoung Moon, Hong-Han Shuai:
Reproducibility Companion Paper: Knowledge Enhanced Neural Fashion Trend Forecasting. 615-618
Doctoral Consortium
- Fityanul Akhyar, Chih-Yang Lin, Gugan S. Kathiresan:
A Beneficial Dual Transformation Approach for Deep Learning Networks Used in Steel Surface Defect Detection. 619-622
- Ka-Hou Chan, Sio Kei Im:
Discrete Tchebichef Transform for Versatile Video Coding. 623-626
- Mohammad Shahid, Kai-Lung Hua:
Fire Detection using Transformer Network. 627-630
Special Session Paper
- Huangpeng Dai, Qing Xie, Jiachen Li, Yanchun Ma, Lin Li, Yongjian Liu:
Visible-infrared Person Re-identification with Human Body Parts Assistance. 631-637
- Zilong Fu, Hongtao Xie, Guoqing Jin, Junbo Guo:
Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition. 638-644
- Jia-Hong Huang, Ting-Wei Wu, Marcel Worring:
Contextualized Keyword Representations for Multi-modal Retinal Image Captioning. 645-652
- Huibing Wang, Guangqi Jiang, Jinjia Peng, Xianping Fu:
MSAV: An Unified Framework for Multi-view Subspace Analysis with View Consistence. 653-659
- Jian Wang, Xian-Hua Han, Lanfen Lin, Hongjie Hu, Yen-Wei Chen:
A Tensor Sparse Representation-Based CBMIR System for Computer-Aided Diagnosis of Focal Liver Lesions and its Pilot Trial. 660-666
- Yingying Xu, Jing Liu, Lanfen Lin, Hongjie Hu, Ruofeng Tong, Jingsong Li, Yen-Wei Chen:
M-DFNet: Multi-phase Discriminative Feature Network for Retrieval of Focal Liver Lesions. 667-673
- Chengyuan Zhang, Zhi Zhong, Lei Zhu, Shichao Zhang, Da Cao, Jianfeng Zhang:
M2GUDA: Multi-Metrics Graph-Based Unsupervised Domain Adaptation for Cross-Modal Hashing. 674-681
- Congcong Zhang, Ning He, Qixiang Sun, Xiaojie Yin, Ke Lu:
Human Pose Estimation based on Attention Multi-resolution Network. 682-687
Workshop Summaries
- Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Cathal Gurrin, Minh-Triet Tran, Nguyen Thanh Binh:
ICDAR'21: Intelligent Cross-Data Analysis and Retrieval. 688-689
- Cathal Gurrin, Björn Þór Jónsson, Klaus Schöffmann, Duc-Tien Dang-Nguyen, Jakub Lokoc, Minh-Triet Tran, Wolfgang Hürst, Luca Rossetto, Graham Healy:
Introduction to the Fourth Annual Lifelog Search Challenge, LSC'21. 690-691
- Min-Chun Hu, Ichiro Ide, Kensuke Tobitani:
MMArt-ACM'21: International Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia 2021. 692-693
- Bei Liu, Jianlong Fu, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Yong Rui:
MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding. 694-695
- Yoko Yamakata, Atsushi Hashimoto:
CEA'21: The 13th Workshop on Multimedia for Cooking and Eating Activities. 696-697