ICMR 2022: Newark, NJ, USA
- Vincent Oria, Maria Luisa Sapino, Shin'ichi Satoh, Brigitte Kerhervé, Wen-Huang Cheng, Ichiro Ide, Vivek K. Singh:
ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27 - 30, 2022. ACM 2022, ISBN 978-1-4503-9238-9
Short Papers
- Zujie Liang, Fan Liang:
TransPCC: Towards Deep Point Cloud Compression via Transformers. 1-5
- Markus Fox, Klaus Schoeffmann:
The Impact of Dataset Splits on Classification Performance in Medical Videos. 6-10
- Xiaoyuan Guo, Jiali Duan, Saptarshi Purkayastha, Hari Trivedi, Judy Wawira Gichoya, Imon Banerjee:
OSCARS: An Outlier-Sensitive Content-Based Radiography Retrieval System. 11-18
- Yuma Honbu, Keiji Yanai:
Unseen Food Segmentation. 19-23
- Yinghao Wang, Haonan Chen, Jiong Wang, Yingying Zhu:
DMPCANet: A Low Dimensional Aggregation Network for Visual Place Recognition. 24-28
- Yikang Li, Jenhao Hsiao, Chiuman Ho:
VideoCLIP: A Cross-Attention Model for Fast Video-Text Retrieval Task with Image CLIP. 29-33
- Mingao Zhang, Changhong Liu, Yong Chen, Zhenchun Lei, Mingwen Wang:
Music-to-Dance Generation with Multiple Conformer. 34-38
- Wenliang Tang, Zhenzhen Hu, Zijie Song, Richang Hong:
OCR-oriented Master Object for Text Image Captioning. 39-43
- Yongbiao Chen, Kaicheng Guo, Fangxin Liu, Yusheng Huang, Zhengwei Qi:
Supervised Contrastive Vehicle Quantization for Efficient Vehicle Retrieval. 44-48
- Rino Naka, Marie Katsurai, Keisuke Yanagi, Ryosuke Goto:
Fashion Style-Aware Embeddings for Clothing Image Retrieval. 49-53
Session 1A: Reidentification
- Shuyuan Tu, Tianzhen Guan, Li Kuang:
Multiple Biological Granularities Network for Person Re-Identification. 54-62
- Yajing Zhai, Yawen Zeng, Da Cao, Shaofei Lu:
TriReID: Towards Multi-Modal Person Re-Identification via Descriptive Fusion Model. 63-71
- Bingliang Jiao, Liying Gao, Peng Wang:
Temporal-Consistent Visual Clue Attentive Network for Video-Based Person Re-Identification. 72-80
- Lu Yang, Hongbang Liu, Lingqiao Liu, Jinghao Zhou, Lei Zhang, Peng Wang, Yanning Zhang:
Pluggable Weakly-Supervised Cross-View Learning for Accurate Vehicle Re-Identification. 81-89
Session 1B: Recommendations
- Yanbin Jiang, Huifang Ma, Xiaohui Zhang, Zhixin Li, Liang Chang:
An Effective Two-way Metapath Encoder over Heterogeneous Information Network for Recommendation. 90-98
- Zhuang Liu, Yunpu Ma, Matthias Schubert, Yuanxin Ouyang, Zhang Xiong:
Multi-Modal Contrastive Pre-training for Recommendation. 99-108
- Mingda Qian, Xiaoyan Gu, Lingyang Chu, Feifei Dai, Haihui Fan, Bo Li:
Flexible Order Aware Sequential Recommendation. 109-117
- Jinpeng Chen, Yuan Cao, Fan Zhang, Pengfei Sun, Kaimin Wei:
Sequential Intention-aware Recommender based on User Interaction Graph. 118-126
Session 2A: Visual+Text Retrieval
- Yongbiao Chen, Sheng Zhang, Fangxin Liu, Zhigang Chang, Mang Ye, Zhengwei Qi:
TransHash: Transformer-based Hamming Hashing for Efficient Image Retrieval. 127-136
- Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Haijun Shan, Xuanjing Huang, Jianqing Fan:
Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval. 137-145
- Alex Falcon, Swathikiran Sudhakaran, Giuseppe Serra, Sergio Escalera, Oswald Lanz:
Relevance-based Margin for Contrastively-trained Video Retrieval Models. 146-157
- Yaoxin Zhuo, Yikang Li, Jenhao Hsiao, Chiuman Ho, Baoxin Li:
CLIP4Hashing: Unsupervised Deep Hashing for Cross-Modal Video-Text Retrieval. 158-166
Session 2B: Deep Learning - Methodological Advancements
- Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou:
Nearest Neighbor Search with Compact Codes: A Decoder Perspective. 167-175
- Jan Schutte, Pascal Mettes:
Teaching a New Dog Old Tricks: Contrastive Random Walks in Videos with Unsupervised Priors. 176-184
- Shaoxiong Zhu, Qi Qi, Zirui Zhuang, Jingyu Wang, Haifeng Sun, Jianxin Liao:
FedNKD: A Dependable Federated Learning Using Fine-tuned Random Noise and Knowledge Distillation. 185-193
- Anqi Hu, Zhengxing Sun, Qian Li:
Weakly Supervised Fine-grained Recognition based on Combined Learning for Small Data and Coarse Label. 194-201
Demos
- Yifei Fan, Modan Xie, Peihan Wu, Gang Yang:
Real-Time Deepfake System for Live Streaming. 202-205
- Alessandro B. Melchiorre, David Penz, Christian Ganhör, Oleg Lesota, Vasco Fragoso, Florian Friztl, Emilia Parada-Cabaleiro, Franz Schubert, Markus Schedl:
EmoMTB: Emotion-aware Music Tower Blocks. 206-210
- Aaron Duane, Björn Þór Jónsson:
ViRMA: Virtual Reality Multimedia Analytics. 211-214
- Tingting Dong, Jianquan Liu:
Person Search by Uncertain Attributes. 215-218
Best Paper Candidates
- Yiqi Gao, Xinglin Hou, Wei Suo, Mengyang Sun, Tiezheng Ge, Yuning Jiang, Peng Wang:
Dual-Level Decoupled Transformer for Video Captioning. 219-228
- Zhongwei Xie, Lin Li, Luo Zhong, Jianquan Liu, Ling Liu:
Cross-Modal Retrieval between Event-Dense Text and Image. 229-238
- Sheng Zeng, Changhong Liu, Jun Zhou, Yong Chen, Aiwen Jiang, Hanxi Li:
Learning Hierarchical Semantic Correspondences for Cross-Modal Image-Text Retrieval. 239-248
Session 3A: Visual+Text Retrieval
- Jianlong Wu, Liangming Pan, Jingjing Chen, Yu-Gang Jiang:
Ingredient-enriched Recipe Generation from Cooking Videos. 249-257
- Bin Zhu, Chong-Wah Ngo, Jingjing Chen, Wing Kwong Chan:
Cross-lingual Adaptation for Recipe Retrieval with Mixup. 258-267
- Pei Dong, Lei Wu, Lei Meng, Xiangxu Meng:
Disentangled Representations and Hierarchical Refinement of Multi-Granularity Features for Text-to-Image Synthesis. 268-276
- Haochen Sun, Lei Wu, Xiang Li, Xiangxu Meng:
Style-woven Attention Network for Zero-shot Ink Wash Painting Style Transfer. 277-285
Session 3B: Applications
- Georgios Begkas, Panagiotis Giannakeris, Konstantinos Ioannidis, Georgios Kalpakis, Theodora Tsikrika, Stefanos Vrochidis, Ioannis Kompatsiaris:
Automatic Visual Recognition of Unexploded Ordnances Using Supervised Deep Learning. 286-294
- Yu Yin, Will Hutchcroft, Naji Khosravan, Ivaylo Boyadzhiev, Yun Fu, Sing Bing Kang:
Generating Topological Structure of Floorplans from Room Attributes. 295-303
- Xuan Wang, Jiajun Chen, Hao Tang, Zhigang Zhu:
MultiCLU: Multi-stage Context Learning and Utilization for Storefront Accessibility Detection and Evaluation. 304-312
- Yuan Chang, Tao Peng, Ruhan He, Xinrong Hu, Junping Liu, Zili Zhang, Minghua Jiang:
UF-VTON: Toward User-Friendly Virtual Try-On Network. 313-321
Session 3C: Synchronized MM
- Peijun Bao, Yadong Mu:
Learning Sample Importance for Cross-Scenario Video Temporal Grounding. 322-329
- Suwichaya Suwanwimolkul, Satoshi Komorita:
Efficient Linear Attention for Fast and Accurate Keypoint Matching. 330-341
- Ben Xue, Chenchen Liu, Yadong Mu:
Video2Subtitle: Matching Weakly-Synchronized Sequences via Dynamic Temporal Alignment. 342-350
- Bolin Zhang, Bin Jiang, Chao Yang, Liang Pang:
Dual-Channel Localization Networks for Moment Retrieval with Natural Language. 351-359
Session 4A: Alignment and Localization
- Sizhe Li, Chang Li, Minghang Zheng, Yang Liu:
Phrase-level Prediction for Video Temporal Localization. 360-368
- Xingyu Shen, Long Lan, Huibin Tan, Xiang Zhang, Xurui Ma, Zhigang Luo:
Joint Modality Synergy and Spatio-temporal Cue Purification for Moment Localization. 369-379
- Ru Peng, Yawen Zeng, Junbo Zhao:
HybridVocab: Towards Multi-Modal Machine Translation via Multi-Aspect Alignment. 380-388
Session 4B: Captioning and Summarization
- Yiqi Gao, Ning Wang, Wei Suo, Mengyang Sun, Peng Wang:
Improving Image Captioning via Enhancing Dual-Side Context Awareness. 389-397
- Minghao Geng, Qingjie Zhao:
Improve Image Captioning by Modeling Dynamic Scene Graph Extension. 398-406
- Evlampios Apostolidis, Georgios Balaouras, Vasileios Mezaris, Ioannis Patras:
Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of the Video Frames. 407-415
Session 5A: Applications
- Shanchuan Gao, Fankai Zeng, Lu Cheng, Jicong Fan, Mingbo Zhao:
Fashion Image Search via Anchor-Free Detector. 416-425
- Jingyu Li, Haokai Ma, Xiangxian Li, Zhuang Qi, Lei Meng, Xiangxu Meng:
Unsupervised Contrastive Masking for Visual Haze Classification. 426-434
- Anwer Slimi, Mounir Zrigui, Henri Nicolas:
MuLER: Multiplet-Loss for Emotion Recognition. 435-442
- Xingyu Zhu, Yingshuo Liang, Jianlei Zhang, Zengqiang Chen:
STAFNet: Swin Transformer Based Anchor-Free Network for Detection of Forward-looking Sonar Imagery. 443-450
Session 5B: Robust MM
- Chao Jiang, Yi He, Richard Chapman, Hongyi Wu:
Camouflaged Poisoning Attack on Graph Neural Networks. 451-461
- Siyuan Li, Guangji Huang, Xing Xu, Yang Yang, Fumin Shen:
Accelerated Sign Hunter: A Sign-based Black-box Attack via Branch-Prune Strategy and Stabilized Hierarchical Search. 462-470
- Zhen Luo, Yingfang Zhang, Peihao Zhong, Jingjing Chen, Donglong Chen:
DiGAN: Directional Generative Adversarial Network for Object Transfiguration. 471-479
- Xiaoheng Sun, Xia Liang, Qiqi He, Bilei Zhu, Zejun Ma:
GIO: A Timbre-informed Approach for Pitch Tracking in Highly Noisy Environments. 480-488
Session 5C: Action, Pose and Body
- Peipeng Chen, Andy J. Ma:
Source-free Temporal Attentive Domain Adaptation for Video Action Recognition. 489-497
- Neng Zhou, Hairu Wen, Yi Wang, Yang Liu, Longfei Zhou:
Review of Deep Learning Models for Spine Segmentation. 498-507
- Zhidan Liu, Zhen Xing, Xiangdong Zhou, Yijiang Chen, Guichun Zhou:
3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation. 508-517
- Yiran Zhu, Guangji Huang, Xing Xu, Yanli Ji, Fumin Shen:
Selective Hypergraph Convolutional Networks for Skeleton-based Action Recognition. 518-526
Session 6: Multifarious Multimedia
- Guangyu Chen, Deyuan Zhang, Tao Liu, Xiaoyong Du:
Self-Lifting: A Novel Framework for Unsupervised Voice-Face Association Learning. 527-535
- Hongya Wang, Shunxin Dai, Ming Du, Bo Xu, Mingyong Li:
Revisiting Performance Measures for Cross-Modal Hashing. 536-544
- Yifeng Zhuang, Qiang Sun, Yanwei Fu, Lifeng Chen, Xiangyang Xue:
Local Slot Attention for Vision and Language Navigation. 545-553
- Yuhui Guo, Xun Liang, Tang Hui, Bo Wu, Xiangping Zheng:
Cross-Pixel Dependency with Boundary-Feature Transformation for Weakly Supervised Semantic Segmentation. 554-561
- Kangning Yang, Benjamin Tag, Yue Gu, Chaofan Wang, Tilman Dingler, Greg Wadley, Jorge Gonçalves:
Mobile Emotion Recognition via Multiple Physiological Signals using Convolution-augmented Transformer. 562-570
Special Session 1: Adversarial Learning for Multimedia Understanding and Retrieval
- Weidong Shi, Yunzhou Zhang, Shangdong Zhu, Yixiu Liu, Sonya Coleman, Dermot Kerr:
VAC-Net: Visual Attention Consistency Network for Person Re-identification. 571-578
- Lijia Deng, Yu-Dong Zhang:
MFGAN: A Lightweight Fast Multi-task Multi-scale Feature-fusion Model based on GAN. 579-586
- Zhipeng Wei, Jingjing Chen, Hao Zhang, Linxi Jiang, Yu-Gang Jiang:
Adaptive Temporal Grouping for Black-box Adversarial Attacks on Videos. 587-593
Special Session 2A: Transformer-based Multimedia Understanding: Model Design, Learning, Distillation
- Guangqi Jiang, Huibing Wang, Jinjia Peng, Xianping Fu:
Parallelism Network with Partial-aware and Cross-correlated Transformer for Vehicle Re-identification. 594-600
- Siqi Sun, Yongqing Sun, Mitsuhiro Goto, Shigekuni Kondo, Dan Mikami, Susumu Yamamoto:
Motor Learning based on Presentation of a Tentative Goal. 601-607
- Kui Xiao, Youheng Bai, Yan Zhang:
Extracting Precedence Relations between Video Lectures in MOOCs. 608-614
- Junke Wang, Zuxuan Wu, Wenhao Ouyang, Xintong Han, Jingjing Chen, Yu-Gang Jiang, Ser-Nam Lim:
M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection. 615-623
Special Session 2B: Transformer-based Multimedia Understanding: Model Design, Learning, Distillation
- Bo Fu, Yuanxin Mao, Shilin Fu, Yonggong Ren, Zhongxuan Luo:
Blindfold Attention: Novel Mask Strategy for Facial Expression Recognition. 624-630
- Lei Zhu, Liewu Cai, Jiayu Song, Xinghui Zhu, Chengyuan Zhang, Shichao Zhang:
MSSPQ: Multiple Semantic Structure-Preserving Quantization for Cross-Modal Retrieval. 631-638
Special Session 3A: Weakly Supervised Learning for Medical Image Analysis
- Yue Wu, Yang Zhou, Jianchun Zhao, Jingyuan Yang, Weihong Yu, Youxin Chen, Xirong Li:
Lesion Localization in OCT by Semi-Supervised Object Detection. 639-646
- Yunyan Yan, Chuanbin Liu, Hongtao Xie, Sicheng Zhang, Zhendong Mao:
Weakly Supervised Pediatric Bone Age Assessment Using Ultrasonic Images via Automatic Anatomical RoI Detection. 647-653
- Chao Suo, Xuanya Li, Donghui Tan, Yuan Zhang, Xieping Gao:
I2-Net: Intra- and Inter-scale Collaborative Learning Network for Abdominal Multi-organ Segmentation. 654-660
- Fenxia Duan, Chunhong Cao, Xieping Gao:
SA-NAS-BFNR: Spatiotemporal Attention Neural Architecture Search for Task-based Brain Functional Network Representation. 661-667
Special Session 3B: Weakly Supervised Learning for Medical Image Analysis
- Qian Wu, Yufei Chen, Ning Huang, Xiaodong Yue:
Weakly-supervised Cerebrovascular Segmentation Network with Shape Prior and Model Indicator. 668-676
Doctoral Symposium
- Runsheng Zhang:
FreqCAM: Frequent Class Activation Map for Weakly Supervised Object Localization. 677-680
Reproducibility Paper
- Yunqing He, Xu Sun, Hui Jiang, Tongwei Ren, Gangshan Wu, Maria Sinziana Astefanoaei, Andreas Leibetseder:
Reproducibility Companion Paper: Human Object Interaction Detection via Multi-level Conditioned Network. 681-684
Workshop Summaries
- Cathal Gurrin, Liting Zhou, Graham Healy, Björn Þór Jónsson, Duc-Tien Dang-Nguyen, Jakub Lokoc, Minh-Triet Tran, Wolfgang Hürst, Luca Rossetto, Klaus Schöffmann:
Introduction to the Fifth Annual Lifelog Search Challenge, LSC'22. 685-687
- Bogdan Ionescu, Giorgos Kordopatis-Zilos, Adrian Popescu, Luca Cuccovillo, Symeon Papadopoulos:
MAD '22 Workshop: Multimedia AI against Disinformation. 688-689
- Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Cathal Gurrin, Yuta Nakashima, Mianxiong Dong:
ICDAR'22: Intelligent Cross-Data Analysis and Retrieval. 690-691
- Naoko Nitta, Anita Min-Chun Hu, Kensuke Tobitani:
MMArt-ACM 2022: 5th Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia. 692-693