MMAsia 2023: Tainan, Taiwan
- Wen-Huang Cheng, Wei-Ta Chu, Min-Chun Hu, Jiaying Liu, Munchurl Kim, Wei Zhang:
ACM Multimedia Asia 2023, MMAsia 2023, Tainan, Taiwan, December 6-8, 2023. ACM 2023
Full Papers
- Yu-Jou Chen, Yu-Shuen Wang:
TrackNetV3: Enhancing ShuttleCock Tracking with Augmentations and Trajectory Rectification. 1:1-1:7
- Pengju Wang, Bochao Liu, Dan Zeng, Chenggang Yan, Shiming Ge:
Personalized Federated Learning via Backbone Self-Distillation. 2:1-2:7
- Naifu Xue, Yuan Zhang:
Lambda-Domain Rate Control for Neural Image Compression. 3:1-3:7
- Weijie Luo, Zihao Liu, Guohao Dai, Ningyi Xu:
History-Detr: Optimize Query Initialization Strategy by Using Historical Information and Kinematics. 4:1-4:7
- Gaohuan Dong, Qing Xie, Jiachen Li, Yanchun Ma, Yuhan Liu, Yongjian Liu:
A Multi-scale and Dense Object Detector for Tibetan Thangka Images. 5:1-5:7
- Yidan Fan, Yongxin Yu, Wenhuan Lu, Yahong Han:
A Cross-modal and Redundancy-reduced Network for Weakly-Supervised Audio-Visual Violence Detection. 6:1-6:7
- Siqi Zhang, Jing Liu, Zhihua Wei:
From Pixels to Explanations: Uncovering the Reasoning Process in Visual Question Answering. 7:1-7:9
- Hong Chen, Bin Huang, Xin Wang, Yuwei Zhou, Wenwu Zhu:
Global-Local GraphFormer: Towards Better Understanding of User Intentions in Sequential Recommendation. 8:1-8:7
- Jie Liu, Qin Jiang, Qinglin Wang:
Guided Spatio-Temporal Learning Method for 4K Video Super-Resolution. 9:1-9:7
- Jiansong Sha, Haoyu Zhang, Yuchen Pan, Guang Kou, Xiaodong Yi:
NeRF-IS: Explicit Neural Radiance Fields in Semantic Space. 10:1-10:7
- Qiuwen Wang, Shuai Guo, Haoning Wu, Rong Xie, Li Song, Wenjun Zhang:
NeRF-SDP: Efficient Generalizable Neural Radiance Field with Scene Depth Perception. 11:1-11:7
- Zhengtao Yu, Jia Zhao, Huiling Wang, Chenliang Guo, Tong Zhou, Chongxiang Sun:
Adaptive Fusion for Visual Question Answering: Integrating Multi-Label Classification and Similarity Matching. 12:1-12:7
- Dongyang Yu, Yunshi Xie, Wangpeng An, Zhang Li, Yufeng Yao:
Joint Coordinate Regression and Association For Multi-Person Pose Estimation, A Pure Neural Network Approach. 13:1-13:8
- Yang Fan Chiang, Pei-Xuan Li, Ding-You Wu, Hsun-Ping Hsieh, Ching-Chung Ko:
Exploring Feature Fusion from A Contrastive Multi-Modality Learner for Liver Cancer Diagnosis. 14:1-14:7
- Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu:
Learning Snippet-to-Motion Progression for Skeleton-based Human Motion Prediction. 15:1-15:8
- Jianping Zhong, Zhaobo Qi, Weigang Zhang, Qingming Huang:
Semantic-Aware Dynamic Feature Selection and Fusion for Object Detection in UAV Videos. 16:1-16:7
- Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu:
Graph-Guided MLP-Mixer for Skeleton-Based Human Motion Prediction. 17:1-17:7
- Yuqing Song, Jinyong Cheng:
Self-supervised anomaly detection of medical images based on dual-module discrepancy. 18:1-18:7
- Lijie Li, Caiyue Hu, Haitao Zhang, Akshita Maradapu Vera Venkata Sai:
Cross-modal Image-Recipe Retrieval via Multimodal Fusion. 19:1-19:7
- Zitan Chen, Zhuang Qi, Xiangxian Li, Yuqing Wang, Lei Meng, Xiangxu Meng:
Class-aware Convolution and Attentive Aggregation for Image Classification. 20:1-20:7
- Xiu Li, Chengyu Zheng, Jie Nie, Ruoyu Zhang, Xinyue Liang, Zhiqiang Wei:
Relevance and Irrelevance Considered Subspace Mapping Neural Networks for Remote Sensing Text-Image Retrieval. 21:1-21:7
- Haixin Wang, Jian Yang, Ryohei Katayama, Michiya Matsusaki, Tomoyuki Miyao, Jinjia Zhou:
NuclSeg: nuclei segmentation using semi-supervised stain deconvolution. 22:1-22:6
- Kosuke Iwama, Ryugo Morita, Jinjia Zhou:
Block based Adaptive Compressive Sensing with Sampling Rate Control. 23:1-23:7
- Jie Yang, Aihua Ke, Bo Cai:
Adapting Hierarchical Transformer for Scene-Level Sketch-Based Image Retrieval. 24:1-24:7
- I-Ju Hsieh, Yo-Chung Lau, Peng-Yuan Kao, Shih-Ping Hung, Yi-Ping Hung:
Domain-Adaptive Mean Teacher for Category-Level Object Pose Estimation. 25:1-25:8
- Guangxing Wu, Junxi Chen, Wentao Zhang, Ruixuan Wang:
Feature Adaptation with CLIP for Few-shot Classification. 26:1-26:7
- Jun Li, Yi Bin, Jie Zou, Jiwei Wei, Guoqing Wang, Yang Yang:
Cross-modal Consistency Learning with Fine-grained Fusion Network for Multimodal Fake News Detection. 27:1-27:7
- Yusheng Huang, Zhouhan Lin:
I2SRM: Intra- and Inter-Sample Relationship Modeling for Multimodal Information Extraction. 28:1
- Ci-Yin Zhang, Wei-Ta Chu:
Occlusion-Aware Manga Character Re-identification with Self-Paced Contrastive Learning. 29:1-29:7
- Yipeng Leng, Qiangjuan Huang, Zhiyuan Wang, Yangyang Liu, Haoyu Zhang:
DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation. 30:1-30:7
- Qiaowei Ma, Jinghui Zhong, Yitao Yang, Weiheng Liu, Ying Gao, Wing W. Y. Ng:
A Lightweight and Efficient Model for Audio Anti-Spoofing. 31:1-31:7
- Yuxiang Wan, Banghai Wang, Lunke Fei:
SOFTCUTMIX: Data Augmentation and Algorithmic Enhancements for Cross-Modality Person Re-Identification. 32:1-32:7
- Zehan Tan, Weidong Yang, Zhiwei Wang:
Reimagining 3D Visual Grounding: Instance Segmentation and Transformers for Fragmented Point Cloud Scenarios. 33:1-33:7
- Peng Zhang, Yida Chen, Meijuan Li, Hui Zhao, Jianqiang Zhang, Fuqiang Wang, Xiaoming Wu:
Speech Spoofing Detection Based on Graph Attention Networks with Spectral and Temporal Information. 34:1-34:7
- Jingyi Cao, Bo Liu, Yunqian Wen, Rong Xie, Li Song:
Achieving Privacy-Preserving Multi-View Consistency with Advanced 3D-Aware Face De-identification. 35:1-35:7
- Suzanne Kobeisse, Lars Erik Holmquist:
Moving Inside the Box: Interacting with Interpretation of Historical Artefacts Through Tangible Augmented Reality. 36:1-36:7
- Songhui Zhao, Sujuan Hou, Baisong Zhang:
A Decoupled Cross-layer Fusion Network with Bidirectional Guidance for Detecting Small Logos. 37:1-37:8
- Guangtong Zhang, Qihua Liang, Ning Li, Zhiyi Mo, Bineng Zhong:
Robust Tracking via Unifying Pretrain-Finetuning and Visual Prompt Tuning. 38:1-38:7
- Jie-Ying Li, Herman Prawiro, Chia-Chen Chiang, Hsin-Yu Chang, Tse-Yu Pan, Chih-Tsun Huang, Min-Chun Hu:
Efficient Hand Gesture Recognition using Multi-Task Multi-Modal Learning and Self-Distillation. 39:1-39:7
- Takumi Nishiyasu, Wataru Shimoda, Yoichi Sato:
Image Cropping under Design Constraints. 40:1-40:7
- Lin Wang, Hongyi Zhang, Xingfu Wang, Yan Xiong:
Learning a Contextualized Multimodal Embedding for Zero-shot Cooking Video Caption Generation. 41:1-41:8
- Jingwen Cui, Qian Huang, Chang Li, Yunfei Zhang:
MA-Net: Multi-Attention Network for Skeleton-Based Action Recognition. 42:1-42:7
- Zhenglin Tang, Hai-Miao Hu:
A Spatial-Spectral Decoupling Fusion Framework for Visible and Near-Infrared Images. 43:1-43:7
- Huaizhuo Liu, Hai-Miao Hu:
From Global to Local: An Adaptive Environmental Illumination Estimation for Non-uniform Scattering. 44:1-44:7
- Wei Guo, Hao Wang:
Key Parts Spatio-Temporal Learning for Video Person Re-identification. 45:1-45:6
- Zhongtao Chen, Yuma Honbu, Keiji Yanai:
Mask-based Food Image Synthesis with Cross-Modal Recipe Embeddings. 46:1-46:7
- Yuki Matsuura, Takahiro Hayashi:
AniCropify: Image Matting for Anime-Style Illustration. 47:1-47:7
- Fei Zhu, Wanqian Zhang, Dayan Wu, Lin Wang, Bo Li, Weiping Wang:
Targeted Transferable Attack against Deep Hashing Retrieval. 48:1-48:7
- Zhewen Deng, Dongyue Chen, Shizhuo Deng:
Prior Knowledge Guided Network for Video Anomaly Detection. 49:1-49:7
- Peng Liu, Chuanxu Wang, Jianwei Qin, Guocheng Lin:
Feature Enhancement and Foreground-Background Separation for Weakly Supervised Temporal Action Localization. 50:1-50:7
- Longfei Ma, Honggang Zhao, Zheng Jiang, Mingyong Li:
Multi-view-enhanced modal fusion hashing for Unsupervised cross-modal retrieval. 51:1-51:7
- Weiliang Xie, Qian Huang, Chang Li, Yanfang Wang, Yanwei Liu:
Hierarchical Multi-Scale Adaptive Conv-LSTM Network for Human Action Recognition Based on Wearable Sensors. 52:1-52:8
- Po-Han Huang, Yue-Hua Han, Ernie Chu, Jun-Cheng Chen, Kai-Lung Hua:
Multi-Task Self-Blended Images for Face Forgery Detection. 53:1-53:7
- Zichen Zhu, Stefano Petrangeli, Viswanathan Swaminathan, Sheng Wei:
Power Efficient Mobile VTuber Live Streaming. 54:1-54:7
- Quang Long Nguyen, Duc Nguyen, Huong Thu Truong:
Toward Optimal Real-time Dynamic Point Cloud Streaming over Bandwidth-constrained Networks. 55:1-55:7
- Jialiang Shi, Takahiro Komamizu, Keisuke Doman, Haruya Kyutoku, Ichiro Ide:
RecipeMeta: Metapath-enhanced Recipe Recommendation on Heterogeneous Recipe Network. 56:1-56:7
- Shifeng Xie, Yi Liu, Wenjing Shuai:
FTUnet: Feature Transferred U-Net For Single HDR Image Reconstruction. 57:1-57:7
- Bin Zheng, He Zhang, Lu Jin:
Research on Multi-Person Pose Estimation Based on YOLO and Decoupled Multi-Level Feature Layers Fusion. 58:1-58:7
- Yi Zheng, Zuqiang Meng:
Towards Representation Alignment and Uniformity in Long-tailed Classification. 59:1-59:7
- Shangwang Liu, Danyang Liu, Yinghai Lin, Ziqi Wei:
SFNet: Saliency fast Fourier convolutional Network for medical image segmentation. 60:1-60:7
- Peng-Fei Zhang, Zi Helen Huang:
Multi-head Siamese Prototype Learning against both Data and Label Corruption. 61:1-61:7
- Shengli Zhang, Shikui Wei, Shiyin Zhang, Sen Xu, Weiyan Xu, Yao Zhao:
Rethinking Parking Slot Detection with Rotated Bounding Box. 62:1-62:7
- Yiming Huang, Aozhe Jia, Xiaodan Zhang, Jiawei Zhang:
Generic Attention-model Explainability by Weighted Relevance Accumulation. 63:1-63:7
- Miaomiao Dai, Hao Yin, Ran Yi, Lizhuang Ma:
Geometric Style Transfer for Face Portraits. 64:1-64:7
- Huashan Sun, Qian Huang, Yiming Wang, Xiaotong Guo, Ruoyu Hao:
Optical Flow based Feature Prediction and Decomposed Context for Video Compression. 65:1-65:7
- Yaqun Fang, Ruichao Hou, Jia Bei, Tongwei Ren, Gangshan Wu:
ADNet: An Asymmetric Dual-Stream Network for RGB-T Salient Object Detection. 66:1-66:7
- Boyue Xu, Yi Xu, Ruichao Hou, Jia Bei, Tongwei Ren, Gangshan Wu:
RGB-D Tracking via Hierarchical Modality Aggregation and Distribution Network. 67:1-67:7
- Yun Liang, Shijie Peng, Xinjie Xiao, Lianghui Li:
Dual-domain Feature Learning and Cross Dimension Interaction Attention for Nighttime Image Dehazing. 68:1-68:7
- Ping-Chen Chan, Po-Wei Chen, Von-Wun Soo:
Improve Singing Quality Prediction Using Self-supervised Transfer Learning and Human Perception Feedback. 69:1-69:7
- Xiaotong Guo, Qian Huang, Yiming Wang, Huashan Sun:
End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation. 70:1-70:7
- Yun Liang, Ming Junhui, Jintu Zheng:
SASSM: Semantic Awareness and Self-Support Matching for Semi-Supervised Video Object Segmentation. 71:1-71:7
- Yun Liang, Fumian Long, Qiaoqiao Li, Dong Wang:
GTTrack: Gaussian Transformer Tracker for Visual Tracking. 72:1-72:7
- Iuan Kai Fang, Bo-Hao Zhang, Te Lun Liu, Hao Tan, Wei Syun Chen, Che-Rung Lee:
MontageNet: Annotated Dataset of Furniture Components in Real-World Images. 73:1-73:7
- Avinash Anand, Raj Jaiswal, Mohit Gupta, Siddhesh S. Bangar, Pijush Bhuyan, Naman Lal, Rajeev Singh, Ritika Jha, Rajiv Ratn Shah, Shin'ichi Satoh:
RanLayNet: A Dataset for Document Layout Detection used for Domain Adaptation and Generalization. 74:1-74:6
- Ft Zheng, Le Hui, Jin Xie, Haofeng Zhang:
Multi-Scale Superpoint Network for 3D Point Cloud Semantic Segmentation. 75:1-75:7
- Zhe Chen, Jiyi Li, Fumiyo Fukumoto, Peng Liu, Yoshimi Suzuki:
Vision-Language Navigation for Quadcopters with Conditional Transformer and Prompt-based Text Rephraser. 76:1-76:7
- Jiajie Lin, Zhuopan Yang, Zhenguo Yang, Xiaoping Li, Fu Lee Wang, Wenyin Liu:
Confidence-guided Boundary Adaption Network for Multimodal Fake News Detection. 77:1-77:7
- Satayu Parinayok, Yoko Yamakata, Kiyoharu Aizawa:
Open-Vocabulary Segmentation Approach for Transformer-Based Food Nutrient Estimation. 78:1-78:7
- Lijuan Zhou, Jianing Mao:
Improving Class Representation for Zero-Shot Action Recognition. 79:1-79:7
- Yan Niu, Lixue Zhang, Chenlai Li:
Independent and Collaborative Demosaicking Neural Networks. 80:1-80:7
- Xinyi Yuan, Liansheng Zhuang:
Learning a Robust Model with Pseudo Boundaries for Noisy Temporal Action Localization. 81:1-81:7
- Sung Kwon On, Songhyon Kim, Kwangjin Yang, Younggun Lee:
Monocular 3D Pose Estimation of Very Small Airplane in the Air. 82:1-82:7
- Sheng Yan, Yang Liu, Haoqiang Wang, Xin Du, Mengyuan Liu, Hong Liu:
Cross-Modal Retrieval for Motion and Text via DropTriple Loss. 83:1-83:7
- Hamed Alimohammadzadeh, Heather Culbertson, Shahram Ghandeharizadeh:
An Evaluation of Decentralized Group Formation Techniques for Flying Light Specks. 84:1-84:7
Short Papers
- Ying Shen, Wei Li, Zhaoquan Yuan, Xiao Wu:
Learning Surface-awareness Network for X-Ray Prohibited Item Detection. 85:1-85:5
- Mingjin Wu, Shijun Xiang:
An Efficient CNN-based Prediction for Reversible Data Hiding. 86:1-86:5
- Kuo-Yu Liu, Yuanshan Chen, Ming-Fang Lin, Li-Jung Daphne Huang, Cheah Ping Xiang:
Developing a VR-based contextualized language learning system to Enhance Junior High School Students' Pragmatic Competence. 87:1-87:5
- Shunta Sakaue, Taiju Kimura, Hiroki Nishino:
Reducing Objective Difficulty Without Influencing Subjective Difficulty in a Video Game. 88:1-88:5
- Keita Suzuki, Satoshi Suzuki, Ryo Masumura, Atsushi Ando, Naoki Makishima:
Multi-region CNN-Transformer for Micro-gesture Recognition in Face and Upper Body. 89:1-89:5
- Ryota Kaji, Keiji Yanai:
VQ-VDM: Video Diffusion Models with 3D VQGAN. 90:1-90:5
- Xianhao Chen, Kuan Chen, Yuzhe Mao, Linna Zhou, Weike You:
Facial Parameter Splicing: A Novel Approach to Efficient Talking Face Generation. 91:1-91:5
- Nouf Alrasheed, Shraboni Sarker, Viviana Grieco, Praveen Rao:
Few-Shot Learning for Word Recognition in Handwritten Seventeenth-Century Spanish American Notary Records. 92:1-92:5
- Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He:
Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization. 93:1-93:5
- Xiaojiao Chen, Sheng Li, Jiyi Li, Yang Cao, Hao Huang, Liang He:
GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System. 94:1-94:5
- Karanvir Singh, Mukesh Saini:
Towards Digital Twin of Crops for Growth Modelling using Virtual Reality. 95:1-95:5
- Jeonguk Hong, Gyewon Jeon, Sangwon Lee:
Exploring User-oriented Social Recommendation System through Granting Users Control over a Social Group. 96:1-96:5
- Taiwei Wu, Jianhao Zhang, Lian Duan, Yuanzhe Cai:
Music-Graph2Vec: An Efficient Method for Embedding Pitch Segment. 97:1-97:5
- Luyang Liu, Hiroki Nishikawa, Jinjia Zhou, Ittetsu Taniguchi, Takao Onoye:
Adaptive Sampling for Computer Vision-Oriented Compressive Sensing. 98:1-98:5
- Yan Li, Shibin Wang:
EmAGAN: Embedded Blocks Search and Mask Attention GAN for Makeup Transfer. 99:1-99:5
- Jingbin Xu, Junwen Chen, Keiji Yanai:
Contextual Associated Triplet Queries for Panoptic Scene Graph Generation. 100:1-100:5
- Liangyu Wang, Yoko Yamakata, Kiyoharu Aizawa:
Automatic Dataset Creation from User-generated Recipes for Ingredient-centric Food Image Analysis. 101:1-101:5
Demo Papers
- Yen-Pin Cheng, Tsung-Hsun Tsai, Tai-Chen Tsai, Yi-Hsuan Chiu, Hung-Kuo Chu, Min-Chun Hu:
OmniScorer: Real-Time Shot Spot Analysis for Court View Basketball Videos. 102:1-102:3
- Chen-Wei Fu, Wei-Lun Huang, Pin-Xuan Liu, Yu-Hsuan Chen, Ming-Cong Su, Andrew Chen, Ping-Hsuan Han, Tse-Yu Pan:
TelEmoScatter: Enabling Remote Interaction and Emotional Connections in Virtual and Physical Music Performance. 103:1-103:3
- Wenlong Du, Qingquan Li, Jian Zhou, Xu Ding, Xuewei Wang, Zhongjun Zhou, Jin Liu:
FinGuard: A Multimodal AIGC Guardrail in Financial Scenarios. 104:1-104:3
- Fan Yu, Huanyu Xing, Jia Bei, Tongwei Ren:
Easy Travelogue: A Travelogue Editor with Automatic Image Recommendation and Insertion. 105:1-105:3
- Shota Okubo, Tomoaki Konno, Toshiharu Horiuchi, Tatsuya Kobayashi:
Directional Sound Source Representation Using Paired Microphone Array with Different Characteristics Suitable for Volumetric Video Capture. 106:1-106:3
- Guan-Yu Wu, Chun-Ho Hung, Hsuan-Wei Chen, Wei-Ta Chu:
A Trajectory-based Statistics and Tactics Analysis System for Table Tennis. 107:1-107:3
- Ryo Kawai, Noboru Yoshida, Jianquan Liu:
A consulting system for guiding various image recognitions. 108:1-108:3
- Yiyun Zhang, Zijian Wang:
VLM-BCD: Unsupervised Building Change Detection. 109:1-109:3
- Yu-Hsi Chen:
One-Epoch Training for Object Detection in Fisheye Images. 110:1-110:5
- Chih-Chung Hsu, Wen-Hai Tseng, Ming-Hsuan Wu, Chia-Ming Lee, Wei-Hao Huang:
Adapting Object Detection to Fisheye Cameras: A Knowledge Distillation with Semi-Pseudo-Label Approach. 111:1-111:6
- Yi-Zeng Hsieh, Hau-Ching Chen, Yi-Hung Yeh:
Object Detection via Fisheye Camera. 112:1-112:7
- Yu-Shu Ni, Chia-Chi Tsai, Jyun-Syu Lin, Hsien-Po Meng, Po-Chi Hu, Jiun-Shiung Chen, Kun-Hung Lin, Chih-Yuan Chuang, Jiun-In Guo:
Summary of the 2023 PAIR-LITEON Competition: Embedded AI Object Detection Model Design Contest on Fish-eye Around-view Cameras. 113:1-113:7