ICMR 2018: Yokohama, Japan
- Kiyoharu Aizawa, Michael S. Lew, Shin'ichi Satoh:
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, ICMR 2018, Yokohama, Japan, June 11-14, 2018. ACM 2018
Keynote 1
- Kohji Mitani:
The Ongoing Evolution of Broadcast Technology. 1
Keynote 2
- Shunji Yamanaka:
Prototyping for Envisioning the Future. 2
Industrial Talks
- Yusuke Fujisaka:
Orion: An Integrated Multimedia Content Moderation System for Web Services. 3
- Tomokazu Murakami:
Industrial Applications of Image Recognition and Retrieval Technologies for Public Safety and IT Services. 4
- Kota Iwamoto:
NEC's Object Recognition Technologies and their Industrial Applications. 5
- Yoji Kiyota:
Promoting Open Innovations in Real Estate Tech: Provision of the LIFULL HOME'S Data Set and Collaborative Studies. 6
Tutorials
- Hanwang Zhang, Qianru Sun:
Objects, Relationships, and Context in Visual Data. 7
- Xiangnan He, Hanwang Zhang, Tat-Seng Chua:
Recommendation Technologies for Multimedia Content. 8
- Guo-Jun Qi:
Multimedia Content Understanding by Learning from Very Few Examples: Recent Progress on Unsupervised, Semi-Supervised and Supervised Deep Learning Approaches. 9
Best Paper Session
- Gonçalo Marcelino, Ricardo Pinto, João Magalhães:
Ranking News-Quality Multimedia. 10-18
- Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury:
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval. 19-27
- Shizhe Chen, Jia Chen, Qin Jin, Alexander G. Hauptmann:
Class-aware Self-Attention for Audio Event Recognition. 28-36
- Andrea Ceroni, Chenyang Ma, Ralph Ewerth:
Mining Exoticism from Visual Content with Fusion-based Deep Neural Networks. 37-45
Oral Session 1: Multimedia Retrieval
- Xing Xu, Jingkuan Song, Huimin Lu, Yang Yang, Fumin Shen, Zi Huang:
Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval. 46-54
- Kevin Joslyn, Kai Li, Kien A. Hua:
Cross-Modal Retrieval Using Deep De-correlated Subspace Ranking Hashing. 55-63
- Ge Song, Xiaoyang Tan:
Learning Multilevel Semantic Similarity for Large-Scale Multi-Label Image Retrieval. 64-72
- Limeng Cui, Zhensong Chen, Jiawei Zhang, Lifang He, Yong Shi, Philip S. Yu:
Multi-view Collective Tensor Decomposition for Cross-modal Hashing. 73-81
- Lei Zhou, Xiao Bai, Xianglong Liu, Jun Zhou:
Binary Coding by Matrix Classifier for Efficient Subspace Retrieval. 82-90
- Zhongyan Zhang, Lei Wang, Yang Wang, Luping Zhou, Jianjia Zhang, Fang Chen:
Instance Image Retrieval by Aggregating Sample-based Discriminative Characteristics. 91-99
Oral Session 2: Multimedia Content Analysis
- Wenjie Zhang, Junchi Yan, Xiangfeng Wang, Hongyuan Zha:
Deep Extreme Multi-label Learning. 100-107
- Feiran Huang, Xiaoming Zhang, Chaozhuo Li, Zhoujun Li, Yueying He, Zhonghua Zhao:
Multimodal Network Embedding via Attention based Multi-view Variational Autoencoder. 108-116
- Devanshu Arya, Marcel Worring:
Exploiting Relational Information in Social Networks using Geometric Deep Learning on Hypergraphs. 117-125
- Matthias Zeppelzauer, Miroslav Despotovic, Muntaha Sakeena, David Koch, Mario Döller:
Automatic Prediction of Building Age from Photographs. 126-134
- Kejun Zhang, Hui Zhang, Simeng Li, Chang-yuan Yang, Lingyun Sun:
The PMEmo Dataset for Music Emotion Recognition. 135-142
Oral Session 3: Multimedia Applications
- Zunlei Feng, Zhenyun Yu, Yezhou Yang, Yongcheng Jing, Junxiao Jiang, Mingli Song:
Interpretable Partitioned Embedding for Customized Multi-item Fashion Outfit Composition. 143-151
- Peirui Cheng, Weiqiang Wang:
A Multi-Oriented Scene Text Detector with Position-Sensitive Segmentation. 152-159
- Lan Wang, Yang Wang, Susu Shan, Feng Su:
Scene Text Detection and Tracking in Video with Background Cues. 160-168
Oral Session 4: Video Analysis
- Yang Mi, Kang Zheng, Song Wang:
Recognizing Actions in Wearable-Camera Videos by Training Classifiers on Fixed-Camera Videos. 169-177
- Romain Cohendet, Karthik Yadati, Ngoc Q. K. Duong, Claire-Hélène Demarty:
Annotating, Understanding, and Predicting Long-term Video Memorability. 178-186
- Daniel Rotman, Dror Porat, Gal Ashour, Udi Barzelay:
Optimally Grouped Deep Features Using Normalized Cost for Video Scene Detection. 187-195
Poster Paper Session
- Hanjiang Lai:
Transductive Zero-Shot Hashing via Coarse-to-Fine Similarity Mining. 196-203
- Xin Luo, Peng-Fei Zhang, Ye Wu, Zhen-Duo Chen, Hua-Junjie Huang, Xin-Shun Xu:
Asymmetric Discrete Cross-Modal Hashing. 204-212
- Xiang Zhang, Guohua Dong, Yimo Du, Chengkun Wu, Zhigang Luo, Canqun Yang:
Collaborative Subspace Graph Hashing for Cross-modal Retrieval. 213-221
- Ye Wu, Xin Luo, Xin-Shun Xu, Shanqing Guo, Yuliang Shi:
Dictionary Learning based Supervised Discrete Hashing for Cross-Media Retrieval. 222-230
- Bingqing Ke, Jie Shao, Zi Huang, Heng Tao Shen:
Feature Reconstruction by Laplacian Eigenmaps for Efficient Instance Search. 231-239
- Zachary Seymour, Zhongfei (Mark) Zhang:
Image Annotation Retrieval with Text-Domain Label Denoising. 240-248
- Zachary Seymour, Zhongfei (Mark) Zhang:
Multi-label Triplet Embeddings for Image Annotation from User-Generated Tags. 249-256
- Chandramani Chaudhary, Poonam Goyal, Joel Ruben Antony Moniz, Navneet Goyal, Yi-Ping Phoebe Chen:
Linguistic Patterns and Cross Modality-based Image Retrieval for Complex Queries. 257-265
- Minh-Son Dao, Pham Quang Nhat Minh, Asem Kasem, Mohamed Saleem Haja Nazmudeen:
A Context-Aware Late-Fusion Approach for Disaster Image Retrieval from Social Media. 266-273
- Yugo Sato, Tsukasa Fukusato, Shigeo Morishima:
Face Retrieval Framework Relying on User's Visual Memory. 274-282
- Xueping Wang, Weixin Li, Guodong Mu, Di Huang, Yunhong Wang:
Facial Expression Synthesis by U-Net Conditional Generative Adversarial Networks. 283-290
- Hongzhi Li, Joseph G. Ellis, Lei Zhang, Shih-Fu Chang:
PatternNet: Visual Pattern Mining with Deep Neural Network. 291-299
- Mingjie Zheng, Sheng-hua Zhong, Songtao Wu, Jianmin Jiang:
Steganographer Detection based on Multiclass Dilated Residual Networks. 300-308
- Maguell L. T. L. Sandifort, Jianquan Liu, Shoji Nishimura, Wolfgang Hürst:
An Entropy Model for Loiterer Retrieval across Multiple Surveillance Cameras. 309-317
- Philipp Harzig, Christian Eggert, Rainer Lienhart:
Visual Question Answering With a Hybrid Convolution Recurrent Model. 318-325
- Shuai Liao, Efstratios Gavves, Cees G. M. Snoek:
Searching and Matching Texture-free 3D Shapes in Images. 326-334
- Duc-Tien Dang-Nguyen, Michael Riegler, Liting Zhou, Cathal Gurrin:
Challenges and Opportunities within Personal Life Archives. 335-343
- Xu Sun, Yuantian Wang, Tongwei Ren, Zhi Liu, Zheng-Jun Zha, Gangshan Wu:
Object Trajectory Proposal via Hierarchical Volume Grouping. 344-352
- Sungeun Hong, Woobin Im, Hyun Seung Yang:
CBVMR: Content-Based Video-Music Retrieval Using Soft Intra-Modal Structure Constraint. 353-361
- Yi Tang, Wenbin Zou, Zhi Jin, Xia Li:
Multi-Scale Spatiotemporal Conv-LSTM Network for Video Saliency Detection. 362-369
- Jianfei Xue, Koji Eguchi:
Supervised Nonparametric Multimodal Topic Modeling Methods for Multi-class Video Classification. 370-378
- Baohan Xu, Hao Ye, Yingbin Zheng, Heng Wang, Tianyu Luwang, Yu-Gang Jiang:
Dense Dilated Network for Few Shot Action Recognition. 379-387
- Haonan Qiu, Yingbin Zheng, Hao Ye, Yao Lu, Feng Wang, Liang He:
Precise Temporal Action Localization by Evolving Temporal Proposals. 388-396
Special Session 1: Predicting User Perceptions of Multimedia Content
- Dmitry Kuzovkin, Tania Pouli, Rémi Cozot, Olivier Le Meur, Jonathan Kervec, Kadi Bouatouch:
Image Selection in Photo Albums. 397-404
- Yasemin Timar, Nihan Karslioglu, Heysem Kaya, Albert Ali Salah:
Feature Selection and Multimodal Fusion for Estimating Emotions Evoked by Movie Clips. 405-412
- Sarath Sivaprasad, Tanmayee Joshi, Rishabh Agrawal, Niranjan Pedanekar:
Multimodal Continuous Prediction of Emotions in Movies using Long Short-Term Memory Networks. 413-419
- Yang Liu, Zhonglei Gu, Tobey H. Ko, Kien A. Hua:
Learning Perceptual Embeddings with Two Related Tasks for Joint Predictions of Media Interestingness and Emotions. 420-427
- Jayneel Parekh, Harshvardhan Tibrewal, Sanjeel Parekh:
Deep Pairwise Classification and Ranking for Predicting Media Interestingness. 428-433
- Iván González-Díaz, Jenny Benois-Pineau, Jean-Philippe Domenger, Aymar de Rugy:
Perceptually-guided Understanding of Egocentric Video Content: Recognition of Objects to Grasp. 434-441
- Wenlu Yang, Maria Rifqi, Christophe Marsala, Andréa Pinna:
Towards Better Understanding of Player's Game Experience. 442-449
Special Session 2: Social-Media Visual Summarization / Large-Scale 3D Multimedia Analysis and Applications
- Po-Yao Huang, Junwei Liang, Jean-Baptiste Lamare, Alexander G. Hauptmann:
Multimodal Filtering of Social Media for Temporal Monitoring and Event Analysis. 450-457
- Xiangyu Yue, Bichen Wu, Sanjit A. Seshia, Kurt Keutzer, Alberto L. Sangiovanni-Vincentelli:
A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving. 458-464
- Guoyu Lu, Jingkuan Song:
3D Image-based Indoor Localization Joint With WiFi Positioning. 465-472
- Zhiwei Li, Lei Yu:
Compare Stereo Patches Using Atrous Convolutional Neural Networks. 473-480
Doctoral Symposium Session
- Wan-Lun Tsai:
Personal Basketball Coach: Tactic Training through Wireless Virtual Reality. 481-484
- Andreas Leibetseder, Klaus Schoeffmann:
Extracting and Using Medical Expert Knowledge to Advance in Video Processing for Gynecologic Endoscopy. 485-488
- Noa Garcia:
Temporal Aggregation of Visual Features for Large-Scale Image-to-Video Retrieval. 489-492
- Naoki Saito, Takahiro Ogawa, Satoshi Asamizu, Miki Haseyama:
Tourism Category Classification on Image Sharing Services Through Estimation of Existence of Reliable Results. 493-496
- Rashmi Gupta:
Considering Documents in Lifelog Information Retrieval. 497-500
Demonstration Session
- Longhui Wei, Xiaobin Liu, Jianing Li, Shiliang Zhang:
VP-ReID: Vehicle and Person Re-Identification System. 501-504
- Maguell L. T. L. Sandifort, Jianquan Liu, Shoji Nishimura, Wolfgang Hürst:
VisLoiter+: An Entropy Model-Based Loiterer Retrieval System with User-Friendly Interfaces. 505-508
- Kengo Makino, Wenjie Duan, Rui Ishiyama, Toru Takahashi, Yuta Kudo, Pieter Jonker:
Automated Scanning and Individual Identification System for Parts without Marking or Tagging. 509-512
- Nico Hezel, Kai Uwe Barthel:
Dynamic Construction and Manipulation of Hierarchical Quartic Image Graphs. 513-516
- Jonas Krause, Gavin Sugita, Kyungim Baek, Lipyeow Lim:
WTPlant (What's That Plant?): A Deep Learning System for Identifying Plants in Natural Images. 517-520
- Matthew Cooper, Jian Zhao, Chidansh Amitkumar Bhatt, David A. Shamma:
MOOCex: Exploring Educational Video via Recommendation. 521-524
- Yangbangyan Jiang, Qianqian Xu, Xiaochun Cao, Qingming Huang:
Who to Ask: An Intelligent Fashion Consultant. 525-528
- Po-Wen Chou, Fu-Neng Lin, Keh-Ning Chang, Herng-Yow Chen:
A Simple Score Following System for Music Ensembles Using Chroma and Dynamic Time Warping. 529-532