Machine learning approaches

46 Results for: Book/Issue: MM '18: Proceedings of the 26th ACM international conference on Multimedia

Showing 1 - 20 of 46 Results

research-article
October 2018
Unprecedented Usage of Pre-trained CNNs on Beauty Product
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 2068–2072, https://doi.org/10.1145/3240508.3266433

How does a pre-trained Convolution Neural Network (CNN) model perform on beauty and personal care items (i.e., Perfect-500K)? This is the question we attempt to answer in this paper by adopting several well known deep learning models pre-trained on ...
Metrics
Total Citations: 5
Total Downloads: 198
Last 12 Months: 6
Last 6 weeks: 1

research-article
October 2018
On Reducing Effort in Evaluating Laparoscopic Skills
- Sabrina Kletz
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 815–819, https://doi.org/10.1145/3240508.3243934

Training and evaluation of laparoscopic skills have become an important aspect of young surgeons' education. The evaluation process is currently performed manually by experienced surgeons through reviewing video recordings of laparoscopic procedures for ...
Metrics
Total Citations: 0
Total Downloads: 140
Last 12 Months: 4
Last 6 weeks: 1

research-article
October 2018
VIVID: Virtual Environment for Visual Deep Learning
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1356–1359, https://doi.org/10.1145/3240508.3243653

Due to the advances in deep reinforcement learning and the demand of large training data, virtual-to-real learning has gained lots of attention from computer vision community recently. As state-of-the-art 3D engines can generate photo-realistic images ...
Metrics
Total Citations: 15
Total Downloads: 494
Last 12 Months: 34
Last 6 weeks: 6

research-article
October 2018
Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1976–1983, https://doi.org/10.1145/3240508.3241911

Speechreading or lipreading is the technique of understanding and getting phonetic features from a speaker's visual features such as movement of lips, face, teeth and tongue. It has a wide range of multimedia applications such as in surveillance, ...
Metrics
Total Citations: 16
Total Downloads: 358
Last 12 Months: 39
Last 6 weeks: 1

demonstration
October 2018
Magical Rice Bowl: A Real-time Food Category Changer
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1244–1246, https://doi.org/10.1145/3240508.3241391

In this demo, we demonstrate "Real-time Food Category Change" based on a Conditional Cycle GAN (cCycle GAN) with a large-scale food image data collected from the Twitter Stream. Conditional Cycle GAN is an extension of CycleGAN, which enables "Food ...
Metrics
Total Citations: 2
Total Downloads: 159
Last 12 Months: 9
Last 6 weeks: 2

research-article
October 2018
Partial Multi-view Subspace Clustering
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1794–1801, https://doi.org/10.1145/3240508.3240679

For many real-world multimedia applications, data are often described by multiple views. Therefore, multi-view learning researches are of great significance. Traditional multi-view clustering methods assume that each view has complete data. However, ...
Metrics
Total Citations: 41
Total Downloads: 688
Last 12 Months: 55
Last 6 weeks: 5

research-article
October 2018
A Large-scale RGB-D Database for Arbitrary-view Human Action Recognition
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1510–1518, https://doi.org/10.1145/3240508.3240675

Current researches mainly focus on single-view and multiview human action recognition, which can hardly satisfy the requirements of human-robot interaction (HRI) applications to recognize actions from arbitrary views. The lack of databases also sets up ...
Metrics
Total Citations: 45
Total Downloads: 556
Last 12 Months: 45
Last 6 weeks: 8

research-article
October 2018
Generating Defensive Plays in Basketball Games
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1580–1588, https://doi.org/10.1145/3240508.3240670

In this paper, we present a method to generate realistic defensive plays in a basketball game based on the ball and the offensive team's movements. Our system allows players and coaches to simulate how the opposing team will react to a newly developed ...
Metrics
Total Citations: 13
Total Downloads: 338
Last 12 Months: 37
Last 6 weeks: 4
Supplementary Material: fp0811.zip

research-article
October 2018
SibNet: Sibling Convolutional Encoder for Video Captioning
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1425–1434, https://doi.org/10.1145/3240508.3240667

Video captioning is a challenging task owing to the complexity of understanding the copious visual information in videos and describing it using natural language. Different from previous work that encodes video information using a single flow, in this ...
Metrics
Total Citations: 56
Total Downloads: 505
Last 12 Months: 23
Last 6 weeks: 1

research-article
October 2018
Learning Local Descriptors with Adversarial Enhancer from Volumetric Geometry Patches
- Jing Zhu
- Yi Fang
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1466–1474, https://doi.org/10.1145/3240508.3240666

Local matching problems (e.g. key point matching, geometry registration) are significant but challenging tasks in computer vision field. In this paper, we propose to learn a robust local 3D descriptor from volumetric point patches to tackle the local ...
Metrics
Total Citations: 0
Total Downloads: 185
Last 12 Months: 6
Last 6 weeks: 1

research-article
October 2018
Enhancing Visual Question Answering Using Dropout
- Zhiwei Fang
- Jing Liu
- Yanyuan Qiao
- Qu Tang
- Yong Li
- Hanqing Lu
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1002–1010, https://doi.org/10.1145/3240508.3240662

Using dropout in Visual Question Answering (VQA) is a common practice to prevent overfitting. However, in multi-path networks, the current way to use dropout may cause two problems: the co-adaptations of neurons and the explosion of output variance. In ...
Metrics
Total Citations: 4
Total Downloads: 332
Last 12 Months: 7
Last 6 weeks: 1

research-article
October 2018
User-Guided Deep Anime Line Art Colorization with Conditional Adversarial Networks
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1536–1544, https://doi.org/10.1145/3240508.3240661

Scribble colors based line art colorization is a challenging computer vision problem since neither greyscale values nor semantic information is presented in line arts, and the lack of authentic illustration-line art training pairs also increases ...
Metrics
Total Citations: 109
Total Downloads: 937
Last 12 Months: 63
Last 6 weeks: 9

research-article
October 2018
Fully Point-wise Convolutional Neural Network for Modeling Statistical Regularities in Natural Images
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 984–992, https://doi.org/10.1145/3240508.3240653

Modeling statistical regularity plays an essential role in ill-posed image processing problems. Recently, deep learning based methods have been presented to implicitly learn statistical representation of pixel distributions in natural images and ...
Metrics
Total Citations: 28
Total Downloads: 240
Last 12 Months: 10
Last 6 weeks: 1

research-article
October 2018
Attentive Recurrent Neural Network for Weak-supervised Multi-label Image Classification
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1092–1100, https://doi.org/10.1145/3240508.3240649

Multi-label image classification is a fundamental and challenging task in computer vision, and recently achieved significant progress by exploiting semantic relations among labels. However, the spatial positions of labels for multi-labels images are ...
Metrics
Total Citations: 30
Total Downloads: 451
Last 12 Months: 19
Last 6 weeks: 2

research-article
October 2018
End-to-End Blind Quality Assessment of Compressed Videos Using Deep Neural Networks
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 546–554, https://doi.org/10.1145/3240508.3240643

Blind video quality assessment (BVQA) algorithms are traditionally designed with a two-stage approach - a feature extraction stage that computes typically hand-crafted spatial and/or temporal features, and a regression stage working in the feature space ...
Metrics
Total Citations: 78
Total Downloads: 595
Last 12 Months: 47
Last 6 weeks: 5

research-article
October 2018
ThoughtViz: Visualizing Human Thoughts Using Generative Adversarial Network
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 950–958, https://doi.org/10.1145/3240508.3240641

Studying human brain signals has always gathered great attention from the scientific community. In Brain Computer Interface (BCI) research, for example, changes of brain signals in relation to specific tasks (e.g., thinking something) are detected and ...
Metrics
Total Citations: 50
Total Downloads: 1,576
Last 12 Months: 403
Last 6 weeks: 56

research-article
Public Access
October 2018
An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks
- Pu Zhao
- Sijia Liu
- Yanzhi Wang
- Xue Lin
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1065–1073, https://doi.org/10.1145/3240508.3240639

Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. That is, adversarial examples, obtained by adding delicately crafted distortions onto original legal inputs, can mislead a DNN to classify them as any target labels. In a ...
Metrics
Total Citations: 19
Total Downloads: 665
Last 12 Months: 83
Last 6 weeks: 7

research-article
October 2018
Learning and Fusing Multimodal Deep Features for Acoustic Scene Categorization
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1892–1900, https://doi.org/10.1145/3240508.3240631

Convolutional Neural Networks (CNNs) have been widely applied to audio classification recently where promising results have been obtained. Previous CNN-based systems mostly learn from two-dimensional time-frequency representations such as MFCC and ...
Metrics
Total Citations: 26
Total Downloads: 500
Last 12 Months: 23
Last 6 weeks: 1

research-article
October 2018
BeautyGAN: Instance-level Facial Makeup Transfer with Deep Generative Adversarial Network
- Tingting Li
- Ruihe Qian
- Chao Dong
- Si Liu
- Qiong Yan
- Wenwu Zhu
- Liang Lin
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 645–653, https://doi.org/10.1145/3240508.3240618

Facial makeup transfer aims to translate the makeup style from a given reference makeup face image to another non-makeup one while preserving face identity. Such an instance-level transfer problem is more challenging than conventional domain-level ...
Metrics
Total Citations: 157
Total Downloads: 2,455
Last 12 Months: 296
Last 6 weeks: 30
Supplementary Material: fp0538.zip

research-article
October 2018
WildFish: A Large Benchmark for Fish Recognition in the Wild
MM '18: Proceedings of the 26th ACM international conference on Multimedia, Pages 1301–1309, https://doi.org/10.1145/3240508.3240616

Fish recognition is an important task to understand the marine ecosystem and biodiversity. It is often challenging to identify fish species in the wild, due to the following difficulties. First, most fish benchmarks are small-scale, which may limit the ...
Metrics
Total Citations: 42
Total Downloads: 979
Last 12 Months: 154
Last 6 weeks: 21
