Flipper: A Systematic Approach to Debugging Training Sets

Authors:

Christopher De Sa,

Christopher Ré

HILDA '17: Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics

Article No.: 5, Pages 1 - 5

https://doi.org/10.1145/3077257.3077263

Published: 14 May 2017

Abstract

As machine learning methods gain popularity across different fields, acquiring labeled training datasets has become the primary bottleneck in the machine learning pipeline. Recently, generative models have been used to create and label large amounts of training data, albeit noisily. The output of these generative models is then used to train a discriminative model of choice, such as logistic regression or a complex neural network. However, any errors in the generative model can propagate to the discriminative model trained on its output. Unfortunately, these generative models are not easily interpretable and are therefore difficult for users to debug. To address this, we present our vision for Flipper, a framework that gives users high-level information about why their training set is inaccurate and informs their decisions as they manually improve their generative model. We present potential tools within the Flipper framework, inspired by observing biomedical experts working with generative models, that allow users to analyze the errors in their training data systematically. Finally, we discuss a prototype of Flipper and report the results of a user study in which users create a training set for a classification task and, with feedback from Flipper, improve the discriminative model's accuracy by 2.4 points in less than an hour.
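The kind of feedback the abstract describes can be illustrated with a minimal sketch. It is not Flipper's implementation: an unweighted majority vote stands in for the generative model, and the heuristics, the toy task, and every name below (`noisy_label`, `heuristic_report`, `DEV_SET`) are hypothetical. The sketch labels text with a few noisy heuristics, then reports each heuristic's coverage and accuracy on a small hand-labeled set, so a user can see which source of labels is hurting the training set.

```python
ABSTAIN = 0  # a heuristic votes +1, -1, or 0 (no opinion)

# Hypothetical heuristics for a toy "is this sentence about a disease?" task.
def mentions_disease(text):
    return 1 if "disease" in text.lower() else ABSTAIN

def mentions_weather(text):
    return -1 if "weather" in text.lower() else ABSTAIN

def long_sentence(text):
    # Deliberately weak heuristic so the report has something to flag.
    return 1 if len(text.split()) > 6 else ABSTAIN

HEURISTICS = [mentions_disease, mentions_weather, long_sentence]

def noisy_label(text):
    """Unweighted majority vote (a stand-in for a learned generative model).

    Returns +1 or -1, or 0 when the votes tie or every heuristic abstains.
    """
    total = sum(h(text) for h in HEURISTICS)
    return (total > 0) - (total < 0)

def heuristic_report(dev_set):
    """Coverage and accuracy of each heuristic on hand-labeled (text, gold) pairs."""
    report = {}
    for h in HEURISTICS:
        fired = [(h(text), gold) for text, gold in dev_set if h(text) != ABSTAIN]
        correct = sum(1 for vote, gold in fired if vote == gold)
        report[h.__name__] = {
            "coverage": len(fired) / len(dev_set),
            "accuracy": correct / len(fired) if fired else None,
        }
    return report

# Tiny hand-labeled development set (assumed for illustration only).
DEV_SET = [
    ("Aspirin can reduce the risk of heart disease.", 1),
    ("The weather in Chicago was cold during the whole conference week.", -1),
    ("Lyme disease is spread by ticks.", 1),
    ("The weather was mild.", -1),
]

if __name__ == "__main__":
    for name, stats in heuristic_report(DEV_SET).items():
        print(name, stats)
```

On this toy data the `long_sentence` heuristic is right only half the time it fires, which is the sort of signal a user could act on by revising or removing that heuristic before training the discriminative model.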

Published In

HILDA '17: Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics

May 2017

89 pages

ISBN:9781450350297

DOI:10.1145/3077257

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SIGMOD/PODS'17

Sponsor:

SIGMOD

SIGMOD/PODS'17: International Conference on Management of Data

May 14 - 19, 2017

Chicago, IL, USA

Acceptance Rates

Overall Acceptance Rate 28 of 56 submissions, 50%
