Research article. DOI: 10.1145/3514221.3517907

Finding Label and Model Errors in Perception Data With Learned Observation Assertions

Published: 11 June 2022

Abstract

ML is being deployed in complex, real-world scenarios where errors have impactful consequences. In these systems, thorough testing of the ML pipelines is critical. A key component in ML deployment pipelines is the curation of labeled training data. Common practice in the ML literature assumes that labels are the ground truth. However, in our experience in a large autonomous vehicle development center, we have found that vendors can often provide erroneous labels, which can lead to downstream safety risks in trained models.
To address these issues, we propose a new abstraction, learned observation assertions, and implement it in a system called Fixy. Fixy leverages existing organizational resources, such as existing (possibly noisy) labeled datasets or previously trained ML models, to learn a probabilistic model for finding errors in human- or model-generated labels. Given user-provided features and these existing resources, Fixy learns feature distributions that specify likely and unlikely values (e.g., that a speed of 30 mph is likely but 300 mph is unlikely). It then uses these feature distributions to score labels for potential errors. We show that Fixy can automatically rank potential errors in real datasets with up to 2x higher precision compared to recent work on model assertions and standard techniques such as uncertainty sampling. Furthermore, Fixy can uncover labeling errors in 70% of scenes in a popular autonomous vehicle dataset.
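
The idea of learning a feature distribution and then scoring labels against it can be illustrated with a minimal sketch. This is not Fixy's implementation: the abstract does not say which density model Fixy uses, so a single Gaussian is swapped in here as a simple stand-in, and the function names, the `speed_mph` field, and the sample values are all hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def learn_feature_distribution(values):
    """Fit a Gaussian to observed feature values (a stand-in density model)."""
    return NormalDist(mu=mean(values), sigma=stdev(values))


def log_density(dist, value):
    """Log-density of a value under the Gaussian; lower means less likely."""
    z = (value - dist.mean) / dist.stdev
    return -0.5 * z * z - math.log(dist.stdev * math.sqrt(2 * math.pi))


def rank_potential_errors(dist, labels):
    """Sort labels from least to most likely under the learned distribution."""
    return sorted(labels, key=lambda label: log_density(dist, label["speed_mph"]))


# Hypothetical speeds (mph) drawn from existing, possibly noisy labels.
observed_speeds = [25.0, 28.0, 30.0, 31.0, 33.0, 35.0, 29.0, 27.0]
speed_dist = learn_feature_distribution(observed_speeds)

# Hypothetical new labels to check for errors.
new_labels = [
    {"id": "a", "speed_mph": 30.0},
    {"id": "b", "speed_mph": 300.0},
    {"id": "c", "speed_mph": 45.0},
]
ranked = rank_potential_errors(speed_dist, new_labels)
print([label["id"] for label in ranked])
```

Under these assumed values, the 300 mph label sorts first as the most suspicious and the 30 mph label sorts last. Working in log space avoids numerical underflow for values far from the mean.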

Published In

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data
June 2022
2597 pages
ISBN: 9781450392495
DOI: 10.1145/3514221
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. autonomous vehicles
  2. datasets
  3. error finding
  4. perception

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '22

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 67
  • Downloads (Last 6 weeks): 5
Reflects downloads up to 18 Nov 2024

Cited By

  • (2024) spade: Synthesizing Data Quality Assertions for Large Language Model Pipelines. Proceedings of the VLDB Endowment 17:12 (4173-4186). DOI: 10.14778/3685800.3685835. Online publication date: 1-Aug-2024
  • (2024) Datactive: Data Fault Localization for Object Detection Systems. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (895-907). DOI: 10.1145/3650212.3680329. Online publication date: 11-Sep-2024
  • (2024) Data Management for ML-Based Analytics and Beyond. ACM / IMS Journal of Data Science 1:1 (1-23). DOI: 10.1145/3611093. Online publication date: 16-Jan-2024
  • (2024) Identifying Label Errors in Object Detection Datasets by Loss Inspection. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (4570-4579). DOI: 10.1109/WACV57701.2024.00452. Online publication date: 3-Jan-2024
  • (2024) A Comprehensive Survey of Deep Transfer Learning for Anomaly Detection in Industrial Time Series: Methods, Applications, and Directions. IEEE Access 12 (3768-3789). DOI: 10.1109/ACCESS.2023.3349132. Online publication date: 2024
  • (2024) Seeing the invisible: test prioritization for object detection system. Empirical Software Engineering 29:6. DOI: 10.1007/s10664-024-10539-4. Online publication date: 23-Sep-2024
  • (2023) From Bias to Repair: Error as a Site of Collaboration and Negotiation in Applied Data Science Work. Proceedings of the ACM on Human-Computer Interaction 7:CSCW1 (1-32). DOI: 10.1145/3579607. Online publication date: 16-Apr-2023
  • (2023) From Concept to Implementation: The Data-Centric Development Process for AI in Industry. 2023 10th IEEE Swiss Conference on Data Science (SDS) (73-76). DOI: 10.1109/SDS57534.2023.00017. Online publication date: Jun-2023
