DOI: 10.1145/3469096.3474933
Short paper

Engineering of an artificial intelligence safety data sheet document processing system for environmental, health, and safety compliance

Published: 16 August 2021

Abstract

Chemical Safety Data Sheets (SDSs) are the primary method by which chemical manufacturers communicate the ingredients and hazards of their products to the public. SDSs are used for a wide variety of purposes, ranging from environmental calculations to occupational health assessments to emergency response measures. Although a few companies have provided direct digital data transfer platforms using XML or equivalent schemata, the vast majority of chemical ingredient and hazard communication to product users still occurs through millions of PDF documents, most of which are loaded into downstream user databases by manual data entry. This research focuses on reverse engineering SDS document types to adapt to various layouts, and on harnessing meta-algorithmic and neural network approaches to move industrial institutions towards a digital, universal SDS processing methodology. The complexities of SDS documents, including the lack of format standardization, combinations of text and images, and multilingual translation needs, together limit the accuracy and precision of optical character recognition tools.
The approach in this document is to translate entire SDSs from thousands of chemical vendors, each with distinct formatting, to machine-encoded text with a high degree of accuracy and precision. The system will then "read" and assess these documents as a human would: ensuring that the documents are compliant, determining whether chemical formulations have changed, ensuring that reported values are within expected thresholds, and comparing them to similar products to identify more environmentally friendly alternatives.
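As an illustration only, two of the assessment steps named in the abstract (detecting a changed formulation and checking that reported values fall within expected thresholds) could be sketched as follows. This is not the authors' implementation: the `Ingredient` record, the function names, the 0.5 wt% tolerance, and the example limits are all assumptions made for this sketch, and it presumes the SDS text has already been extracted into structured fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ingredient:
    """Hypothetical record for one extracted SDS composition row."""
    cas_number: str        # CAS Registry Number, e.g. "67-64-1"
    name: str
    weight_percent: float


def formulation_changes(old, new, tolerance=0.5):
    """List differences between two ingredient lists, keyed by CAS number.

    `tolerance` (in wt%) is an assumed cut-off below which a change in
    concentration is ignored.
    """
    old_by_cas = {i.cas_number: i for i in old}
    new_by_cas = {i.cas_number: i for i in new}
    changes = []
    for cas in sorted(old_by_cas.keys() - new_by_cas.keys()):
        changes.append(f"removed: {old_by_cas[cas].name} ({cas})")
    for cas in sorted(new_by_cas.keys() - old_by_cas.keys()):
        changes.append(f"added: {new_by_cas[cas].name} ({cas})")
    for cas in sorted(old_by_cas.keys() & new_by_cas.keys()):
        delta = new_by_cas[cas].weight_percent - old_by_cas[cas].weight_percent
        if abs(delta) > tolerance:
            changes.append(
                f"changed: {new_by_cas[cas].name} ({cas}) by {delta:+.1f} wt%"
            )
    return changes


def out_of_range(values, limits):
    """Return the names of reported values outside their [low, high] limits.

    Properties with no entry in `limits` are treated as unbounded.
    """
    flagged = []
    for name, value in values.items():
        low, high = limits.get(name, (float("-inf"), float("inf")))
        if not low <= value <= high:
            flagged.append(name)
    return sorted(flagged)


# Illustrative data, not taken from any real SDS.
old_sds = [Ingredient("67-64-1", "Acetone", 60.0),
           Ingredient("108-88-3", "Toluene", 40.0)]
new_sds = [Ingredient("67-64-1", "Acetone", 70.0),
           Ingredient("64-17-5", "Ethanol", 30.0)]

for line in formulation_changes(old_sds, new_sds):
    print(line)

print(out_of_range({"pH": 15.2, "flash_point_c": -20.0},
                   {"pH": (0.0, 14.0), "flash_point_c": (-50.0, 300.0)}))
```

Keying ingredients on CAS number rather than name is a design choice for this sketch: vendor SDSs often spell the same substance differently, whereas the registry number is stable across documents.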

Supplementary Material

ZIP File (a12-fenton.zip)
Supplemental material.

Cited By

  • (2024) CatalogBank: A Structured and Interoperable Catalog Dataset with a Semi-Automatic Annotation Tool (DocumentLabeler) for Engineering System Design. Proceedings of the ACM Symposium on Document Engineering 2024, pp. 1-9. DOI: 10.1145/3685650.3685665. Online publication date: 20-Aug-2024
  • (2023) Intelligent Document Processing in End-to-End RPA Contexts: A Systematic Literature Review. Confluence of Artificial Intelligence and Robotic Process Automation, pp. 95-131. DOI: 10.1007/978-981-19-8296-5_5. Online publication date: 14-Mar-2023

Information & Contributors

Information

Published In

DocEng '21: Proceedings of the 21st ACM Symposium on Document Engineering
August 2021
178 pages
ISBN: 9781450385961
DOI: 10.1145/3469096
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. EHS compliance
  2. meta-algorithmics
  3. neural networks
  4. optical character recognition
  5. safety data sheets
  6. validation

Qualifiers

  • Short-paper

Conference

DocEng '21
DocEng '21: ACM Symposium on Document Engineering 2021
August 24 - 27, 2021
Limerick, Ireland

Acceptance Rates

Overall Acceptance Rate 194 of 564 submissions, 34%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 30
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 16 Nov 2024
