DOI: 10.1109/HICSS.2014.330

Text Simplification Tools: Using Machine Learning to Discover Features that Identify Difficult Text

Published: 06 January 2014

Abstract

Although providing understandable information is a critical component in healthcare, few tools exist to help clinicians identify difficult sections in text. We systematically examine sixteen features for predicting the difficulty of health texts using six different machine learning algorithms. Three of these are new features not previously examined: medical concept density, specificity (calculated using word-level depth in MeSH), and ambiguity (calculated using the number of UMLS Metathesaurus concepts associated with a word). We examine these features for a binary prediction task on 118,000 simple and difficult sentences from a sentence-aligned corpus. Using all features, the random forest classifier is the most accurate, at 84%. Model analysis of the six models and a complementary ablation study show that the specificity and ambiguity features are the strongest predictors (24% combined impact on accuracy). Notably, a training size study showed that even with a 1% sample (1,062 sentences), an accuracy of 80% can be achieved.
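To make the three new features concrete, the following is a minimal Python sketch of how they might be computed for one sentence. It is not the authors' implementation. The lookup tables `MESH_DEPTH` and `UMLS_CONCEPT_COUNT` are tiny hypothetical stand-ins for MeSH and the UMLS Metathesaurus, their values are invented for illustration, and the choice to average word-level values over the sentence and to define density as the fraction of tokens found in either table is an assumption; the abstract does not state how word-level values are aggregated.

```python
# Hypothetical toy tables; real values would come from MeSH and the UMLS Metathesaurus.
MESH_DEPTH = {"disease": 1, "infection": 2, "pneumonia": 4}       # word -> depth in hierarchy
UMLS_CONCEPT_COUNT = {"cold": 5, "infection": 2, "pneumonia": 1}  # word -> number of concepts


def tokenize(sentence):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?()\"'").lower() for w in sentence.split())
    return [t for t in tokens if t]


def mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def sentence_features(sentence):
    """Return the three features for one sentence (aggregation by mean is an assumption)."""
    tokens = tokenize(sentence)
    depths = [MESH_DEPTH[t] for t in tokens if t in MESH_DEPTH]
    counts = [UMLS_CONCEPT_COUNT[t] for t in tokens if t in UMLS_CONCEPT_COUNT]
    medical = [t for t in tokens if t in MESH_DEPTH or t in UMLS_CONCEPT_COUNT]
    return {
        # Fraction of tokens recognized as medical terms.
        "concept_density": len(medical) / len(tokens) if tokens else 0.0,
        # Mean hierarchy depth of recognized words; deeper means more specific.
        "specificity": mean(depths),
        # Mean number of concepts per recognized word; more means more ambiguous.
        "ambiguity": mean(counts),
    }


features = sentence_features("Pneumonia is an infection.")
```

Feature dictionaries like this one, computed per sentence, could then be fed to any binary classifier together with the other features the paper examines.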

Published In

HICSS '14: Proceedings of the 2014 47th Hawaii International Conference on System Sciences
January 2014
5085 pages
ISBN: 9781479925049

Publisher

IEEE Computer Society, United States

Author Tags

1. machine learning
2. text readability
3. text simplification

Qualifiers

• Article
