DOI: 10.1145/2063576.2063846
research-article

Information extraction from pathology reports in a hospital setting

Published: 24 October 2011

Abstract

As more health data becomes available, information extraction aims to make an impact on the workflows of hospitals and care centers. One of the targeted areas is the management of pathology reports, which are employed for cancer diagnosis and staging. In this work we integrate text mining tools into the workflow of the Royal Melbourne Hospital to extract information from pathology reports with minimal expert intervention. Our framework relies on coarse-grained annotation (at document level), making it highly portable. Our evaluation shows that the kind of language used in these reports makes it feasible to extract information with high precision and recall by means of state-of-the-art classification methods and feature engineering.
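
As a rough illustration of what document-level classification evaluated by precision and recall can look like, here is a minimal Python sketch. It is not the authors' system: the abstract does not name the classifiers or features used, so a unigram Naive Bayes model stands in for them. The example reports, the labels `malignant` and `benign`, and all function and class names are hypothetical and invented for this sketch.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a deliberately simple feature extractor)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over unigram counts.

    Assumption: this is a stand-in classifier, not the method used in the paper.
    """

    def fit(self, docs, labels):
        # One label per document: the coarse-grained annotation level.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total_docs)
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision_recall(gold, predicted, positive):
    """Precision and recall for one class, treating `positive` as the target label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy reports, made up for illustration only.
train_docs = [
    "sections show invasive adenocarcinoma extending into muscularis propria",
    "moderately differentiated adenocarcinoma with lymph node metastasis",
    "benign hyperplastic polyp no evidence of malignancy",
    "normal colonic mucosa no dysplasia seen",
]
train_labels = ["malignant", "malignant", "benign", "benign"]

test_docs = [
    "adenocarcinoma invading muscularis propria",
    "hyperplastic polyp no dysplasia",
]
test_gold = ["malignant", "benign"]

model = NaiveBayes().fit(train_docs, train_labels)
predictions = [model.predict(doc) for doc in test_docs]
precision, recall = precision_recall(test_gold, predictions, positive="malignant")
print(predictions, precision, recall)
```

Because each training example carries a single label for the whole report, no span-level markup is needed, which is the sense in which document-level annotation keeps the annotation burden low.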

    Published In

    CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
    October 2011
    2712 pages
    ISBN: 9781450307178
    DOI: 10.1145/2063576
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. machine learning
    2. text mining

    Qualifiers

    • Research-article

    Conference

    CIKM '11
    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 11
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 28 Sep 2024

    Citations

    Cited By

    • (2024) Development of message passing-based graph convolutional networks for classifying cancer pathology reports. BMC Medical Informatics and Decision Making, 24:S5. DOI: 10.1186/s12911-024-02662-5. Online publication date: 17-Sep-2024
    • (2023) Quantitative feature extraction of unstructured data from GitLab BioAI pathology reports of cancer using an enhanced RPA NLP method. Journal of Intelligent & Fuzzy Systems, 45:4 (5265-5276). DOI: 10.3233/JIFS-231625. Online publication date: 4-Oct-2023
    • (2021) A Text Mining Approach in the Classification of Free-Text Cancer Pathology Reports from the South African National Health Laboratory Services. Information, 12:11 (451). DOI: 10.3390/info12110451. Online publication date: 30-Oct-2021
    • (2021) Use of the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) for Processing Free Text in Health Care: Systematic Scoping Review. Journal of Medical Internet Research, 23:1 (e24594). DOI: 10.2196/24594. Online publication date: 26-Jan-2021
    • (2021) BioIE: Biomedical Information Extraction with Multi-head Attention Enhanced Graph Convolutional Network. 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2080-2087). DOI: 10.1109/BIBM52615.2021.9669650. Online publication date: 9-Dec-2021
    • (2021) Cancer Registry Coding via Hybrid Neural Symbolic Systems in the Cross-Hospital Setting. IEEE Access, 9 (112081-112096). DOI: 10.1109/ACCESS.2021.3099175. Online publication date: 2021
    • (2021) Prognostic elements extraction from documents to detect prognostic stage. Computer Methods in Biomechanics and Biomedical Engineering, 25:4 (371-386). DOI: 10.1080/10255842.2021.1955359. Online publication date: 28-Jul-2021
    • (2021) Information extraction for prognostic stage prediction from breast cancer medical records using NLP and ML. Medical & Biological Engineering & Computing. DOI: 10.1007/s11517-021-02399-7. Online publication date: 23-Jul-2021
    • (2020) Artificial intelligence-driven structurization of diagnostic information in free-text pathology reports. Journal of Pathology Informatics, 11:1 (4). DOI: 10.4103/jpi.jpi_30_19. Online publication date: 2020
    • (2020) Generating high-quality data abstractions from scanned clinical records: text-mining-assisted extraction of endometrial carcinoma pathology features as proof of principle. BMJ Open, 10:6 (e037740). DOI: 10.1136/bmjopen-2020-037740. Online publication date: 11-Jun-2020
