Search Results (312)

Search Parameters:
Keywords = named entity recognition

15 pages, 2841 KiB  
Article
Named Entity Recognition for Equipment Fault Diagnosis Based on RoBERTa-wwm-ext and Deep Learning Integration
by Feifei Gao, Lin Zhang, Wenfeng Wang, Bo Zhang, Wei Liu, Jingyi Zhang and Le Xie
Electronics 2024, 13(19), 3935; https://doi.org/10.3390/electronics13193935 (registering DOI) - 5 Oct 2024
Viewed by 256
Abstract
Named entity recognition (NER) for equipment fault diagnosis extracts specific entities from Chinese equipment fault diagnosis text, which is a prerequisite for constructing an equipment fault diagnosis knowledge graph. It can also provide important data for equipment maintenance support. Equipment fault diagnosis text has complex semantics, fuzzy entity boundaries, and limited data size. To extract entities from such text, this paper presents an NER model for equipment fault diagnosis based on the integration of RoBERTa-wwm-ext and deep learning networks. First, the model uses RoBERTa-wwm-ext to extract context-sensitive embeddings of text sequences. Second, contextual feature information is obtained through a BiLSTM network. Third, a CRF layer outputs the label sequence subject to the constraints between labels, which improves the accuracy of the sequence labeling task and completes entity recognition. Finally, experiments and predictions are carried out on the constructed dataset. The results show that the model effectively identifies five types of equipment fault diagnosis entities and scores higher on the evaluation metrics than the traditional model, with precision, recall, and F1 values of 94.57%, 95.39%, and 94.98%, respectively. A case study shows that the model accurately recognizes the entities in the input text.
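The abstract above ends its pipeline with a CRF layer that outputs a label sequence respecting constraints between labels. The following is a minimal sketch of that decoding step only, assuming a BIO tagging scheme and hand-written scores; the label name `EQP`, the score values, and the function `viterbi_decode` are illustrative assumptions, not taken from the paper, and the RoBERTa-wwm-ext and BiLSTM stages are replaced by a fixed emission table.

```python
NEG_INF = float("-inf")


def viterbi_decode(emissions, transitions, labels):
    """Return the highest-scoring label sequence under transition constraints.

    emissions:   list of dicts (one per token) mapping label -> score
    transitions: dict mapping (prev_label, label) -> score;
                 pairs missing from the dict are treated as forbidden
    """
    # best[t][y] is the best score of any path that ends in label y at token t
    best = [{y: emissions[0][y] for y in labels}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            candidates = [
                (best[-1][p] + transitions.get((p, y), NEG_INF), p) for p in labels
            ]
            top_score, top_prev = max(candidates)
            scores[y] = top_score + emissions[t][y]
            pointers[y] = top_prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical label set: one entity type "EQP" in BIO format
labels = ["O", "B-EQP", "I-EQP"]
# Every transition scores 0 except O -> I-EQP, which is forbidden
transitions = {
    (p, y): 0.0 for p in labels for y in labels if not (p == "O" and y == "I-EQP")
}
# Made-up per-token scores standing in for the encoder output
emissions = [
    {"O": 1.0, "B-EQP": 0.8, "I-EQP": 0.1},
    {"O": 0.2, "B-EQP": 0.3, "I-EQP": 1.5},
    {"O": 1.0, "B-EQP": 0.1, "I-EQP": 0.2},
]

greedy = [max(labels, key=lambda y: e[y]) for e in emissions]
decoded = viterbi_decode(emissions, transitions, labels)
print("greedy :", greedy)
print("viterbi:", decoded)
```

With these toy scores, picking the best label per token independently yields O, I-EQP, O, which violates the BIO scheme, while the constrained decoder returns B-EQP, I-EQP, O. A trained CRF would learn the transition scores rather than have them set by hand.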
Figures:
Figure 1. NER model based on RoBERTa-wwm-ext-BiLSTM-CRF network.
Figure 2. The structure of Transformer.
Figure 3. The structure of BERT.
Figure 4. Input representation of BERT.
Figure 5. Input representation of RoBERTa-wwm-ext.
Figure 6. The internal structure of an LSTM network unit.
Figure 7. The structure of BiLSTM network.
Figure 8. The structure of CRF.
Figure 9. Variation of F1 value as the number of training epochs increases.
Figure 10. Entity recognition of the input text by the model.
36 pages, 13506 KiB  
Article
ChatGeoAI: Enabling Geospatial Analysis for Public through Natural Language, with Large Language Models
by Ali Mansourian and Rachid Oucheikh
ISPRS Int. J. Geo-Inf. 2024, 13(10), 348; https://doi.org/10.3390/ijgi13100348 - 1 Oct 2024
Viewed by 733
Abstract
Large Language Models (LLMs) such as GPT, BART, and Gemini stand at the forefront of Generative Artificial Intelligence, showcasing remarkable prowess in natural language comprehension and task execution. This paper proposes a novel framework built on Llama 2 that aims to bridge the gap between natural language queries and executable code for geospatial analyses within the PyQGIS environment. It empowers non-expert users to leverage GIS technology without requiring deep knowledge of geospatial programming or tools. Through Natural Language Processing (NLP) techniques, including tailored entity recognition and ontology mapping, the framework interprets user intents and translates them into specific GIS operations. Integration of geospatial ontologies enriches semantic comprehension, ensuring precise alignment between user descriptions, geospatial datasets, and geospatial analysis tasks. A code generation module powered by Llama 2 converts these interpretations into PyQGIS scripts, enabling the execution of geospatial analysis and the visualization of results. Rigorous testing across a spectrum of geospatial analysis tasks of increasing complexity evaluates the framework and the performance of such a system with an LLM at its core. The proposed system demonstrates proficiency in handling various geometries, spatial relationships, and attribute queries, enabling accurate and efficient analysis of spatial datasets. Moreover, it offers robust error-handling mechanisms and supports tasks related to map styling, visualization, and data manipulation. However, it has some limitations, such as occasional struggles with ambiguous attribute names and aliases, which lead to potential inaccuracies in the filtering and retrieval of features. Despite these limitations, the system presents a promising solution for applications integrating LLMs into GIS and offers a flexible and user-friendly approach to geospatial analysis.
Figures:
Figure 1. The proposed system architecture for ChatGeoAI.
Figure 2. Workflow of the implementation of the proposed system.
Figure 3. User interface components for ChatGeoAI.
Figure 4. Training and validation loss curves of Llama 2 model during the fine-tuning.
Figure 5. Results on the desktop application show the map with pharmacies within 1000 m of the Grand Hotel.
Figure 6. Map showing the pharmacies within 1000 m of the Grand Hotel using a GIS tool (i.e., QGIS).
Figure 7. Shortest path between Lund Cathedral and Monumentet.
Figure 8. Shortest path, obtained using QGIS, between Lund Cathedral and Monumentet.
Figure 9. Recommended hotels based on the user’s query.
Figure 10. Maps displaying hotels meeting the user’s requirements.
Figure 11. Playground distribution in Lund.
Figure 12. No results were returned due to AttributeError.
Figure 13. Playground areas in Lund parks.
Figure A1. Parks which are larger than 50,000 sq. m and have more than 100 buildings within a 300 m radius of them.
Figure A2. Schools located within a 1-km radius of the hospital Skånes Universitetssjukhus in Lund.
Figure A3. Restaurants displayed in response to a user who wants to eat in the centre of Lund.
19 pages, 500 KiB  
Article
Comparative Analysis of Large Language Models in Chinese Medical Named Entity Recognition
by Zhichao Zhu, Qing Zhao, Jianjiang Li, Yanhu Ge, Xingjian Ding, Tao Gu, Jingchen Zou, Sirui Lv, Sheng Wang and Ji-Jiang Yang
Bioengineering 2024, 11(10), 982; https://doi.org/10.3390/bioengineering11100982 - 29 Sep 2024
Viewed by 295
Abstract
The emergence of large language models (LLMs) has provided robust support for application tasks across various domains, such as named entity recognition (NER) in the general domain. However, due to the particularity of the medical domain, research on understanding and improving the effectiveness of LLMs on biomedical named entity recognition (BNER) tasks remains relatively limited, especially for Chinese text. In this study, we extensively evaluate several typical LLMs, including ChatGLM2-6B, GLM-130B, GPT-3.5, and GPT-4, on the Chinese BNER task using a real-world Chinese electronic medical record (EMR) dataset and a public dataset. The experimental results demonstrate the promising yet limited performance of LLMs with zero-shot and few-shot prompt designs for Chinese BNER tasks. More importantly, instruction fine-tuning significantly enhances the performance of LLMs. The fine-tuned offline ChatGLM2-6B surpassed the task-specific model BiLSTM+CRF (BC) on the real-world dataset. The best fine-tuned model, GPT-3.5, outperforms all other LLMs on the publicly available CCKS2017 dataset and even surpasses half of the baselines; however, it still struggles to surpass the state-of-the-art task-specific models, i.e., the Dictionary-guided Attention Network (DGAN). To our knowledge, this study is the first attempt to evaluate the performance of LLMs on Chinese BNER tasks, and it emphasizes the prospective and transformative implications of using LLMs for these tasks. Furthermore, we summarize our findings into a set of actionable guidelines for future researchers on how to effectively leverage LLMs to become experts in specific tasks.
Figures:
Figure 1. The performance of LLMs with the different sizes of the PCHD dataset.
Figure 2. The performance of LLMs with the different sizes of the CCKS2017 dataset.
24 pages, 1240 KiB  
Article
Hospital Re-Admission Prediction Using Named Entity Recognition and Explainable Machine Learning
by Safaa Dafrallah and Moulay A. Akhloufi
Diagnostics 2024, 14(19), 2151; https://doi.org/10.3390/diagnostics14192151 - 27 Sep 2024
Viewed by 240
Abstract
Early hospital readmission refers to the unplanned emergency admission of patients within 30 days of discharge. Predicting early readmission risk before discharge can help to reduce the cost of readmissions for hospitals and decrease the death rate for Intensive Care Unit patients. In this paper, we propose a novel approach for predicting unplanned hospital readmissions using discharge notes from the MIMIC-III database. The approach first extracts relevant information from clinical reports using a pretrained Named Entity Recognition model called BioMedical-NER, which is built on the Bidirectional Encoder Representations from Transformers architecture; the extracted features are then used to train machine learning models to predict unplanned readmissions. Our proposed approach achieves better results on clinical reports than the state-of-the-art methods, with an average precision of 88.4% achieved by the Gradient Boosting algorithm. In addition, explainable Artificial Intelligence techniques are applied to provide deeper comprehension of the predictive results.
Figures:
Figure 1. Histogram of unplanned readmissions over 365 days.
Figure 2. Top ten diagnoses for all admissions.
Figure 3. Top 25% primary diagnoses of readmitted patients.
Figure 4. Top 25% of post-readmission diagnoses.
Figure 5. Diagnoses with the highest readmission rate.
Figure 6. Diagnoses with the highest death rate.
Figure 7. Diagnoses with the highest death rate after readmission.
Figure 8. Preprocessing applied to the MIMIC dataset.
Figure 9. Top ten diagnoses in the “Readmitted” class.
Figure 10. Top ten diagnoses in the “Not readmitted” class.
Figure 11. Preprocessing of clinical texts using NLP techniques.
Figure 12. Negation handling using the NegspaCy model: (a) before customization and (b) after customization.
Figure 13. Precision-recall curve of ML models trained on all features: (a) KNN, (b) Decision Tree, (c) Random Forest, (d) Gradient Boosting, and (e) XGBoost.
Figure 14. Confusion matrix of the Gradient Boosting model trained on all features.
Figure 15. Comparison of the Receiver Operating Characteristic (ROC) curves of the ML models using all features.
Figure 16. Learning curves for ML models trained on all features: (a) KNN, (b) Decision Tree, (c) Random Forest, (d) Random Forest with hyperparameter tuning, (e) Gradient Boosting, and (f) XGBoost.
Figure 17. Feature importance using SHAP.
Figure 18. Impact of the “Pain” and “Infarct” features on the prediction.
Figure 19. Local explanation for class readmission of correct positive and negative predictions: (a) local explanation for class readmission of correct positive prediction and (b) local explanation for class readmission of correct negative prediction.
16 pages, 886 KiB  
Article
Exploring the Potential of Neural Machine Translation for Cross-Language Clinical Natural Language Processing (NLP) Resource Generation through Annotation Projection
by Jan Rodríguez-Miret, Eulàlia Farré-Maduell, Salvador Lima-López, Laura Vigil, Vicent Briva-Iglesias and Martin Krallinger
Information 2024, 15(10), 585; https://doi.org/10.3390/info15100585 - 25 Sep 2024
Viewed by 650
Abstract
Recent advancements in neural machine translation (NMT) offer promising potential for generating cross-language clinical natural language processing (NLP) resources. There is a pressing need to foster the development of clinical NLP tools that extract key clinical entities in a comparable way across the many medical application scenarios that are hindered by a lack of multilingual annotated data. This study explores the efficacy of using NMT and annotation projection techniques with expert-in-the-loop validation to develop named entity recognition (NER) systems for an under-resourced target language (Catalan) by leveraging Spanish clinical corpora annotated by domain experts. We employed a state-of-the-art NMT system to translate three clinical case corpora. The translated annotations were then projected onto the target language texts and subsequently validated and corrected by clinical domain experts. The efficacy of the resulting NER systems was evaluated against manually annotated test sets in the target language. Our findings indicate that this approach not only facilitates the generation of high-quality training data for the target language (Catalan) but also demonstrates the potential to extend this methodology to other languages, thereby enhancing multilingual clinical NLP resource development. The generated corpora and components are publicly accessible, potentially providing a valuable resource for further research and application in multilingual clinical settings.
(This article belongs to the Special Issue Machine Translation for Conquering Language Barriers)
Figures:
Figure 1. Overview of the clinical case report translation and entity annotation projection approach.
Figure 2. Example screenshot of the brat interface side-by-side view used for the clinical expert validation and correction of entities. By double-clicking on each annotation’s label name, annotators were able to provide alternative translations using the Notes section. The text translation in English would read “47-year-old woman with a history of glycogenosis type V (McArdle’s disease), Chiari malformation, chronic renal failure without filiation, retinal atrophy and arterial hypertension. A peritoneal catheter was implanted on 19 May 2000 in another centre, and dialysis with a 2-L infusion was started 15 days later”.
23 pages, 7374 KiB  
Article
A Chinese Nested Named Entity Recognition Model for Chicken Disease Based on Multiple Fine-Grained Feature Fusion and Efficient Global Pointer
by Xiajun Wang, Cheng Peng, Qifeng Li, Qinyang Yu, Liqun Lin, Pingping Li, Ronghua Gao, Wenbiao Wu, Ruixiang Jiang, Ligen Yu, Luyu Ding and Lei Zhu
Appl. Sci. 2024, 14(18), 8495; https://doi.org/10.3390/app14188495 - 20 Sep 2024
Viewed by 556
Abstract
Extracting entities from large volumes of chicken epidemic texts is crucial for knowledge sharing, integration, and application. However, named entity recognition (NER) encounters significant challenges in this domain, particularly due to the prevalence of nested entities and domain-specific named entities, coupled with a scarcity of labeled data. To address these challenges, we compiled a corpus from 50 books on chicken diseases, covering 28 different disease types. Using this corpus, we constructed the CDNER dataset and developed a nested NER model, MFGFF-BiLSTM-EGP. This model integrates the multiple fine-grained feature fusion (MFGFF) module with a BiLSTM neural network and employs an efficient global pointer (EGP) to predict the entity location encoding. In the MFGFF module, we designed three encoders: a character encoder, a word encoder, and a sentence encoder. This design effectively captured fine-grained features and improved the recognition accuracy of nested entities. Experimental results showed that the model performed robustly, with F1 scores of 91.98%, 73.32%, and 82.54% on the CDNER, CMeEE V2, and CLUENER datasets, respectively, outperforming other commonly used NER models. Specifically, on the CDNER dataset, the model achieved an F1 score of 79.68% for nested entity recognition. This research not only advances the development of a knowledge graph and intelligent question-answering system for chicken diseases, but also provides a viable solution for extracting disease information that can be applied to other livestock species.
(This article belongs to the Special Issue Applied Intelligence in Natural Language Processing)
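The abstract above relies on a global-pointer style decoder to handle nested entities. As a rough illustration of why span-level scoring permits nesting, here is a minimal sketch that scores every (start, end) pair with a dot product and keeps the spans scoring above a threshold. It assumes a single entity type and hand-picked toy vectors; the functions `span_scores` and `decode_spans` are hypothetical names, and the sketch does not reproduce the parameterization of the efficient global pointer or the MFGFF encoders used in the paper.

```python
def span_scores(queries, keys):
    """Score each candidate span (i, j) with i <= j as dot(queries[i], keys[j]).

    queries[i] represents token i as a possible span start,
    keys[j] represents token j as a possible span end.
    Cells with i > j are left as None because they are not valid spans.
    """
    n = len(queries)
    scores = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            scores[i][j] = sum(q * k for q, k in zip(queries[i], keys[j]))
    return scores


def decode_spans(scores, threshold=0.0):
    """Keep every valid span whose score exceeds the threshold."""
    return [
        (i, j)
        for i, row in enumerate(scores)
        for j, s in enumerate(row)
        if s is not None and s > threshold
    ]


# Toy start/end vectors for a 4-token sentence (values are made up)
queries = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
keys = [[-1.0, -1.0], [-1.0, -1.0], [-1.0, -1.0], [1.0, 1.0]]

spans = decode_spans(span_scores(queries, keys))
print(spans)
```

Because each span is accepted or rejected independently, the inner span (2, 3) and the outer span (0, 3) can both be predicted, which a single BIO label sequence cannot express.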
Figures:
Figure 1. An example of nested entity annotation, with different colored lines representing different entities.
Figure 2. MFGFF-BiLSTM-EGP model framework.
Figure 3. Character encoder framework.
Figure 4. Word encoder framework.
Figure 5. Sentence encoder framework.
Figure 6. An example of EGP prediction for nested entities, where the end position of the entity part of the label is coded 1, and the non-entity part is coded 0.
Figure 7. Radar plots of the entity-level assessment results of the MFGFF-BiLSTM-EGP model on three datasets including precision, recall, and F1.
Figure 8. Visualization of the confusion matrix, where the ‘Other’ category represents missing classification.
Figure 9. Visualization of token representations in feature space using t-SNE for data dimensionality reduction, with different colored points representing labeled entities of different types including word vector features and MFGFF features visualized on three datasets.
Figure 10. P, R, and F1 evaluation results of the fine-tuning method and MFGFF method on the CDNER and CMeEE V2 datasets.
Figure 11. Comparison of the evaluation results of 5 pre-trained models under the MFGFF-BiLSTM-EGP model.
13 pages, 787 KiB  
Article
Chinese Medical Named Entity Recognition Based on Context-Dependent Perception and Novel Memory Units
by Yufeng Kang, Yang Yan and Wenbo Huang
Appl. Sci. 2024, 14(18), 8471; https://doi.org/10.3390/app14188471 - 20 Sep 2024
Viewed by 278
Abstract
Medical named entity recognition (NER) focuses on extracting and classifying key entities from medical texts. Through automated medical information extraction, NER can improve the efficiency of electronic medical record analysis, medical literature retrieval, and intelligent medical question-answering systems, enabling doctors and researchers to obtain the required medical information more quickly and thereby helping to improve the accuracy of diagnosis and treatment decisions. Current methods are limited in how they handle contextual dependencies and entity memory, and they fail to fully consider the contextual relevance and interactivity between entities. To address these issues, this paper proposes a Chinese medical named entity recognition model that combines contextual dependency perception and a new memory unit. The model combines the BERT pre-trained model with a new memory unit (GLMU) and a recall network (RMN). The GLMU can efficiently capture long-distance dependencies, while the RMN enhances multi-level semantic information processing. The model also incorporates fully connected layers (FC) and conditional random fields (CRF) to further optimize the performance of entity classification and sequence labeling. The experimental results show that the model achieved F1 values of 91.53% and 64.92% on the Chinese medical datasets MCSCSet and CMeEE, respectively, surpassing other related models and demonstrating significant advantages in the field of medical entity recognition.
(This article belongs to the Section Computing and Artificial Intelligence)
Figures:
Figure 1. Overview framework of our proposed model.
Figure 2. Comparison of ablation experiment results.
14 pages, 3457 KiB  
Article
Named Entity Recognition for Crop Diseases and Pests Based on Gated Fusion Unit and Manhattan Attention
by Wentao Tang, Xianhuan Wen and Zelin Hu
Agriculture 2024, 14(9), 1565; https://doi.org/10.3390/agriculture14091565 - 10 Sep 2024
Viewed by 440
Abstract
Named entity recognition (NER) is a crucial step in building knowledge graphs for crop diseases and pests. To enhance NER accuracy, we propose a new NER model, GatedMan, based on the gated fusion unit and Manhattan attention. GatedMan uses RoBERTa as a pre-trained model and enhances it with bidirectional long short-term memory (BiLSTM) to extract features from the context. It uses a gated unit to perform weighted fusion between the outputs of RoBERTa and BiLSTM, thereby enriching the information flow. The fused output is then fed into a novel Manhattan attention mechanism to capture long-range dependencies. The globally optimal tagging sequence is obtained using a conditional random fields layer. To enhance the model’s robustness, we incorporate adversarial training using the fast gradient method. This introduces adversarial examples, allowing the model to learn more disturbance-resistant feature representations and thereby improving its performance on unseen inputs. GatedMan achieved F1 scores of 93.73%, 94.13%, 93.98%, and 96.52% on the AgCNER, Peoples_daily, MSRA, and Resume datasets, respectively, thereby outperforming the other models. Experimental results demonstrate that GatedMan accurately identifies entities related to crop diseases and pests and exhibits high generalizability in other domains.
(This article belongs to the Section Digital Agriculture)
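The abstract above describes a gated unit that performs weighted fusion between the RoBERTa and BiLSTM outputs. One common way to realize such a gate is an element-wise sigmoid gate computed from the concatenated inputs; the sketch below shows that generic form for a single token. The exact formulation used in GatedMan is not given in the abstract, so the function `gated_fusion`, its weight layout, and the example values are assumptions for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(a, b, weights, bias):
    """Fuse two same-length feature vectors with an element-wise gate.

    For each dimension d:
        gate_d  = sigmoid(weights[d] . concat(a, b) + bias[d])
        fused_d = gate_d * a[d] + (1 - gate_d) * b[d]
    weights has one row per output dimension, each of length len(a) + len(b).
    """
    concat = a + b
    fused = []
    for d in range(len(a)):
        gate = sigmoid(sum(w * x for w, x in zip(weights[d], concat)) + bias[d])
        fused.append(gate * a[d] + (1.0 - gate) * b[d])
    return fused


# Stand-ins for one token's encoder output and BiLSTM output (made-up values)
encoder_out = [1.0, 2.0]
bilstm_out = [3.0, 6.0]

# With zero weights and bias the gate is exactly 0.5, so the result is the mean
zero_weights = [[0.0] * 4 for _ in range(2)]
balanced = gated_fusion(encoder_out, bilstm_out, zero_weights, [0.0, 0.0])

# A large positive bias pushes the gate toward 1, favoring the first input
mostly_encoder = gated_fusion(encoder_out, bilstm_out, zero_weights, [20.0, 20.0])
print(balanced, mostly_encoder)
```

In a trained model the weights and bias would be learned, letting the gate decide per dimension how much of each representation to pass on.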
Figures:
Figure 1. GatedMan model structure.
Figure 2. RoBERTa model structure.
Figure 3. LSTM structure.
Figure 4. Gated fusion unit structure.
Figure 5. Manhattan attention structure.
Figure 6. Comparison of different model performances.
Figure 7. Comparison of different evaluation mechanisms.
Figure 8. Comparison of different feature fusion methods.
Figure 9. Ablation study.
Figure 10. Results of entity recognition.
14 pages, 505 KiB  
Article
Few-Shot Learning Sensitive Recognition Method Based on Prototypical Network
by Guoquan Yuan, Xinjian Zhao, Liu Li, Song Zhang and Shanming Wei
Mathematics 2024, 12(17), 2791; https://doi.org/10.3390/math12172791 - 9 Sep 2024
Viewed by 442
Abstract
Traditional machine learning-based entity extraction methods rely heavily on feature engineering by experts, and the resulting models generalize poorly. Prototype networks, on the other hand, can train models effectively from a small amount of labeled data while using category prototypes to enhance generalization. Therefore, this paper proposes a prototype network-based named entity recognition (NER) method, the FSPN-NER model, to address the difficulty of recognizing sensitive data in data-sparse text. The model uses the positional coding model (PCM) for pre-training and feature extraction, then computes prototype vectors to achieve entity matching, and finally introduces a boundary detection module to enhance the performance of the prototype network on the named entity recognition task. The model is compared with LSTM, BiLSTM, CRF, Transformer, and their combinations, and the experimental results on the test dataset show that it outperforms the comparison models with an accuracy of 84.8%, a recall of 85.8%, and an F1 value of 0.853.
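The abstract above matches entities by computing prototype vectors. In the standard prototypical-network formulation, a class prototype is the mean of that class's support embeddings, and a query is assigned to the nearest prototype. The sketch below shows only that step on made-up 2-dimensional embeddings; the label names `PHONE` and `ID_NUMBER` and the helper functions are hypothetical, and the paper's PCM encoder and boundary detection module are not reproduced.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_prototypes(support):
    """support maps each label to a list of embedding vectors."""
    return {label: mean_vector(vectors) for label, vectors in support.items()}


def classify(query, prototypes):
    """Assign the query embedding to the label of the nearest prototype."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Two labeled examples per class stand in for a few-shot support set
support = {
    "PHONE": [[1.0, 0.0], [0.8, 0.2]],
    "ID_NUMBER": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support)

prediction = classify([0.7, 0.3], prototypes)
print(prediction)
```

Adding a new entity class only requires a handful of support embeddings to form its prototype, which is why this approach suits data-sparse settings.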
Figures:
Figure 1. The overall architecture of the model.
Figure 2. Effect of training sample size on F1 scores.
Figure 3. Comparing the performance of word vectors with different dimensions.
24 pages, 4734 KiB  
Article
A Benchmark Evaluation of Multilingual Large Language Models for Arabic Cross-Lingual Named-Entity Recognition
by Mashael Al-Duwais, Hend Al-Khalifa and Abdulmalik Al-Salman
Electronics 2024, 13(17), 3574; https://doi.org/10.3390/electronics13173574 - 9 Sep 2024
Viewed by 634
Abstract
Multilingual large language models (MLLMs) have demonstrated remarkable performance across a wide range of cross-lingual Natural Language Processing (NLP) tasks. The emergence of MLLMs has made it possible to transfer knowledge from high-resource to low-resource languages. Several MLLMs have been released for cross-lingual transfer tasks. However, no systematic evaluation comparing all models for Arabic cross-lingual Named-Entity Recognition (NER) is available. This paper presents a benchmark evaluation that empirically investigates the performance of state-of-the-art multilingual large language models for Arabic cross-lingual NER. Furthermore, we investigate the performance of different MLLM adaptation methods to better model the Arabic language, and we present an error analysis of these methods. Our experimental results indicate that GigaBERT outperforms other models for Arabic cross-lingual NER, while language-adaptive pre-training (LAPT) proves to be the most effective adaptation method across all datasets. Our findings highlight the importance of incorporating language-specific knowledge to enhance performance in distant language pairs like English and Arabic.
Figures:
Figure 1. A taxonomy for transfer learning for NLP. Modified from [20].
Figure 2. Language-adaptive pre-training (LAPT).
Figure 3. GPU memory allocated for (a) LAPT and (b) adapter training.
Figure 4. Zero-shot cross-lingual fine-tuning on CoNLL2003.
Figure 5. Zero-shot cross-lingual performance on WikiANN dataset.
Figure 6. Few-shot transfer performance of (a) mBERT and (b) XLM-R models with varying the number of target language examples.
Figure 7. Performance of different adaptation methods on datasets.
Figure 8. Classification reports for the different adaptation methods implemented. (a) XLM-R base model; (b) XLM-R + LAPT; (c) XLM-R + Adapter; (d) XLM-R + morphology tokenizer.
Figure 9. Confusion matrices for the different adaptation methods implemented. (a) XLM-R base model; (b) XLM-R + LAPT; (c) XLM-R + Adapter; (d) XLM-R + morphology tokenizer.
Figure 10. Examples of errors produced by XLM-R base model on CLEANANERCorp dataset.
Figure 11. Distribution of errors for each model.
Figure 12. Distribution of errors for each error type.
19 pages, 2828 KiB  
Article
KCB-FLAT: Enhancing Chinese Named Entity Recognition with Syntactic Information and Boundary Smoothing Techniques
by Zhenrong Deng, Zheng Huang, Shiwei Wei and Jinglin Zhang
Mathematics 2024, 12(17), 2714; https://doi.org/10.3390/math12172714 - 30 Aug 2024
Viewed by 366
Abstract
Named entity recognition (NER) is a fundamental task in Natural Language Processing (NLP). During training, NER models suffer from overconfidence. Chinese NER in particular involves word segmentation, which introduces erroneous entity boundaries, exacerbating overconfidence and reducing the model’s overall performance. These issues limit further enhancement of NER models. To tackle these problems, we propose a new model named KCB-FLAT, designed to enhance Chinese NER performance by integrating enriched semantic information with the word-Boundary Smoothing technique. Specifically, we first extract various types of syntactic data and encode them with a Key-Value Memory Network, integrating them through an attention mechanism to generate syntactic feature embeddings for Chinese characters. Subsequently, we employ an encoder named Cross-Transformer to thoroughly combine syntactic and lexical information, addressing the entity boundary segmentation errors caused by lexical information. Finally, we introduce a Boundary Smoothing module, combined with a regularity-conscious function, to capture the internal regularity of each entity, reducing the model’s overconfidence in entity probabilities through smoothing. Experimental results demonstrate that the proposed model achieves exceptional performance on the MSRA, Resume, Weibo, and self-built ZJ datasets, as verified by the F1 score.
Figures
Figure 1: The general structure of KCB-FLAT.
Figure 2: The process of extracting syntactic information: POS Labels, syntactic constituents, and 211 dependencies of “科” (science) in the lexical labels.
Figure 3: Encoding Syntactic Information with KVMN.
Figure 4: The Cross-Transformer module.
Figure 5: The Boundary Smoothing module.
Figure 6: LOSS values of MSRA during training.
Figure 7: LOSS values of Resume during training.
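The abstract above says that smoothing reduces the model’s overconfidence in entity probabilities. The following is a minimal sketch of one common way to smooth span boundaries, not the KCB-FLAT module itself: a fraction `epsilon` of the gold span's probability is spread evenly over spans whose start and end lie within a small Manhattan distance of the gold boundaries. The function name, `epsilon`, and `max_dist` are illustrative assumptions.

```python
def smooth_span_targets(start, end, seq_len, epsilon=0.1, max_dist=1):
    """Return soft targets {(s, e): prob} for one gold span (inclusive indices).

    Assumption: neighbours are spans whose boundaries differ from the gold
    span by a Manhattan distance of 1..max_dist; they share `epsilon` evenly.
    """
    neighbours = [
        (s, e)
        for s in range(max(0, start - max_dist), min(seq_len, start + max_dist + 1))
        for e in range(max(0, end - max_dist), min(seq_len, end + max_dist + 1))
        if s <= e and 0 < abs(s - start) + abs(e - end) <= max_dist
    ]
    if not neighbours:
        # Nothing to smooth onto (e.g. a one-token sentence): keep a hard target.
        return {(start, end): 1.0}
    targets = {(start, end): 1.0 - epsilon}
    share = epsilon / len(neighbours)
    for span in neighbours:
        targets[span] = share
    return targets


# Gold entity covers tokens 2..3 in a 6-token sentence.
targets = smooth_span_targets(2, 3, seq_len=6, epsilon=0.2, max_dist=1)
```

Training against these soft targets instead of a one-hot span label penalises the model less for near-miss boundaries, which is the general intuition behind boundary smoothing.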
19 pages, 3640 KiB  
Article
Recognition of Chinese Electronic Medical Records for Rehabilitation Robots: Information Fusion Classification Strategy
by Jiawei Chu, Xiu Kan, Yan Che, Wanqing Song, Kudreyko Aleksey and Zhengyuan Dong
Sensors 2024, 24(17), 5624; https://doi.org/10.3390/s24175624 - 30 Aug 2024
Viewed by 466
Abstract
Named entity recognition is a critical task in the electronic medical record management system for rehabilitation robots. Handwritten documents often contain spelling errors and illegible handwriting, and healthcare professionals frequently use different terminologies. These issues adversely affect the robot’s judgment and precise operations. Additionally, the same entity can have different meanings in various contexts, leading to category inconsistencies, which further increase the system’s complexity. To address these challenges, a novel medical entity recognition algorithm for Chinese electronic medical records is developed to enhance the processing and understanding capabilities of rehabilitation robots for patient data. This algorithm is based on a fusion classification strategy. Specifically, a preprocessing strategy is proposed according to clinical medical knowledge, which includes redefining entities, removing outliers, and eliminating invalid characters. Subsequently, a medical entity recognition model is developed to identify entities in Chinese electronic medical records, thereby enhancing the data analysis capabilities of rehabilitation robots. To extract semantic information, the ALBERT network is utilized, and BiLSTM and MHA networks are combined to capture the dependency relationships between words, overcoming the problem of different meanings for the same entity in different contexts. The CRF network is employed to determine the boundaries of different entities. The research results indicate that the proposed model significantly enhances the recognition accuracy of electronic medical texts by rehabilitation robots, particularly in accurately identifying entities and handling terminology diversity and contextual differences. This model effectively addresses the key challenges faced by rehabilitation robots in processing Chinese electronic medical texts, and holds important theoretical and practical value.
(This article belongs to the Special Issue Dynamics and Control System Design for Robot Manipulation)
Figures
Figure 1: Flowchart of data preprocessing.
Figure 2: Result of data processing.
Figure 3: Result of data processing.
Figure 4: ALBERT-BiLSTM-MHA-CRF framework diagram.
Figure 5: ALBERT-BiLSTM-MHA-CRF model loss and accuracy variation curve.
Figure 6: Loss and accuracy of ablation experiments variation curve.
Figure 7: Ablation experiment results.
Figure 8: Weighted average results of ablation experiments.
Figure 9: Loss and accuracy of comparative experiments variation curve.
Figure 10: P, R, and F1 of five types in comparative experiments.
Figure 11: Cross validation of box plot by comparison experiment.
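The abstract above states that a CRF layer is used to determine entity boundaries. As a rough illustration of how a linear-chain CRF picks a tag sequence at inference time, here is a minimal Viterbi decoding sketch in plain Python. The tag set, the scores, and the function name are made-up assumptions for illustration, not values or code from the paper.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag path and its score.

    emissions:   [T][K] per-token tag scores (e.g. from a BiLSTM encoder).
    transitions: [K][K] score of moving from tag i to tag j.
    """
    num_tags = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for landing on tag j at this position.
            best_i = max(range(num_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    best_last = max(range(num_tags), key=lambda j: scores[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[best_last]


# Hypothetical tags: 0 = O, 1 = B, 2 = I. The O -> I transition is heavily penalised.
transitions = [[0.0, 0.0, -1000.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
emissions = [[1.0, 0.8, 0.0], [0.0, 0.1, 1.0]]
path, score = viterbi_decode(emissions, transitions)
```

In this toy input, picking the best tag per token independently would give O followed by I, which is not a valid entity. The transition penalty steers the decoder to B followed by I instead, which is the kind of label constraint a CRF layer contributes.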
21 pages, 4584 KiB  
Article
CSMNER: A Toponym Entity Recognition Model for Chinese Social Media
by Yuyang Qi, Renjian Zhai, Fang Wu, Jichong Yin, Xianyong Gong, Li Zhu and Haikun Yu
ISPRS Int. J. Geo-Inf. 2024, 13(9), 311; https://doi.org/10.3390/ijgi13090311 - 29 Aug 2024
Viewed by 439
Abstract
In the era of information explosion, Chinese social media has become a repository for massive geographic information; however, its unstructured nature and diverse expressions pose challenges for toponym entity recognition. To address this problem, we propose a Chinese social media named entity recognition (CSMNER) model to improve the accuracy and robustness of toponym recognition in Chinese social media texts. By combining the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model with an improved IDCNN-BiLSTM-CRF (Iterated Dilated Convolutional Neural Network-Bidirectional Long Short-Term Memory-Conditional Random Field) architecture, this study incorporates a boundary extension module to extract the local boundary features and contextual semantic features of the toponym, addressing the recognition challenges posed by noise interference and language expression variability. To verify the effectiveness of the model, experiments were carried out on three datasets: WeiboNER, MSRA, and the Chinese social named entity recognition (CSNER) dataset, a self-built named entity recognition dataset. Compared with the existing models, CSMNER achieves significant performance improvement in toponym recognition tasks.
Figures
Figure 1: Word frequency statistics of the CSNER (Chinese social named entity recognition) dataset generated using the word cloud. The image lists the top 2000 words with the highest frequency of occurrence, including a wide range of words such as personal names, toponyms, names of institutions and organizations, locatives, adverbs of place, nouns, but excluding many meaningless verbs, adjectives, and intonational auxiliaries.
Figure 2: Corpus annotation process.
Figure 3: The architecture of the CSMNER (Chinese social media named entity recognition) model. Inputting the sentence “[CSL] I am at the Palace Museum [SEP]” demonstrates the overall structure of the model.
Figure 4: Comparison between dilated convolution and traditional convolution.
Figure 5: Comparison of ReLU and GELU activation functions.
Figure 6: Structure diagram of the toponym BE module.
Figure 7: Statistical graphs of labeled entity information for each part of the experimental dataset. The three experimental datasets (WeiboNER, MSRA, and CSNER) are each divided into three parts: Test, Dev, and Train.
Figure 8: F1 score performance of different models on CSNER dataset.
Figure 9: Loss performance of different models on CSNER dataset.
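The entry above relies on an iterated dilated CNN (IDCNN) and contrasts dilated with traditional convolution. The sketch below shows the underlying idea in plain Python under simple assumptions (1D input, no padding, stride 1). The function names and the dilation schedule are illustrative and are not taken from the CSMNER paper.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D cross-correlation with `dilation` steps between kernel taps."""
    span = (len(kernel) - 1) * dilation + 1  # input positions covered by one output
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of a stack of stride-1 convolution layers with the given dilations."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


signal = [1, 2, 3, 4, 5]
plain = dilated_conv1d(signal, [1, 1, 1], dilation=1)   # taps at adjacent positions
dilated = dilated_conv1d(signal, [1, 1, 1], dilation=2)  # taps at every second position
```

With the same three-tap kernel, three plain layers cover 7 input positions, while dilations of 1, 2, and 4 cover 15. This wider context at the same parameter count is the usual motivation for dilated convolutions in sequence labeling.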
16 pages, 956 KiB  
Article
Assessing Fine-Tuned NER Models with Limited Data in French: Automating Detection of New Technologies, Technological Domains, and Startup Names in Renewable Energy
by Connor MacLean and Denis Cavallucci
Mach. Learn. Knowl. Extr. 2024, 6(3), 1953-1968; https://doi.org/10.3390/make6030096 - 27 Aug 2024
Viewed by 1429
Abstract
Achieving carbon neutrality by 2050 requires unprecedented technological, economic, and sociological changes. With time as a scarce resource, it is crucial to base decisions on relevant facts and information to avoid misdirection. This study aims to help decision makers quickly find relevant information related to companies and organizations in the renewable energy sector. In this study, we propose fine-tuning five RNN and transformer models trained for French on a new category, “TECH”. This category is used to classify technological domains and new products. In addition, as the model is fine-tuned on news related to startups, we note an improvement in the detection of startup and company names in the “ORG” category. We further explore the capacity of the most effective model to accurately predict entities using a small amount of training data. We show the progression of the model as it is trained on several hundred to several thousand annotations. This analysis allows us to demonstrate the potential of these models to extract insights without large corpora, reducing the long process of annotating custom training data. This approach is used to automatically extract new company mentions as well as technologies and technology domains that are currently being discussed in the news, in order to better analyze industry trends. This approach further allows mentions of specific energy domains to be grouped with the companies that are actively developing new technologies in the field.
Figures
Figure 1: Pipeline.
Figure 2: spaCy’s language processing pipeline [3].
Figure 3: Training spaCy’s included models [3].
Figure 4: Correct annotations outside of training data predicted by the trained model.
Figure 5: Energy domains correctly annotated by the model.
Figure 6: Co-occurrence of organizations and technological domains in the same article.
16 pages, 3062 KiB  
Article
A Method for Extracting Fine-Grained Knowledge of the Wheat Production Chain
by Jing Lu, Wanxia Yang, Liang He, Quan Feng, Tingwei Zhang and Seng Yang
Agronomy 2024, 14(9), 1903; https://doi.org/10.3390/agronomy14091903 - 25 Aug 2024
Viewed by 517
Abstract
The knowledge within wheat production chain data has multiple levels and complex semantic relationships, making it difficult to extract. Therefore, this paper proposes a fine-grained knowledge extraction method for the wheat production chain based on ontology. For the first time, the conceptual layers of ploughing, planting, managing, and harvesting were defined around the main agricultural activities of the wheat production chain. Based on this, the entities, relationships, and attributes in the conceptual layers were defined at a fine-grained level, and a spatial–temporal association pattern layer with four conceptual layers, twenty-eight entities, and forty-two relationships was constructed. Then, based on the characteristics of the self-constructed dataset, the Word2vec-BiLSTM-CRF model was designed to extract the knowledge within it (entities, relationships, and attributes), and was compared with four SOTA models. The results show that the accuracy and F1 value improved by 8.44% and 8.89%, respectively, compared with the BiLSTM-CRF model. Furthermore, the entities of the pest and disease dataset were divided into two different granularities for a comparison experiment; the results show that for the “disease names” and “pest names” entities, the recognition accuracy at the fine-grained level is improved by 32.71% and 31.58%, respectively, compared to the coarse-grained level, and the recognition performance of various fine-grained entities has been improved.
(This article belongs to the Section Precision and Digital Agriculture)
Figures
Figure 1: Flow of knowledge graph construction for fine-grained wheat production chain. (Blue circles represent properties and orange circles represent entities).
Figure 2: Flowchart of ontology construction.
Figure 3: (a) Classification of entity types in the wheat production chain. (b) Screenshot of Protégé part. ((b) in Figure 3 is the ontology diagram example drawn by Protégé software version 5.5.0. The first layer is wheat, the second layer is wheat-related entities, and the third layer is the attributes separated by one of the entities).
Figure 4: Wheat production chain pattern layer.
Figure 5: Structure of Word2vec-BiLSTM-CRF model (Dis and Alt are abbreviations of the following labels: Dis: disease name; Alt: alternative name. The definition of “小麦全蚀病又称立枯病” is “Wheat take-all, also known as standing blight”).
Figure 6: Examples of annotation; Dis, Alt, Eit, Har, and Sym are the abbreviations of the labels: (a) example of BIO annotation; (b) example of BMES annotation; (c) example of BIOES annotation. (Dis: disease name; Alt: alternative name; Eit: etiology; Har: harmful parts; Sym: symptoms. The definition of “小麦全蚀病又称立枯病,全蚀病菌土壤寄居菌,病叶矮小” is “Wheat take-all, also known as standing blight, total erosion bacteria soil-dwelling bacteria, disease leaves dwarf”).
Figure 7: Comparison of different model experiments.
Figure 8: Named entity recognition results: (a) fine-grained ontology; (b) coarse-grained ontology.
Figure 9: Knowledge graph of parts of the wheat production chain. (The red circle represents the pest name in the production chain, the green circle represents the attribute value in the pest entity, the horizontal line represents the attribute, and the arrow indicates the direction).
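The Figure 6 caption of the wheat production chain entry above mentions BIO, BMES, and BIOES annotation schemes. As a small illustration of how these schemes relate, the sketch below converts a BIO tag sequence to BIOES, assuming well-formed BIO input. The function name is hypothetical, and the label names `Dis` and `Alt` are borrowed from the caption purely as example labels.

```python
def bio_to_bioes(tags):
    """Convert well-formed BIO tags to BIOES (S = single-token entity, E = entity end)."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == f"I-{label}"  # does the same entity go on?
        if prefix == "B":
            bioes.append(f"B-{label}" if continues else f"S-{label}")
        else:  # prefix == "I"
            bioes.append(f"I-{label}" if continues else f"E-{label}")
    return bioes


converted = bio_to_bioes(["B-Dis", "I-Dis", "I-Dis", "O", "B-Alt", "O"])
```

BIOES marks entity ends and single-token entities explicitly, so a sequence labeler receives a direct signal about where each entity stops rather than having to infer it from the following tag.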