Search Results (352)

Search Parameters:
Keywords = named entity recognition

18 pages, 9740 KiB  
Article
Construction of a Geological Fault Corpus and Named Entity Recognition
by Huainuo Wang, Ruiqing Niu, Yongyao Han and Qinglu Deng
Appl. Sci. 2025, 15(5), 2465; https://doi.org/10.3390/app15052465 - 25 Feb 2025
Abstract
The rapid and effective extraction of fault entities is a fundamental process in constructing a fault knowledge graph. As a key method for recording and preserving fault data, a fault investigation report holds significant potential for extracting valuable information. This paper proposes a fault knowledge annotation system that incorporates geographic information, fault attribute, fault structure, fault activity, fault geomorphology, and fault hazard. The system is developed based on a comprehensive analysis of the textual characteristics of fault investigation reports. Additionally, we establish a fine-grained corpus tailored for this task and apply a combination of BERT and BiLSTM-CRF for named entity recognition in the fault domain. We compare the performance of our model with a non-pre-training baseline model. The experimental results demonstrate that (1) the F1 value of entity recognition based on the fault corpus exceeds 80%, which validates the efficacy of the corpus; (2) the BERT model can effectively use the information available in the corpus to adapt to the subsequent tasks, thus improving the model output; and (3) the proposed BERT-BiLSTM-CRF and ALBERT-BiLSTM-CRF models have superior extraction performance compared with the non-pre-training model. This study not only provides a theoretical basis for the effectiveness of the BERT-BiLSTM-CRF model in fault entity identification, but also establishes a solid data foundation for the subsequent construction of the fault knowledge graph. In addition, it offers reliable technical support for practical application areas such as geological surveys, disaster early warning, and urban planning, thereby promoting the advancement of data-driven research in the field of geology.
(This article belongs to the Section Earth Sciences)
Figures:
Figure 1: Flowchart of fault corpus collection and annotation.
Figure 2: Doccano annotation interface. The red box shows the annotated text of the geological survey report in Chinese.
Figure 3: Knowledge system of the fault corpus.
Figure 4: The proportion of different entities.
Figure 5: Architecture of the BiLSTM-CRF model.
Figure 6: Architecture of the BERT-BiLSTM-CRF model.
Figure 7: BiLSTM model.
Figure 8: Architecture of the ALBERT-BiLSTM-CRF model.
Figure 9: Recognition effect of different entities. (a) The precision of different entities; (b) the recall of different entities; (c) the F1-score of different entities.
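A note on the architecture named in this abstract: BERT-BiLSTM-CRF is a standard sequence-labeling stack in which contextual embeddings feed a bidirectional LSTM whose per-token emission scores are decoded by a CRF. The sketch below is a generic illustration of that shape, not the authors' code; it assumes the `transformers` and `pytorch-crf` packages, and the checkpoint name and tag count are placeholders.

```python
import torch.nn as nn
from transformers import AutoModel
from torchcrf import CRF  # pip install pytorch-crf

class BertBiLstmCrf(nn.Module):
    def __init__(self, num_tags, bert_name="bert-base-chinese", lstm_hidden=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)      # contextual character vectors
        self.bilstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * lstm_hidden, num_tags)      # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)            # learns tag transitions

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        hidden, _ = self.bilstm(hidden)
        emissions = self.emit(hidden)
        mask = attention_mask.bool()
        if tags is not None:  # training: CRF negative log-likelihood
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)  # inference: best tag paths
```

The ALBERT variant mentioned in the abstract would differ only in the encoder loaded by `AutoModel.from_pretrained`.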
24 pages, 3298 KiB  
Article
Construction of an LNG Carrier Port State Control Inspection Knowledge Graph by a Dynamic Knowledge Distillation Method
by Langxiong Gan, Qihao Yang, Yi Xu, Qiongyao Mao and Chengyong Liu
J. Mar. Sci. Eng. 2025, 13(3), 426; https://doi.org/10.3390/jmse13030426 - 25 Feb 2025
Abstract
The Port State Control (PSC) inspection of liquefied natural gas (LNG) carriers is crucial in maritime transportation. PSC inspection requires rapid and accurate identification of defects with limited resources, necessitating professional knowledge and efficient technical methods. Knowledge distillation, as a model lightweighting approach in the field of artificial intelligence, offers the possibility of enhancing the responsiveness of LNG carrier PSC inspections. In this study, a knowledge distillation method is introduced, namely, the multilayer dynamic multi-teacher weighted knowledge distillation (MDMD) model. This model fuses multilayer soft labels from multi-teacher models by extracting intermediate feature soft labels and minimizing intermediate feature knowledge fusion. It also employs a comprehensive dynamic weight allocation scheme that combines global loss weight allocation with label weight allocation based on the inner product, enabling dynamic weight allocation across multiple teachers. The experimental results show that the MDMD model achieves a 90.6% accuracy rate in named entity recognition, which is 6.3% greater than that of the direct training method. In addition, under the same experimental conditions, the proposed model achieves a prediction speed that is approximately 64% faster than that of traditional models while reducing the number of model parameters by approximately 55%. To efficiently assist in PSC inspections, an LNG carrier PSC inspection knowledge graph is constructed on the basis of the recognition results to quickly and effectively support knowledge queries and assist PSC personnel in making decisions at inspection sites.
(This article belongs to the Section Ocean Engineering)
Figures:
Figure 1: Multilayer soft label knowledge fusion.
Figure 2: Architecture of the proposed knowledge distillation model.
Figure 3: Results for different models.
Figure 4: Storage of the results as triples in Neo4j (partial).
Figure 5: F1 score for different labels in the test dataset.
Figure 6: Quantity of different labels in the test dataset.
Figure 7: Results for hyperparameter sensitivity in distillation models.
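The core of the MDMD idea, as described, is weighting several teachers dynamically when building the distillation target. The snippet below is a generic sketch of one way inner-product-based teacher weighting can work, under assumptions of my own (scalar per-teacher weights from feature similarity, a temperature-scaled KL loss); the paper's exact scheme may differ.

```python
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits, student_feat,
                          teacher_feats, temperature=2.0):
    # Weight each teacher by the softmaxed inner product between its
    # intermediate features and the student's (one scalar per teacher).
    sims = torch.stack([(student_feat * tf).sum(dim=-1).mean()
                        for tf in teacher_feats])
    weights = F.softmax(sims, dim=0)
    # Distillation target: weighted mixture of teacher soft labels.
    mixed = sum(w * F.softmax(t / temperature, dim=-1)
                for w, t in zip(weights, teacher_logits))
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_student, mixed, reduction="batchmean") * temperature ** 2
```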
15 pages, 2169 KiB  
Article
Named Entity Recognition in the Field of Small Sample Electric Submersible Pump Based on FLAT
by Faming Gong, Siyuan Tong, Chengze Du, Zhenghao Wan and Shiyu Qiu
Appl. Sci. 2025, 15(5), 2359; https://doi.org/10.3390/app15052359 - 22 Feb 2025
Abstract
In special industrial fields such as electric submersible pump (ESP) wells, named entity recognition (NER) often suffers from low accuracy and incomplete entity recognition due to the scarcity of high-quality corpora and the prevalence of rare words and nested entities. To address these issues, this study introduces a character-level convolutional neural network (char-CNN) into the Flat-Lattice Transformer (FLAT) model and constructs nested entity matching rules for the ESP well domain, forming the char-CNN-FLAT-CRF model. This model achieves NER in the low-resource context of ESP wells. Through multiple experiments, the char-CNN-FLAT-CRF model demonstrates superior performance in this NER task compared to mainstream models and shows good recognition capabilities for rare words and nested entities. This research provides a methodological and conceptual reference for NER in other industrial fields that lack sufficient high-quality corpora.
Figures:
Figure 1: The architecture of char-CNN.
Figure 2: Structure of the FLAT layer.
Figure 3: Flowchart of nested entity matching.
Figure 4: The accuracy and loss waveforms of multiple models on the training and validation sets: (a) training accuracy; (b) validation accuracy; (c) training loss; (d) validation loss.
Figure 5: Comparison experiment results for rare words and nested entities: accuracy in recognizing (a) system entities, (b) component entities, (c) fault symptom entities, and (d) fault entities.
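The char-CNN component added to FLAT here is, in general form, a bank of 1-D convolutions over character embeddings followed by max-pooling. A minimal sketch of such a module (an assumption about its general shape, not the paper's configuration):

```python
import torch
import torch.nn as nn

class CharCNN(nn.Module):
    """Embed characters, convolve with several filter widths, and max-pool
    each feature map into a fixed-size vector."""
    def __init__(self, vocab_size, emb_dim=64, num_filters=32, widths=(2, 3, 4)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, num_filters, kernel_size=w, padding=w // 2)
            for w in widths)

    def forward(self, char_ids):                   # (batch, num_chars)
        x = self.embed(char_ids).transpose(1, 2)   # (batch, emb_dim, num_chars)
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return torch.cat(pooled, dim=-1)           # (batch, num_filters * len(widths))
```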
20 pages, 1878 KiB  
Article
Research and Construction of Knowledge Map of Golden Pomfret Based on LA-CANER Model
by Xiaohong Peng, Hongbin Jiang, Jing Chen, Mingxin Liu and Xiao Chen
J. Mar. Sci. Eng. 2025, 13(3), 400; https://doi.org/10.3390/jmse13030400 - 21 Feb 2025
Abstract
To address the issues of fragmented species information, low knowledge extraction efficiency, and insufficient utilization in the aquaculture domain, the main objective of this study is to construct the first knowledge graph for the Golden Pomfret aquaculture field and optimize the named entity recognition (NER) methods used in the construction process. The dataset contains challenges such as long text processing, strong local context dependencies, and entity sample imbalance, which result in low information extraction efficiency, recognition errors or omissions, and weak model generalization. This paper proposes a novel named entity recognition model, LA-CANER (Local Attention-Category Awareness NER), which combines local attention mechanisms with category awareness to improve both the accuracy and speed of NER. The constructed knowledge graph provides significant scientific knowledge support to Golden Pomfret aquaculture workers. First, by integrating and standardizing multi-source information, the knowledge graph offers comprehensive and accurate data, supporting decision-making for aquaculture management. The graph enables precise reasoning based on disease symptoms, environmental factors, and historical production data, helping workers identify potential risks early and take preventive actions. Furthermore, the knowledge graph can be integrated with large models like GPT-4 and DeepSeek-R1. By providing structured knowledge and rules, the graph enhances the reasoning and decision-making capabilities of these models. This promotes the application of smart aquaculture technologies and enables precision farming, ultimately increasing overall industry efficiency.
(This article belongs to the Section Marine Aquaculture)
Figures:
Figure 1: Attributes of the Golden Pomfret.
Figure 2: Ontology of the Golden Pomfret.
Figure 3: Partial knowledge graph of the Golden Pomfret.
Figure 4: Systematic framework for the construction and application of the Golden Pomfret knowledge graph.
Figure 5: Distribution of entity label counts.
Figure 6: Variation of F1 score and accuracy on the test set under different window sizes.
Figure 7: F1 scores for entity recognition by different models.
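"Local attention" in the LA-CANER description usually amounts to masking attention scores outside a fixed window around each token. A small self-contained sketch of that mechanism (window size and shapes are arbitrary here, not taken from the paper):

```python
import torch

def local_attention_mask(seq_len, window):
    """True where key j is within +/- window of query i; False elsewhere."""
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() <= window

# Apply to raw attention scores before the softmax.
scores = torch.randn(1, 8, 8)                       # (batch, query, key)
mask = local_attention_mask(8, window=2)
scores = scores.masked_fill(~mask, float("-inf"))
attn = torch.softmax(scores, dim=-1)                # each token attends only locally
```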
26 pages, 6629 KiB  
Article
Named Entity Recognition in Track Circuits Based on Multi-Granularity Fusion and Multi-Scale Retention Mechanism
by Yanrui Chen, Guangwu Chen and Peng Li
Electronics 2025, 14(5), 828; https://doi.org/10.3390/electronics14050828 - 20 Feb 2025
Abstract
To enhance the efficiency of reusing massive unstructured operation and maintenance (O&M) data generated during routine railway maintenance inspections, this paper proposes a Named Entity Recognition (NER) method that integrates multi-granularity semantics and a Multi-Scale Retention (MSR) mechanism. The proposed approach effectively transforms expert knowledge extracted from manually processed fault data into structured triplet information, enabling the in-depth mining of track circuit O&M text data. Given the specific characteristics of railway domain texts, which include a high prevalence of technical terms, ambiguous entity boundaries, and complex semantics, we first construct a domain-specific lexicon stored in a Trie tree structure. A lexicon adapter is then introduced to incorporate these terms as external knowledge into the base encoding process of RoBERTa-wwm-ext, forming the lexicon-enhanced LE-RoBERTa-wwm model. Subsequently, a hidden feature extractor captures semantic representations from all 12 output layers of LE-RoBERTa-wwm, performing weighted fusion to fully leverage multi-granularity semantic information across encoding layers. Furthermore, in the downstream processing stage, two computational paradigms are designed based on the MSR mechanism and the Regularized Dropout (R-Drop) mechanism, enabling low-cost inference and efficient parallel training. Comparative experiments conducted on the public Resume and Weibo datasets demonstrate that the model achieves F1 scores of 96.75% and 72.06%, respectively. Additional experiments on a track circuit dataset further validate the model’s superior recognition performance and generalization capability.
(This article belongs to the Section Artificial Intelligence)
Figures:
Figure 1: Overall structure of the model in this paper.
Figure 2: LE-RoBERTa-wwm model structure.
Figure 3: Char–words pair sequence of a truncated Chinese sentence “轨道电路 (track circuit)”.
Figure 4: The process of fusing character information with lexicon information.
Figure 5: The structure of the multi-layered dilated gated convolutional neural network.
Figure 6: Retentive network structure.
Figure 7: Experimental results of character and lexicon fusion for different coding layers.
Figure 8: Comparison experiment of the hidden feature extraction layer: (a) loss, (b) precision, (c) recall, and (d) F1 curves for each model.
Figure 9: Downstream model comparison experiment: (a) loss, (b) precision, (c) recall, and (d) F1 curves for each model.
Figure 10: MSR tuning experiment.
Figure 11: System recognition result display.
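The abstract stores the domain lexicon in a Trie so that, at every character position, all matching lexicon words can be found in one pass. A minimal Trie matcher of that kind (the lexicon entries below are toy examples, not the paper's lexicon):

```python
class Trie:
    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = word                         # end-of-word marker

    def matches_from(self, text, start):
        """All lexicon words that begin at position `start` of `text`."""
        node, found = self.root, []
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if "$" in node:
                found.append(node["$"])
        return found

trie = Trie()
for term in ["轨道", "轨道电路", "电路"]:          # toy lexicon
    trie.insert(term)
print(trie.matches_from("轨道电路故障", 0))        # ['轨道', '轨道电路']
```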
20 pages, 2026 KiB  
Article
RL–Fusion: The Large Language Model Fusion Method Based on Reinforcement Learning for Task Enhancing
by Zijian Wang, Jiayong Li, Yu Liu, Xuhang Li, Cairong Yan and Yanting Zhang
Appl. Sci. 2025, 15(4), 2186; https://doi.org/10.3390/app15042186 - 18 Feb 2025
Abstract
Model fusion is a technique of growing interest in the field of machine learning, which constructs a generalized model by merging the parameters of multiple independent models with different capabilities without the need to access the original training data or perform costly computations. However, during model fusion, when the number of parameters in a large language model is high, the dimension of the parameter space increases, which makes it more challenging to find the optimal combination of weights. Meanwhile, there is considerable potential for further development in sustainable optimization schemes for task-specific performance enhancement through model fusion. In this paper, we propose a large-scale language model fusion approach based on task-enhanced reinforcement learning (RL–Fusion) to efficiently explore and optimize model fusion configurations. The key innovation of RL–Fusion lies in its use of reinforcement learning to guide parameter selection during model fusion, enabling a more intelligent and adaptive exploration of the parameter space. Additionally, RL–Fusion introduces a dynamic evaluation mechanism that adjusts the evaluation dataset in real time based on feedback from SOTA models, ensuring continuous enhancement of domain-specific capabilities. RL–Fusion outperforms the baseline model by 1.75% on the MMLU benchmark, 1.8% on C-eval, and 16% on the Chinese Named Entity Recognition (NER) test on the Yayi NER dataset. The results show that RL–Fusion is an effective and scalable model fusion solution that improves performance without the computational cost of traditional optimization methods and has a wide range of applications in AI research and practice.
Figures:
Figure 1: Overall architecture diagram of the framework. The framework is built around a dynamic closed-loop optimization process, beginning with model fusion, where parameters from the source Large Language Model (LLM) are integrated to create an initial fused model. This fused model is then applied to domain-specific tasks (named entity recognition) and evaluated by a SOTA LLM, which provides performance scores and rankings. The evaluation results are fed into the reinforcement learning optimization module, where fusion parameters are dynamically adjusted. The updated parameters are then fed back into the model fusion phase, creating a continuous iterative loop that progressively enhances model performance.
Figure 2: Details of SOTA LLM evaluation. The framework systematically integrates model output, dataset feedback, and dynamic weighting mechanisms to provide a quantifiable and scientific evaluation path for optimizing the performance of large language models.
Figure 3: Main steps of Q-learning in RL–Fusion.
Figure 4: Results of ablation experiments on MMLU.
Figure 5: Results of ablation experiments on C-eval.
Figure 6: Results of ablation experiments on Yayi.
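The abstract says reinforcement learning guides the search over fusion weights, and the figure list names Q-learning specifically. As a toy, self-contained illustration of tabular Q-learning over a single discretized fusion weight (the reward surface is a stub; a real run would score the fused model on a benchmark):

```python
import random

STATES = [round(w * 0.1, 1) for w in range(11)]    # candidate fusion weights 0.0..1.0
ACTIONS = [-0.1, 0.0, 0.1]                         # nudge the weight down/keep/up

def evaluate(weight):                              # hypothetical stand-in reward
    return -(weight - 0.7) ** 2                    # pretend 0.7 is the best weight

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2
state = 0.5
for _ in range(500):
    a = (random.choice(ACTIONS) if random.random() < eps
         else max(ACTIONS, key=lambda x: Q[(state, x)]))   # epsilon-greedy choice
    nxt = min(1.0, max(0.0, round(state + a, 1)))
    reward = evaluate(nxt)
    Q[(state, a)] += alpha * (reward + gamma * max(Q[(nxt, b)] for b in ACTIONS)
                              - Q[(state, a)])
    state = nxt
best = max(STATES, key=lambda s: max(Q[(s, a)] for a in ACTIONS))
print(best)                                        # converges near 0.7
```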
16 pages, 2188 KiB  
Article
MCP: A Named Entity Recognition Method for Shearer Maintenance Based on Multi-Level Clue-Guided Prompt Learning
by Xiangang Cao, Luyang Shi, Xulong Wang, Yong Duan, Xin Yang and Xinyuan Zhang
Appl. Sci. 2025, 15(4), 2106; https://doi.org/10.3390/app15042106 - 17 Feb 2025
Abstract
The coal mining industry has accumulated a vast amount of knowledge on shearer accident analysis and handling during its development. Accurately identifying and extracting entity information related to shearer maintenance is crucial for advancing downstream tasks in intelligent shearer operations and maintenance. Currently, named entity recognition in the field of shearer maintenance primarily relies on fine-tuning-based methods; however, a gap exists between pretraining and downstream tasks. In this paper, we introduce prompt learning and large language models (LLMs), proposing a named entity recognition method for shearer maintenance based on multi-level clue-guided prompt learning (MCP). This method consists of three key components: (1) the prompt learning layer, which encapsulates the information to be identified and forms multi-level sub-clues into structured prompts based on a predefined format; (2) the LLM layer, which employs a decoder-only architecture-based large language model to deeply process the connection between the structured prompts and the information to be identified through multiple stacked decoder layers; and (3) the answer layer, which maps the output of the LLM layer to a structured label space via a parser to obtain the recognition results of structured named entities in the shearer maintenance domain. By designing multi-level sub-clues, MCP enables the model to extract and learn trigger words related to entity recognition from the prompts, acquiring context-aware prompt tokens. This allows the model to make accurate predictions, bridging the gap between fine-tuning and pretraining while eliminating the reliance on labeled data for fine-tuning. Validation was conducted on a self-constructed knowledge corpus in the shearer maintenance domain. Experimental results demonstrate that the proposed method outperforms mainstream baseline models in the field of shearer maintenance.
Figures:
Figure 1: Comparison of the proposed and previous models.
Figure 2: Example of an entity relationship diagram in the shearer maintenance domain.
Figure 3: MCP model framework.
Figure 4: Transformer-based decoder-only LLMs.
Figure 5: Distribution of entity types by proportion.
Figure 6: Results of ablation experiments.
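The MCP pipeline as described has three stages: assemble multi-level clues into a structured prompt, run a decoder-only LLM, and parse the answer back into a label space. Below is a schematic sketch of the prompt-and-parse ends of such a pipeline; the clue levels, output format, and the `some_llm` call are placeholders of my own, not the paper's template.

```python
def build_prompt(sentence, clues):
    """Assemble multi-level sub-clues plus the input into one structured prompt."""
    lines = [f"[{level}] {text}" for level, text in clues.items()]
    lines.append(f"[Input] {sentence}")
    lines.append("[Output] One entity per line as 'type: surface form'.")
    return "\n".join(lines)

def parse_answer(answer):
    """Map the LLM's free-text answer to (type, span) pairs."""
    pairs = []
    for line in answer.splitlines():
        if ":" in line:
            etype, _, span = line.partition(":")
            pairs.append((etype.strip(), span.strip()))
    return pairs

prompt = build_prompt(
    "The shearer drum bearing overheated during cutting.",
    {"Task": "Extract maintenance entities.",
     "Schema": "component | fault_symptom",
     "Hint": "Words like 'overheated' often trigger fault_symptom spans."})
# answer = some_llm(prompt)                       # hypothetical LLM call
print(parse_answer("component: drum bearing\nfault_symptom: overheated"))
```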
17 pages, 2395 KiB  
Article
Automated Dataset-Creation and Evaluation Pipeline for NER in Russian Literary Heritage
by Kenan Kassab, Nikolay Teslya and Ekaterina Vozhik
Appl. Sci. 2025, 15(4), 2072; https://doi.org/10.3390/app15042072 - 16 Feb 2025
Abstract
Developing robust and reliable models for Named Entity Recognition (NER) in the Russian language presents significant challenges due to the linguistic complexity of Russian and the limited availability of suitable training datasets. This study introduces a semi-automated methodology for building a customized Russian dataset for NER specifically designed for literary purposes. The paper provides a detailed description of the methodology employed for collecting and proofreading the dataset, outlining the pipeline used for processing and annotating its contents. A comprehensive analysis highlights the dataset’s richness and diversity. Central to the proposed approach is the use of a voting system to facilitate the efficient elicitation of entities, enabling significant time and cost savings compared to traditional methods of constructing NER datasets. The voting system is described theoretically and mathematically to highlight its impact on enhancing the annotation process. The results of testing the voting system with various thresholds show its impact in increasing the overall precision by 28% compared to using only the state-of-the-art model for auto-annotating. The dataset is meticulously annotated and thoroughly proofread, ensuring its value as a high-quality resource for training and evaluating NER models. Empirical evaluations using multiple NER models underscore the dataset’s importance and its potential to enhance the robustness and reliability of NER models in the Russian language.
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications—2nd Edition)
Figures:
Figure 1: The pipeline of annotating the dataset.
Figure 2: The structure of the dataset.
Figure 3: The distribution of entity types within the dataset.
Figure 4: Visualizing a sample from the dataset.
Figure 5: The pipeline for the enhancement approach with the voting system.
Figure 6: The results of the voting system by changing the threshold on the test set.
Figure 7: The total number of entities fixed by each system.
Figure 8: The percentages of missed and incorrect annotations using only DeepPavlov.
Figure 9: The percentages of missed and incorrect annotations using the enhancement approach.
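The voting system, at its simplest, keeps an entity span only when enough annotating models agree on it. A minimal sketch of threshold voting over (start, end, type) spans, which is one plausible reading of the described system rather than its exact implementation:

```python
from collections import Counter

def vote_entities(annotator_outputs, threshold):
    """Keep a span if at least `threshold` annotators proposed it."""
    counts = Counter(span for output in annotator_outputs for span in set(output))
    return sorted(span for span, n in counts.items() if n >= threshold)

model_a = [(0, 4, "PER"), (10, 16, "LOC")]
model_b = [(0, 4, "PER")]
model_c = [(0, 4, "PER"), (10, 16, "LOC"), (20, 25, "ORG")]
print(vote_entities([model_a, model_b, model_c], threshold=2))
# [(0, 4, 'PER'), (10, 16, 'LOC')]
```

Raising the threshold trades recall for precision, which matches the direction of the reported precision gain over single-model auto-annotation.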
40 pages, 5018 KiB  
Article
Global Dense Vector Representations for Words or Items Using Shared Parameter Alternating Tweedie Model
by Taejoon Kim and Haiyan Wang
Mathematics 2025, 13(4), 612; https://doi.org/10.3390/math13040612 - 13 Feb 2025
Abstract
In this article, we present a model for analyzing the co-occurrence count data derived from practical fields such as user–item or item–item data from online shopping platforms and co-occurring word–word pairs in sequences of texts. Such data contain important information for developing recommender systems or studying the relevance of items or words from non-numerical sources. Different from traditional regression models, there are no observations for covariates. Additionally, the co-occurrence matrix is typically of such high dimension that it does not fit into a computer’s memory for modeling. We extract numerical data by defining windows of co-occurrence using weighted counts on the continuous scale. Positive probability mass is allowed for zero observations. We present the Shared Parameter Alternating Tweedie (SA-Tweedie) model and an algorithm to estimate the parameters. We introduce a learning rate adjustment used along with the Fisher scoring method in the inner loop to help the algorithm stay on track with the optimizing direction. Gradient descent with the Adam update was also considered as an alternative method for the estimation. Simulation studies showed that our algorithm with Fisher scoring and learning rate adjustment outperforms the other two methods. We applied SA-Tweedie to English-language Wikipedia dump data to obtain dense vector representations for WordPiece tokens. The vector representation embeddings were then used in an application of the Named Entity Recognition (NER) task. The SA-Tweedie embeddings significantly outperform GloVe, random, and BERT embeddings in the NER task. A notable strength of the SA-Tweedie embedding is that the number of parameters and training cost for SA-Tweedie are only a tiny fraction of those for BERT.
(This article belongs to the Special Issue High-Dimensional Data Analysis and Applications)
Figures:
Figure 1: Illustration of model input and desired output. Left panel: model input, the natural log of the (weighted occurrence count + 1) matrix for the top 300 words from Reuters business news data. Right panel: shared parameter Tweedie modeling process and output.
Figure 2: Computed log(loss) and log(overall loss) from a simulated dataset using Fisher scoring with or without learning rate adjustment, and gradient descent with the Adam update. The left panel shows the loss over 10 epochs for one row of the parameter update; Adam's loss stays higher and decreases more slowly than the other two updates. The right panel shows overall loss versus the number of iterations in log scale; all losses decrease, but the Adam update stays higher.
Figure 3: Relationship between the log of the sample mean and the log of the sample variance from Wikipedia data with a 50 K vocabulary size. The three lines in each interval are the fitted linear regression line and upper and lower bounds with the same slope.
Figure 4: Loss reduction within epochs for three updates: the alternating Tweedie regression algorithm with and without learning rate adjustment, and the Adam update (first iteration, first row of the data matrix in Algorithm 1).
Figure 5: Overall loss over iterations for the three update methods; the Fisher scoring updates start at a lower overall loss than the Adam update and reduce it faster as iterations increase.
Figure 6: log10-scaled norm of the score vector (top) and overall loss (bottom) versus iteration on simulated data, with and without learning rate adjustment. The run without learning rate reduces the loss faster in early iterations but may not reach the minimum; the run with learning rate moves slowly at first but finds a smaller final loss.
Figure 7: Overall loss on a log10 scale between iterations 110 and 170. The update without learning rate adjustment stabilizes before reaching the minimum overall loss; the update with adjustment keeps reducing the loss until the convergence criterion is met.
Figure 8: Performance of the alternating Tweedie regression algorithm with or without learning rate over eight simulated datasets, one per row. A label such as +3.279e2 in the upper left of the right-most plots means the vertical-axis values need 327.9 added.
Figure 9: Histogram of skewness for each row of the raw co-occurrence count matrix (left) and the log co-occurrence count (right) constructed from the Wikipedia dump.
Figure 10: Trajectory of the SA-Tweedie training process for embedding dimensions 100 and 300; the model achieved lower loss with the higher embedding dimension.
Figure 11: Weighted F1 score on the NER test set for different settings with seeds 12, 42, and 111; all embeddings used 300-dimensional representations.
Figure 12: Training and validation loss and weighted F1 score over 15 epochs for random (top row), GloVe (middle row), and SA-Tweedie (bottom row) embeddings, all initialized with global seed 42; test values are marked with crosses.
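The Tweedie family with power parameter 1 < p < 2 is what allows positive probability mass at zero counts. For reference, the unit deviance such a model minimizes can be written down directly; the sketch below implements the standard textbook formula (the SA-Tweedie estimation itself, with alternating Fisher scoring and learning rate adjustment, is not reproduced here):

```python
import numpy as np

def tweedie_deviance(y, mu, p=1.5):
    """Unit deviance of the Tweedie family for 1 < p < 2."""
    return 2 * (y ** (2 - p) / ((1 - p) * (2 - p))
                - y * mu ** (1 - p) / (1 - p)
                + mu ** (2 - p) / (2 - p))

y = np.array([0.0, 1.0, 3.0])    # weighted co-occurrence counts, zeros allowed
mu = np.array([0.5, 1.2, 2.5])   # model means, e.g. from shared-parameter products
print(tweedie_deviance(y, mu, p=1.5).sum())
```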
16 pages, 2645 KiB  
Article
Automated Extraction of Key Entities from Non-English Mammography Reports Using Named Entity Recognition with Prompt Engineering
by Zafer Akcali, Hazal Selvi Cubuk, Arzu Oguz, Murat Kocak, Aydan Farzaliyeva, Fatih Guven, Mehmet Nezir Ramazanoglu, Efe Hasdemir, Ozden Altundag and Ahmet Muhtesem Agildere
Bioengineering 2025, 12(2), 168; https://doi.org/10.3390/bioengineering12020168 - 10 Feb 2025
Abstract
Objective: Named entity recognition (NER) offers a powerful method for automatically extracting key clinical information from text, but current models often lack sufficient support for non-English languages. Materials and Methods: This study investigated a prompt-based NER approach using Google’s Gemini 1.5 Pro, a large language model (LLM) with a 1.5-million-token context window. We focused on extracting important clinical entities from mammography reports written in Turkish, a language with limited available natural language processing (NLP) tools. Our method employed many-shot learning, incorporating 165 examples within a 26,000-token prompt derived from 75 initial reports. We tested the model on a separate set of 85 unannotated reports, concentrating on five key entities: anatomy (ANAT), impression (IMP), observation presence (OBS-P), absence (OBS-A), and uncertainty (OBS-U). Results: Our approach achieved high accuracy, with a macro-averaged F1 score of 0.99 for relaxed match and 0.84 for exact match. In relaxed matching, the model achieved F1 scores of 0.99 for ANAT, 0.99 for IMP, 1.00 for OBS-P, 1.00 for OBS-A, and 0.99 for OBS-U. For exact match, the F1 scores were 0.88 for ANAT, 0.79 for IMP, 0.78 for OBS-P, 0.94 for OBS-A, and 0.82 for OBS-U. Discussion: These results indicate that a many-shot prompt engineering approach with large language models provides an effective way to automate clinical information extraction for languages where NLP resources are less developed, and as reported in the literature, generally outperforms zero-shot, five-shot, and other few-shot methods. Conclusion: This approach has the potential to significantly improve clinical workflows and research efforts in multilingual healthcare environments.
(This article belongs to the Section Biosignal Processing)
Figures:
Figure 1: Framework of the methodology.
Figure 2: Example of manual annotation correction in Microsoft Excel for calculating relaxed F1 scores (shows the y_true column being edited).
Figure 3: Example of a raw (unannotated) Turkish mammography report used as input for the LLM.
Figure 4: Annotated Turkish mammography report output from the LLM, displayed in HTML format.
Figure 5: English translation of the annotated Turkish mammography report, provided for reader convenience.
Figure 6: Relaxed match recognition confusion matrix.
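The "relaxed" versus "exact" match scores reported here differ only in how a predicted span counts as correct. A small scorer showing one common reading of the two criteria (exact boundaries and type, versus same-type overlap); the relaxed definition below is an assumption, since papers vary in how they define it:

```python
def span_f1(gold, pred, relaxed=False):
    """gold/pred: lists of (start, end, type) spans."""
    def hit(p):
        if relaxed:  # same type and any character overlap counts as a match
            return any(t == p[2] and p[0] < e and s < p[1] for s, e, t in gold)
        return p in gold
    tp = sum(hit(p) for p in pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

gold = [(0, 12, "ANAT"), (20, 35, "OBS-P")]
pred = [(0, 10, "ANAT"), (20, 35, "OBS-P")]      # first span has sloppy boundaries
print(span_f1(gold, pred), span_f1(gold, pred, relaxed=True))  # 0.5 1.0
```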
16 pages, 1191 KiB  
Article
Leveraging Transformer Models for Enhanced Pharmacovigilance: A Comparative Analysis of ADR Extraction from Biomedical and Social Media Texts
by Oumayma Elbiach, Hanane Grissette and El Habib Nfaoui
AI 2025, 6(2), 31; https://doi.org/10.3390/ai6020031 - 7 Feb 2025
Abstract
The extraction of Adverse Drug Reactions from biomedical text is a critical task in the field of healthcare and pharmacovigilance. It serves as a cornerstone for improving patient safety by enabling the early identification and mitigation of potential risks associated with pharmaceutical treatments. This process not only helps in detecting harmful side effects that may not have been evident during clinical trials but also contributes to the broader understanding of drug safety in real-world settings, ultimately guiding regulatory actions and informing clinical practices. In this study, we conducted a comprehensive evaluation of eleven transformer-based models for ADR extraction, focusing on two widely used datasets: CADEC and SMM4H. The task was approached as a sequence labeling problem, where each token in the text is classified as part of an ADR or not. Various transformer architectures, including BioBERT, PubMedBERT, and SpanBERT, were fine-tuned and evaluated on these datasets. BioBERT demonstrated superior performance on the CADEC dataset, achieving an impressive F1 score of 86.13%, indicating its strong capability in recognizing ADRs within patient narratives. On the other hand, SpanBERT emerged as the top performer on the SMM4H dataset, with an F1 score of 84.29%, showcasing its effectiveness in processing the more diverse and challenging social media data. These results highlight the importance of selecting appropriate models based on specific characteristics such as text formality, domain-specific language, and task complexity to achieve optimal ADR extraction performance.
(This article belongs to the Section Medical & Healthcare AI)
Figures:
Figure 1: NER annotation process for ADR extraction in the CADEC dataset.
Figure 2: Detailed representation of the Transformer architecture, multi-head attention, and the attention mechanism [17].
Figure 3: Confusion matrix for the CADEC dataset.
Figure 4: Confusion matrix for the SMM4H dataset.
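Treating ADR extraction as sequence labeling, as this study does, maps directly onto the Hugging Face token-classification API. A minimal sketch with one of the evaluated encoder families (a public BioBERT checkpoint); note the classification head below is freshly initialized, so its outputs are meaningless until fine-tuned on CADEC- or SMM4H-style BIO labels:

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-ADR", "I-ADR"]                  # BIO scheme for ADR spans
name = "dmis-lab/biobert-base-cased-v1.1"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForTokenClassification.from_pretrained(name, num_labels=len(labels))

enc = tokenizer("Severe headache after taking the tablets.", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits                  # (1, num_tokens, num_labels)
pred = [labels[i] for i in logits.argmax(-1)[0].tolist()]
print(list(zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]), pred)))
```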
26 pages, 1469 KiB  
Article
A Methodological Framework for AI-Driven Textual Data Analysis in Digital Media
by Douglas Cordeiro, Carlos Lopezosa and Javier Guallar
Future Internet 2025, 17(2), 59; https://doi.org/10.3390/fi17020059 - 3 Feb 2025
Abstract
The growing volume of textual data generated on digital media platforms presents significant challenges for the analysis and interpretation of information. This article proposes a methodological approach that combines artificial intelligence (AI) techniques and statistical methods to explore and analyze textual data from digital media. The framework, titled DAFIM (Data Analysis Framework for Information and Media), includes strategies for data collection through APIs and web scraping, textual data processing, and data enrichment using AI solutions, including named entity recognition (people, locations, objects, and brands) and the detection of clickbait in news. Sentiment analysis and text clustering techniques are integrated to support content analysis. The potential applications of this methodology include social networks, news aggregators, news portals, and newsletters, offering a robust framework for studying digital data and supporting informed decision-making. The proposed framework is validated through a case study involving data extracted from the Google News aggregation platform, focusing on the Israel–Lebanon conflict. This demonstrates the framework’s capability to uncover narrative patterns, content trends, and clickbait detection while also highlighting its advantages and limitations.
Figures:
Figure 1: General architecture.
Figure 2: Web scraping data extraction scheme. Note: the diamond symbol with an “X” inside represents an exclusive OR (XOR) logical operation.
Figure 3: Data preprocessing and enrichment scheme.
Figure 4: Knowledge discovery scheme. Note: the diamond symbol with a “+” inside represents the execution of both paths.
Figure 5: Daily volume of news aggregated by version on the homepage.
Figure 6: Mentions of Lebanon from Google News Israel.
Figure 7: Mentions of Israel from Google News Lebanon.
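The DAFIM stages (collection, enrichment with NER and clickbait detection, then analysis) compose naturally as a per-document pipeline. The sketch below shows only that composition shape, with deliberately trivial stand-ins where the framework would call real models; none of it is the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    entities: list = field(default_factory=list)
    clickbait: bool = False
    sentiment: int = 0

NEGATIVE = {"conflict", "strike", "tensions"}
POSITIVE = {"peace", "agreement", "ceasefire"}

def enrich(doc):
    # Stand-ins: a real pipeline would call an NER model, a clickbait
    # classifier, and a sentiment model at these three points.
    words = doc.text.split()
    doc.entities = [w for w in words if w.istitle()]
    doc.clickbait = doc.text.rstrip().endswith(("!", "?"))
    tokens = {w.lower().strip("!?.,") for w in words}
    doc.sentiment = len(tokens & POSITIVE) - len(tokens & NEGATIVE)
    return doc

corpus = [Document("Tensions Rise Along The Israel Lebanon Border"),
          Document("You won't believe what happened next!")]
for doc in map(enrich, corpus):
    print(doc.clickbait, doc.sentiment, doc.entities)
```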
18 pages, 3251 KiB  
Article
Research and Implementation of Agronomic Entity and Attribute Extraction Based on Target Localization
by Xiuming Guo, Yeping Zhu, Shijuan Li, Sheng Wu, Yue E and Shengping Liu
Agronomy 2025, 15(2), 354; https://doi.org/10.3390/agronomy15020354 - 29 Jan 2025
Abstract
The agronomic knowledge graph can provide accurate and reliable service support for agricultural production management. Agronomic knowledge often comes from unstructured text data, so efficient annotation of agricultural text data and construction of knowledge extraction models suited to the characteristics of agronomic knowledge are two key points in creating an agronomic knowledge graph. The proportion of attributes in agronomic knowledge is relatively high, but the attribute annotation functions of existing annotation tools are incomplete, and their annotation workflows are unclear. A scalable natural language annotation framework was proposed that can flexibly configure the annotation process and annotation objects as needed, with named entities annotated in the corresponding mode. Current knowledge extraction models are mostly based on input text sequences, which suffer from low feature utilization. However, the entities and attributes in agronomic knowledge have high similarity, and the position and type of entities and attributes can be directly calculated from their common features. An entity and attribute recognition model based on target localization, EntityDetectModel, was proposed. First, Bert was used to extract text features with contextual information. Then, convolutional neural networks were used to extract features at different depths, with inter-layer feature fusion to improve feature expression ability. Finally, the positions and types of named entities of different sizes were calculated from the features at different depths. EntityDetectModel was compared with other entity and relationship extraction models published in recent years, and the results showed that its precision, recall, and F1 were 91.0%, 83.4%, and 87.0%, respectively, which were superior to the comparison models. Using EntityDetectModel, a wheat agronomic knowledge graph was constructed.
Figures:
Figure 1: Comparison between the proposed EntityDetectModel and previous methods.
Figure 2: Schema layer for the wheat agronomy knowledge graph.
Figure 3: Annotation process flowchart.
Figure 4: EntityDetectModel’s network architecture diagram.
Figure 5: Token local feature fusion network Fnet.
Figure 6: Named entity length and offset.
Figure 7: Scalable text annotation system.
Figure 8: Part of the Gaoyou 2018 wheat variety agronomy knowledge graph.
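Reading "target localization" by analogy with object detection: instead of tagging every token, the model predicts, at each position, an entity type plus a length and offset. Below is a schematic detection head of that kind; it is my reading of the general idea, not the published EntityDetectModel.

```python
import torch
import torch.nn as nn

class DetectHead(nn.Module):
    """At every token position predict entity-type logits (+1 background
    class) and a (length, offset) pair, detection-style."""
    def __init__(self, feat_dim, num_types):
        super().__init__()
        self.type_head = nn.Linear(feat_dim, num_types + 1)
        self.box_head = nn.Linear(feat_dim, 2)

    def forward(self, feats):                 # feats: (batch, seq, feat_dim)
        return self.type_head(feats), self.box_head(feats)

head = DetectHead(feat_dim=768, num_types=4)
types, boxes = head(torch.randn(2, 30, 768))
print(types.shape, boxes.shape)               # (2, 30, 5) and (2, 30, 2)
```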
16 pages, 1756 KiB  
Article
Chinese Named Entity Recognition for Automobile Fault Texts Based on External Context Retrieving and Adversarial Training
by Shuhai Wang and Linfu Sun
Entropy 2025, 27(2), 133; https://doi.org/10.3390/e27020133 - 27 Jan 2025
Abstract
Identifying key concepts in automobile fault texts is crucial for understanding fault causes and enabling diagnosis. However, effective mining tools are lacking, leaving much latent information unexplored. To solve this problem, this paper proposes Chinese named entity recognition for automobile fault texts based on external context retrieval and adversarial training. First, we retrieve external contexts by using a search engine. Then, the input sentence and its external contexts are respectively fed into Lexicon Enhanced BERT to improve the text embedding representation. Furthermore, the embedding representations of the input sentence and its external contexts are fused through the attention mechanism. Then, adversarial samples are generated by adding perturbations to the fusion vector representation. Finally, the fusion vector representation and adversarial samples are input into the BiLSTM-CRF layer as training data for entity labeling. Our model is evaluated on the automotive fault datasets and the Weibo and Resume datasets, and achieves state-of-the-art results.
Figures:
Figure 1: The overall schema of the proposed model.
Figure 2: Comparison of different keyword extraction methods by F1-score on D1 and D2.
Figure 3: Ablation experiment results on D3.
Figure 4: F1-scores of different entity types on D1 (%).
Figure 5: F1-scores of different entity types on D2 (%).
Figure 6: The curves of training loss on D1.
Figure 7: The indicators of the training process on D1 with and without AT (%).
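Adversarial samples generated "by adding perturbations to the fusion vector representation" are commonly produced with the Fast Gradient Method (FGM): perturb the embeddings along the L2-normalized gradient, back-propagate a second time, then restore the weights. The class below is the widely used FGM recipe, shown as a representative technique; the paper's exact perturbation scheme may differ.

```python
import torch

class FGM:
    def __init__(self, model, emb_name="embedding", epsilon=1.0):
        self.model, self.emb_name, self.epsilon = model, emb_name, epsilon
        self.backup = {}

    def attack(self):
        for name, p in self.model.named_parameters():
            if p.requires_grad and self.emb_name in name and p.grad is not None:
                self.backup[name] = p.data.clone()             # save clean weights
                norm = torch.norm(p.grad)
                if norm != 0:
                    p.data.add_(self.epsilon * p.grad / norm)  # normalized step

    def restore(self):
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]
        self.backup = {}

# Typical loop: loss.backward(); fgm.attack(); adv_loss = compute_loss(model);
# adv_loss.backward(); fgm.restore(); optimizer.step()
```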
17 pages, 3811 KiB  
Article
A Named Entity Recognition Model for Chinese Electricity Violation Descriptions Based on Word-Character Fusion and Multi-Head Attention Mechanisms
by Lingwen Meng, Yulin Wang, Yuanjun Huang, Dingli Ma, Xinshan Zhu and Shumei Zhang
Energies 2025, 18(2), 401; https://doi.org/10.3390/en18020401 - 17 Jan 2025
Abstract
Due to the complexity and technicality of named entity recognition (NER) in the power grid field, existing methods are ineffective at identifying specialized terms in power grid operation record texts. Therefore, this paper proposes a Chinese power violation description entity recognition model based on word-character fusion and multi-head attention mechanisms. The model first utilizes a collected power grid domain corpus to train a Word2Vec model, which produces static word vector representations. These static word vectors are then integrated with the dynamic character vector features of the input text generated by the BERT model, thereby mitigating the impact of segmentation errors on the NER model and enhancing the model’s ability to identify entity boundaries. The combined vectors are subsequently input into a BiGRU model for learning contextual features. The output from the BiGRU layer is then passed to an attention mechanism layer to obtain enhanced semantic features, which highlight key semantics and improve the model’s contextual understanding ability. Finally, the CRF layer decodes the output to generate the globally optimal label sequence with the highest probability. Experimental results on the constructed power grid field operation violation description dataset demonstrate that the proposed NER model outperforms the traditional BERT-BiLSTM-CRF model, with an average improvement of 1.58% in precision, recall, and F1-score. This demonstrates the effectiveness of the model design and further enhances the accuracy of entity recognition in the power grid domain.
(This article belongs to the Section A1: Smart Grids and Microgrids)
Figures:
Figure 1: Flowchart of named entity recognition for electric grid violation descriptions.
Figure 2: Architecture and examples of input vectors in the BERT model.
Figure 3: Architecture diagram of the Word2Vec model.
Figure 4: Internal structure diagram of the GRU model.
Figure 5: Diagrams of single-head and multi-head attention mechanisms.
Figure 6: YEDDA annotation process.
Figure 7: Model performance comparison on the test set for electric grid violation description recognition.
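The word-character fusion step described here pairs each character's dynamic BERT vector with the static Word2Vec vector of the word containing it, before the BiGRU. A dimensional sketch of that fusion follows; the projection layer is my addition for bookkeeping and is not necessarily part of the paper's design.

```python
import torch
import torch.nn as nn

class WordCharFusion(nn.Module):
    def __init__(self, char_dim=768, word_dim=300, out_dim=512):
        super().__init__()
        self.proj = nn.Linear(char_dim + word_dim, out_dim)

    def forward(self, char_vecs, word_vecs):
        # char_vecs: (batch, seq, char_dim) dynamic vectors from BERT
        # word_vecs: (batch, seq, word_dim) each character paired with the
        #            static Word2Vec vector of the word it belongs to
        return torch.tanh(self.proj(torch.cat([char_vecs, word_vecs], dim=-1)))

fusion = WordCharFusion()
out = fusion(torch.randn(2, 20, 768), torch.randn(2, 20, 300))
print(out.shape)    # torch.Size([2, 20, 512])
```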