Article

Multimodal MRI Deep Learning for Predicting Central Lymph Node Metastasis in Papillary Thyroid Cancer

by Xiuyu Wang 1,2, Heng Zhang 2, Hang Fan 3, Xifeng Yang 3, Jiansong Fan 3, Puyeh Wu 4, Yicheng Ni 1,* and Shudong Hu 2,*
1 Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Department of Radiology, Zhongda Hospital, Medical School, Southeast University, Nanjing 210018, China
2 Department of Radiology, Affiliated Hospital of Jiangnan University, Wuxi 214121, China
3 School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214121, China
4 GE Healthcare, Beijing 100000, China
* Authors to whom correspondence should be addressed.
Cancers 2024, 16(23), 4042; https://doi.org/10.3390/cancers16234042
Submission received: 19 October 2024 / Revised: 26 November 2024 / Accepted: 28 November 2024 / Published: 2 December 2024
(This article belongs to the Special Issue Recent Advances in Oncology Imaging: 2nd Edition)
Figure 1. Inclusion and exclusion flowchart.
Figure 2. Segmentation of the ROI in axial T1 and T2 images. The red arrows in the MRI images indicate the location of the primary lesion.
Figure 3. Distribution of LASSO coefficients for T1 features, T2 features, and combined T1 + T2 features. (a,b) represent T1 features, (c,d) represent T2 features, and (e,f) represent the combined T1 + T2 features.
Figure 4. Deep learning model workflow.
Figure 5. Histograms of best feature coefficients for T1, T2, and T1 + T2. (a) The best feature coefficients for T1; (b) the best feature coefficients for T2; (c) the best feature coefficients for T1 + T2.
Figure 6. The ROC curves of ML and DL models on the training set and test set. (a,b) The ROC curves of the SVM models on the training set and test set; (c,d) the ROC curves of the LR models on the training set and test set; (e,f) the ROC curves of the RF models on the training set and test set; (g,h) the ROC curves of the DL models on the training set and test set.
Figure 7. DCA curves of the ML and DL models on the test set. (a) The DCA curves of the SVM models on the test set; (b) the DCA curves of the LR models on the test set; (c) the DCA curves of the RF models on the test set; (d) the DCA curves of the DL models on the test set.

Simple Summary
Papillary thyroid cancer (PTC) is the most common subtype of thyroid cancer (TC), accounting for approximately 80–90% of all TC cases. PTC is generally considered a low-grade malignancy. However, aggressive behaviors, such as cervical lymph node metastasis, can be observed in some cases, with central lymph node metastasis (CLNM) regarded as the primary site of lymph node metastasis. In current clinical practice in China, prophylactic central lymph node dissection is commonly performed regardless of whether CLNM is present. This approach is associated with an increased risk of complications and unnecessary lymph node removal, raising concerns about overtreatment. Therefore, there is an urgent need for an efficient method to predict CLNM in PTC patients, which could guide clinical diagnosis and treatment strategies. In this study, we developed a fusion model based on the attention mechanism-based multimodal classification network (AMMCNet) architecture, integrating MRI images and clinicopathological data to efficiently predict CLNM in PTC patients.
Abstract
Background: Central lymph node metastasis (CLNM) in papillary thyroid cancer (PTC) significantly influences surgical decision-making strategies. Objectives: This study aims to develop a predictive model for CLNM in PTC patients using magnetic resonance imaging (MRI) and clinicopathological data. Methods: By incorporating deep learning (DL) algorithms, the model seeks to address the challenges in diagnosing CLNM and reduce overtreatment. The results were compared with traditional machine learning (ML) models. In this retrospective study, preoperative MRI data from 105 PTC patients were divided into training and testing sets. A radiologist manually outlined the region of interest (ROI) on MRI images. Three classic ML algorithms (support vector machine [SVM], logistic regression [LR], and random forest [RF]) were employed across different data modalities. Additionally, an AMMCNet utilizing convolutional neural networks (CNNs) was proposed to develop DL models for CLNM prediction. Predictive performance was evaluated using receiver operating characteristic (ROC) curve analysis, and clinical utility was assessed through decision curve analysis (DCA). Results: Lesion diameter was identified as an independent risk factor for CLNM. Among ML models, the RF-(T1WI + T2WI, T1WI + T2WI + Clinical) models achieved the highest area under the curve (AUC) at 0.863. The DL fusion model surpassed all ML fusion models with an AUC of 0.891. Conclusions: A fusion model based on the AMMCNet architecture using MRI images and clinicopathological data was developed, effectively predicting CLNM in PTC patients.

1. Introduction

The global incidence of thyroid cancer (TC) has risen significantly over the past few decades [1]. Papillary thyroid carcinoma (PTC), accounting for 80–90% of all TC cases, is the most common subtype [2,3]. Although PTC generally has a favorable prognosis, some patients develop aggressive disease, including cervical lymph node metastasis (LNM) [4]. Cervical LNM is strongly associated with higher rates of local recurrence and distant metastasis, negatively impacting disease-free survival (DFS) and overall survival (OS) [5,6].
Central lymph node metastasis (CLNM) is recognized as the primary site of cervical LNM in PTC [5,7]. However, current diagnostic methods for accurately detecting CLNM in PTC are inadequate. Ultrasonography (US), the predominant tool for preoperative lymph node assessment, exhibits variable sensitivity (26% to 47%) in identifying CLNM, making it insufficient for precise evaluation [8,9]. The effectiveness of US also depends heavily on the skill of the operator and is limited in imaging deeper anatomical structures [6]. Contrast-enhanced computed tomography (CT) improves on US by overcoming some of these limitations [6,10]. Nevertheless, both US and CT rely on morphological indicators to identify CLNM, which introduces the subjectivity inherent in qualitative assessments. As a result, the detection rate for CLNM remains low, leading to misdiagnoses in 30% to 65% of PTC patients [11]. Because of this unsatisfactory detection rate, prophylactic central lymph node dissection (pCLND) is often recommended, despite the increased risks of complications such as recurrent laryngeal nerve injury and hypoparathyroidism, as well as potential overtreatment with unnecessary lymph node dissections [5,12]. Therefore, an accurate preoperative determination of CLNM status is critical.
Radiomics, an emerging technology, combines the high-throughput extraction of large numbers of quantitative features from medical images with machine learning (ML) algorithms for classification [13]. Furthermore, deep learning (DL) techniques based on convolutional neural networks (CNNs) can automatically learn crucial information from raw image data to perform tasks such as detection, classification, and segmentation [14]. Several studies have applied ML radiomics and DL to thyroid disease research. Zhao et al. [15] developed ML radiomics models based on US and shear wave elastography images for thyroid nodule diagnosis and achieved a satisfactory diagnostic performance. Zhang et al. [16] used US-based ML radiomics, particularly the random forest classifier, and demonstrated better thyroid nodule classification results than radiologists. Wang et al. [17] and Kwon et al. [18] constructed reliable US-based radiomics models for the prediction of extrathyroidal extension (ETE) and BRAF gene mutations, respectively. Similarly, Wu et al. [19] utilized US-based DL methods for the classification of thyroid nodules, demonstrating their effectiveness in improving diagnostic accuracy. Wang et al. [20] utilized CT-based DL methods for predicting cervical LNM in PTC, showcasing their potential to enhance preoperative diagnostic precision. Moreover, both ML radiomics and DL approaches are increasingly used in predicting CLNM in PTC [5,6,21].
Magnetic resonance imaging (MRI) is a non-invasive modality known for its outstanding soft tissue contrast. It is crucial for tumor detection, differentiation, evaluating treatment responses, and forecasting prognosis. MRI provides essential qualitative and quantitative insights at the cellular level, thus enhancing diagnostic and therapeutic outcomes [22]. Previous research has employed traditional ML radiomics methods based on MRI images to construct predictive models for LNM in PTC patients, achieving good predictive performance [22,23]. However, there are currently no studies using DL methods to develop predictive models for CLNM in PTC patients based on MRI images. According to previous research findings [24], DL methods can significantly enhance predictive performance compared to classical ML radiomics [25], which is primarily due to the ability of DL algorithms to capture complex information.
Therefore, we aim to develop a DL predictive model for CLNM in PTC patients based on multimodal MRI data and compare its performance with those of classical ML radiomics models.

2. Materials and Methods

2.1. Study Population and Clinical Pathological Characteristics

This retrospective study was approved by the Ethics Committee of Jiangnan University Affiliated Hospital (Approval Number: LS2020066) and adheres to the principles of the Declaration of Helsinki. As this was a retrospective study, the requirement for patient informed consent was waived. Between August 2021 and August 2023, 148 patients who underwent thyroid MRI were initially assessed. Inclusion criteria were (1) PTC confirmed by surgical pathology; (2) MRI performed within two weeks prior to surgery; (3) ipsilateral lobectomy or total thyroidectomy with central lymph node dissection; and (4) no prior thyroid surgery or biopsy and no history of head or neck tumors or neck radiotherapy. Exclusion criteria were (1) a PTC lesion with a maximum diameter of less than 5 mm; (2) unclear MRI images on which the region of interest (ROI) could not be definitively delineated; and (3) incomplete clinical or pathological data. The patient selection process is illustrated in Figure 1. Ultimately, 105 patients were included in the study, with 55 (52.38%) diagnosed as CLNM positive and 50 (47.62%) as CLNM negative. Patients were randomly assigned to training and testing sets in an 8:2 ratio for the construction of predictive models.
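The 8:2 random assignment can be illustrated with a minimal sketch. It assumes a simple unstratified shuffle; the text does not say whether the split was stratified by CLNM status, and the function name, the integer patient IDs, and the seed are hypothetical.

```python
import random

def split_patients(patient_ids, test_fraction=0.2, seed=0):
    """Shuffle patient IDs and split them into training and test lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility only
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

# 105 patients, identified here by placeholder integers 0..104.
train_ids, test_ids = split_patients(range(105))
print(len(train_ids), len(test_ids))  # 84 21
```

With 105 patients, this gives the 84 training and 21 test cases reported in Section 3.1.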
Clinical data including age and gender were collected, and pathology reports provided information on CLNM status, primary lesion diameter, extrathyroidal extension (ETE), multifocality, bilaterality, presence of calcification, and benign thyroid conditions.

2.2. MRI Protocol

MRI examinations were performed on a 3.0 Tesla MRI scanner (SIGNA Architect; GE Healthcare, Milwaukee, WI, USA) equipped with a 28-channel head and neck phased-array coil 1 to 2 weeks prior to surgery. During scanning, patients were positioned supine with their neck extended and shoulders relaxed downward to minimize artifacts in the clavicular area. Patients were also instructed to avoid swallowing during the scan. The scan coverage extended from the pharynx to the upper margin of the clavicle. The scanning protocols included axial T1-weighted imaging (T1WI), axial T2-weighted imaging (T2WI), axial diffusion-weighted imaging (DWI, with b-values of 0 and 500 s/mm²), and contrast-enhanced T1WI (CE-T1WI). T1WI was acquired using a spin echo sequence (Repetition Time/Echo Time [TR/TE] = 520/14 ms), and T2WI was acquired using a fast spin echo sequence (TR/TE = 3500/95 ms) with fat suppression. DWI was obtained using a single-shot echo-planar imaging sequence with short tau inversion recovery for fat suppression. Careful shimming was applied to correct local magnetic field inhomogeneities, optimizing image quality in the thyroid region. CE-T1WI images with and without fat suppression were acquired immediately after the intravenous injection of 0.1 mmol/kg gadolinium-DTPA (Gd-DTPA) contrast agent (Magnevist; Schering AG, Berlin, Germany) at a flow rate of 1.5 mL/s. Imaging parameters included a slice thickness of 3 mm, an interslice gap of 1 mm, a field of view (FOV) of 40 × 28 cm², a matrix size of 256 × 256, and a number of excitations (NEX) of 4. The entire examination was completed within 30 min.

2.3. ROI Segmentation

MRI images from all patients were imported into 3D-Slicer software (version 5.2.2; http://www.slicer.org) for precise ROI segmentation (Figure 2). Following previous research, for PTC patients with multiple lesions, the largest one was selected for analysis [9]. Initially, two radiologists determined in consensus the largest cross-sectional area of the primary tumor on T1WI and T2WI images. The ROI on the largest cross-sectional images was then outlined by a junior radiologist (Observer 1, with 5 years of experience), carefully avoiding surrounding normal thyroid tissue. To ensure repeatability and consistency in the ROI delineation, intra-observer and inter-observer consistency checks were performed. Two weeks after the initial delineation, 30 cases were randomly selected for re-segmentation by Observer 1, and another experienced radiologist (Observer 2, with 10 years of experience) segmented the same 30 cases. Intra-observer and inter-observer consistency between segmentations was assessed by calculating the class correlation coefficient (CCC). A CCC greater than 0.75 was taken to indicate a high level of consistency in the ROI segmentation process. Figure 2 shows representative ROI delineations on T1WI and T2WI images.

2.4. Radiomics Feature Extraction and Selection

Data preprocessing included deduplication, normalization, and standardization. Subsequently, using the PyRadiomics package (version 3.0.1) in Python [26], radiomic features were extracted from T1WI and T2WI images and categorized into three main groups: first-order features, shape features, and texture features. Texture features include (I) Gray Level Size Zone Matrix (GLSZM); (II) Gray Level Run-Length Matrix (GLRLM); (III) Gray Level Dependence Matrix (GLDM); (IV) Neighborhood Gray-Tone Difference Matrix (NGTDM); (V) Gray Level Co-occurrence Matrix (GLCM). Additionally, higher-order radiomic features were extracted from images processed with wavelet or logarithmic transformations.
For T1WI features, we conducted a detailed selection process. Firstly, we used the Mann–Whitney U test to retain features with a p-value less than 0.05, which was followed by further selection using the Pearson correlation coefficient (PCC). For features that were selected, we employed the Least Absolute Shrinkage and Selection Operator (LASSO) method for regression analysis (Figure 3). The LASSO method incrementally shrinks regression coefficients by adjusting the penalty parameter λ until they reach zero, effectively setting the coefficients of most irrelevant features to zero. Using 10-fold cross-validation, we identified the optimal λ value to minimize the cross-validation error and ultimately obtained the best features. The same approach was applied for T2WI features. For combined T1 + T2 features, we performed a multi-layer selection. Initially, we retained features with a p-value less than 0.05 using the Mann–Whitney U test, which was followed by further selection using PCC. Features were then further selected using the Maximum Relevance Minimum Redundancy (mRMR) algorithm and the LASSO algorithm (Figure 3). The mRMR algorithm aims to maximize the relevance between the features and the classification variable while minimizing the redundancy among the features, thereby eliminating all redundant and irrelevant features to ultimately obtain the best features.
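The shrinkage behavior of LASSO described above can be illustrated with a minimal sketch. It assumes a squared-error (linear) LASSO solved by cyclic coordinate descent on centered features; the study may have used a different formulation or solver, and the toy data, function names, and λ values are hypothetical. The 10-fold cross-validation used to choose λ is not shown.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows; its columns are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the residual excluding j
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (z / n)
    return b

# Toy data: y depends only on the first feature; the second is irrelevant.
X = [[-1.5, 1.0], [-0.5, -1.0], [0.5, -1.0], [1.5, 1.0]]
y = [-3.0, -1.0, 1.0, 3.0]

for lam in (0.0, 0.5, 3.0):
    print(lam, lasso_coordinate_descent(X, y, lam))
```

In this toy case, the coefficient of the irrelevant feature stays at zero, the coefficient of the informative feature shrinks as λ grows, and a sufficiently large λ sets every coefficient to zero.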

2.5. Construction and Validation of ML Radiomics Models

ML predictive models were constructed by integrating radiomic features and clinical information. Initially, models based solely on radiomic features were developed using support vector machine (SVM), logistic regression (LR), and random forest (RF) for T1WI models, T2WI models, and combined T1WI + T2WI models. Following this, we combined imaging features with clinical information to construct SVM, LR, and RF models for T1WI + Clinical, T2WI + Clinical, and comprehensive T1WI + T2WI + Clinical models. The predictive performance and clinical utility of these models were assessed on the test data set using receiver operating characteristic (ROC) curve analysis and decision curve analysis (DCA).
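For reference, the area under the ROC curve can be computed directly from predicted scores, as in the following minimal sketch. It uses the rank-based definition: the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative score counts as one half.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = CLNM positive) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(roc_auc(labels, scores))  # 8 of 9 positive-negative pairs are ordered correctly
```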

2.6. Construction and Validation of DL Models

We proposed an attention mechanism-based multimodal classification network (AMMCNet), a DL architecture integrating CNN and attention mechanisms, specifically designed for processing and analyzing MRI images and clinical pathology text information (Figure 4). This architecture consists of three main components: an image feature extraction module, a text feature extraction module, and a feature fusion module. The network utilizes channel and spatial attention mechanisms for image feature extraction, extracting hierarchical features through a series of convolution and pooling layers. Text information is processed through a sequential module, including linear layers and ReLU activation, with a transformer encoder capturing contextual information and long-range dependencies within the text data. The feature fusion stage combines features from both image and text modalities, which are passed through a linear layer to reduce their dimensionality, allowing the effective integration of complementary information for better classification. Predictive models were developed using the AMMCNet classification network, initially based on imaging features alone (DL-T1, DL-T2, and DL-T1 + T2 models). Subsequently, we combined imaging features with clinical information to construct the DL-T1 + Clinical, DL-T2 + Clinical, and comprehensive fusion models. The predictive performance of DL models was evaluated on the test set using ROC and DCA analyses.
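The general idea of channel attention, spatial attention, and feature fusion can be sketched as follows. This is a heavily simplified, parameter-free illustration and not the AMMCNet implementation: the gates here are computed from plain averages rather than learned layers, the transformer-based text branch is omitted, and all function names and toy values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap):
    """Rescale each channel by a gate computed from its global average."""
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out

def spatial_attention(fmap):
    """Rescale each pixel by a gate computed from its mean across channels."""
    n_ch, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gates = [[sigmoid(sum(fmap[c][i][j] for c in range(n_ch)) / n_ch)
              for j in range(w)] for i in range(h)]
    return [[[fmap[c][i][j] * gates[i][j] for j in range(w)]
             for i in range(h)] for c in range(n_ch)]

def global_average_pool(fmap):
    """Collapse each channel to a single number."""
    return [sum(v for row in ch for v in row) / (len(ch) * len(ch[0])) for ch in fmap]

def fuse(image_vector, clinical_vector):
    """Fusion by concatenation; a classifier head would follow in a real network."""
    return list(image_vector) + list(clinical_vector)

# Toy input: 2 channels of 2x2 "image features" plus one clinical value.
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 0.0]]]
attended = spatial_attention(channel_attention(fmap))
fused = fuse(global_average_pool(attended), [1.2])  # 1.2: hypothetical lesion diameter
print(len(fused))  # 3
```

In a trained network, the gates would be produced by learned convolutional or fully connected layers, so the model itself decides which channels and image regions to emphasize before fusion with the clinical features.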

2.7. Statistical Analysis

Continuous variables from clinical pathology data were analyzed using the independent samples t-test or the Mann–Whitney U test, while categorical variables were assessed through the chi-square test or Fisher’s exact test. All statistical analyses were conducted using R software (version 4.3.0; http://www.Rproject.org). The performance of models was evaluated using ROC curve analysis, yielding the area under the curve (AUC), and the accuracy (ACC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), and negative predictive value (NPV) were recorded.
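The threshold-dependent metrics listed above follow directly from the confusion matrix, as the following minimal sketch shows. The counts are hypothetical values for a 21-case test set; the confusion matrices of the study models are not reported in the text.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return ACC, SEN, SPE, PPV and NPV from confusion-matrix counts."""
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "SEN": tp / (tp + fn),  # true positive rate
        "SPE": tn / (tn + fp),  # true negative rate
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }

# Hypothetical counts: 10 CLNM-positive and 11 CLNM-negative test cases.
metrics = diagnostic_metrics(tp=7, fp=3, tn=8, fn=3)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
# ACC = 0.714, SEN = 0.700, SPE = 0.727, PPV = 0.700, NPV = 0.727
```

With a test set of only 21 cases, each metric can take only a small number of distinct values, which is why several models in the results share identical figures.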

3. Results

3.1. Clinical Baseline Characteristics

A total of 105 patients (28 males and 77 females, age: 46.19 ± 11.66 years, age range: 24–74 years) were included in the study. Data from these patients were divided into a training set (n = 84) and a test set (n = 21) according to an 8:2 ratio. The clinical baseline characteristics of the CLNM positive and CLNM negative groups are shown in Table 1. In the univariate analysis, both lesion diameter (OR = 1.966, p = 0.019) and ETE status (OR = 2.227, p = 0.045) showed significant differences. Further multivariate logistic regression analysis identified lesion diameter as a significant independent risk factor for predicting CLNM in PTC patients (p < 0.05) (Table 2). Table 3 displays the distribution of clinical baseline characteristics in the training and test sets, with no significant differences noted.

3.2. Radiomics Feature Extraction and Selection

From both T1WI and T2WI images, 864 radiomic features were extracted. For T1WI radiomic features, the Mann–Whitney U test reduced this to 41 features, and PCC further narrowed it down to 28 features, with 10 best features selected after LASSO. Regarding T2WI radiomic features, 90 features remained after the Mann–Whitney U test, 59 features after PCC, and 8 best features after LASSO. For combined T1WI + T2WI radiomic features, 94 features remained after the Mann–Whitney U test, 63 features after PCC, 20 features after mRMR, and finally, the 16 best features were selected after LASSO (Figure 5 and Table 4).

3.3. Performance of ML Radiomics Models

Table 5 and Figure 6 display the performance of SVM models in predicting CLNM in PTC. For AUC, the highest AUC was achieved by the T1WI + T2WI + Clinical model (0.764), and the lowest by the T2 model (0.600). For ACC, the T1WI, T1WI + T2WI, and T1WI + T2WI + Clinical models had the highest accuracy (0.714), while the T2 model had the lowest (0.476). For SEN, the T1WI, T1WI + T2WI, T1WI + Clinical, and T1WI + T2WI + Clinical models achieved the highest sensitivity (0.700), and the T2 model had the lowest (0.500). For SPE, the highest specificity was observed in the T1WI, T1WI + T2WI, and T1WI + T2WI + Clinical models (0.727), and the T2 model had the lowest (0.455). For PPV, the T1WI, T1WI + T2WI, and T1 + T2 + Clinical models had the highest values (0.700), and the T2 model the lowest (0.455). Finally, for NPV, the highest was found in the T1, T1 + T2, and T1 + T2 + Clinical models (0.727), and the lowest in the T2 model (0.500). The results of the DCA show that only the T1 + T2 + Clinical model briefly falls below the extreme curve (Treat all), while other models are consistently below both extreme curves (Treat all and Treat none) to varying degrees. This indicates that the T1 + T2 + Clinical model has a higher clinical utility (Figure 7a).
Table 6 and Figure 6 present the performance of the LR models in predicting CLNM in PTC. For AUC, within the test set, the T1 + T2 and T1 + T2 + Clinical models showed the highest AUC (0.791), while the T1 + Clinical model had the lowest (0.664). In terms of ACC, the highest accuracy was recorded by the T1 + T2 and T1 + T2 + Clinical models (0.762), and the lowest by the T1 + Clinical model (0.667). For SEN, the T1 + T2 and T1 + T2 + Clinical models achieved the highest sensitivity (0.800), and the lowest sensitivity was seen in the T1, T2, T1 + Clinical, and T2 + Clinical models (0.700). Regarding SPE, the highest specificity was observed in the T1, T2, T1 + T2, T2 + Clinical, and T1 + T2 + Clinical models (0.727), with the T1 + Clinical model recording the lowest (0.636). For PPV, the highest values were found in the T1 + T2 and T1 + T2 + Clinical models (0.727), and the lowest in the T1 + Clinical model (0.636). Finally, for NPV, the highest was in the T1 + T2 and T1 + T2 + Clinical models (0.800), and the lowest in the T1 + Clinical model (0.700). The results of the DCA indicate that only the T1 + T2 and T1 + T2 + Clinical models did not fall below the extreme curve, whereas the other models did. Moreover, the T1 + T2 + Clinical model performed better than the T1 + T2 model across most threshold levels, suggesting that the T1 + T2 + Clinical model has superior clinical utility (Figure 7b).
Table 7 and Figure 6 show the performance of the RF models in predicting CLNM in PTC. For AUC, within the test set, the highest AUC was recorded by the T1 + T2 and T1 + T2 + Clinical models (0.836), while the lowest AUC was observed in the T2 model (0.600). In terms of ACC, the highest accuracy was achieved by the T1 + T2 model (0.857), and the lowest accuracy was seen in the T1 and T2 + Clinical models (0.571). For SEN, the highest sensitivity was shown by the T1 + T2 model (0.900), and the lowest sensitivity by the T2 + Clinical model (0.500). Regarding SPE, the T1 + T2 and T1 + T2 + Clinical models had the highest specificity (0.818), while the T1 model recorded the lowest (0.545). For PPV, the highest PPV was found in the T1 + T2 model (0.818), and the lowest in the T1 model (0.546). Finally, for NPV, the highest NPV was in the T1 + T2 model (0.900), and the lowest in the T2 + Clinical model (0.583). DCA analysis revealed that only the T1WI + T2WI and T1WI + T2WI + Clinical models did not fall below the extreme curves. Additionally, the T1WI + T2WI + Clinical model consistently outperformed the T1WI + T2WI model across most threshold levels, indicating its higher clinical utility (Figure 7c).
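The comparison against the "Treat all" and "Treat none" curves can be made concrete with a minimal sketch of the standard net benefit calculation that underlies DCA. At a threshold probability pt, net benefit is TP/n − (FP/n) × pt/(1 − pt); "Treat none" has a net benefit of zero at every threshold. The labels and predicted risks below are hypothetical and are not taken from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # relative harm of one false positive
    return tp / n - (fp / n) * weight

def net_benefit_treat_all(labels, threshold):
    """Reference curve: every patient is treated regardless of the model."""
    return net_benefit(labels, [1.0] * len(labels), threshold)

# Hypothetical labels (1 = CLNM positive) and predicted risks.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
for t in (0.25, 0.5, 0.75):
    # model curve, "Treat all" curve, "Treat none" curve
    print(t, net_benefit(labels, probs, t), net_benefit_treat_all(labels, t), 0.0)
```

A model is clinically useful at a given threshold only if its net benefit exceeds both reference values, which is the sense in which a model's curve "falls below the extreme curves" in the results above.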

3.4. Performance of DL Models

Table 8 and Figure 6 display the performance of DL models in predicting CLNM in PTC. For AUC, the highest AUC was achieved by the fusion model (0.891), while the lowest was recorded by the T1WI model (0.718). For ACC, the highest accuracy was observed in the T1WI + T2WI and fusion models (0.857), with the lowest accuracy found in the T1WI model (0.714). For SEN, the highest sensitivity was shown in the T1WI + T2WI model (0.900), and the lowest was in the T1WI and T2WI models (0.700). For SPE, the highest specificity was recorded by the fusion model (0.909), and the lowest was in the T1WI and T1WI + Clinical models (0.727). For PPV, the highest PPV was achieved by the fusion model (0.889), and the lowest was in the T1 model (0.700). Finally, for NPV, the highest NPV was in the T1 + T2 model (0.900), and the lowest was in the T1 model (0.727). DCA results demonstrate that only the fusion model did not fall below the extreme curve, suggesting that the integrated model holds greater clinical utility (Figure 7d).

4. Discussion

In this study, we developed a fusion model integrating T1WI and T2WI images with clinical pathological data using the proposed DL architecture, AMMCNet, to effectively predict CLNM in patients with PTC. This model demonstrated superior predictive performance compared to traditional ML models, indicating the potential advantages of DL-based models in this context. Beyond its technical contributions, the proposed model has significant clinical implications. By providing a more accurate preoperative prediction of CLNM, it has the potential to reduce overtreatment in PTC patients. Specifically, it can guide more precise surgical decision making, such as avoiding unnecessary pCLND, which is associated with higher risks of complications. This highlights the model’s potential to improve patient outcomes and optimize resource utilization in clinical practice.
Univariate analysis revealed significant differences in lesion diameter and ETE between the CLNM-positive and CLNM-negative groups (p < 0.05). Multivariate logistic regression analysis further highlighted lesion diameter (OR = 1.137, p < 0.01) as an independent clinical risk factor related to CLNM in PTC patients. Previous studies have identified lesion diameter, ETE, age, gender, and multifocality as independent risk factors related to CLNM [9]. Our findings only partly align with these results, potentially because of our small sample size. Future studies should consider expanding the sample size to uncover more significant clinical features, allowing for the further refinement and optimization of current models.
The radiomics features in this study included first-order statistics, morphological features, and texture features, including GLSZM, GLRLM, GLDM, NGTDM, and GLCM. Analysis of the T1WI, T2WI, and T1WI + T2WI features revealed that the majority of the best features were texture features, which assess tumor heterogeneity by analyzing the grayscale distribution in images at different scales and directions. Tumor heterogeneity, a fundamental characteristic of malignancies, significantly affects tumor growth, invasiveness, drug response, and prognosis [27]. Assessment of tissue heterogeneity has been a major focus in oncological research, with genomic studies confirming its close association with biological prognosis, posing challenges for treatment strategies. Radiomics features have shown a strong correlation with tumor heterogeneity at the molecular level [28].
The results of this study indicated that the AUC of the RF-T1WI + T2WI model was slightly higher than that of the DL-T1WI + T2WI model (0.863 vs. 0.827). However, DL models generally exhibited higher AUCs than the corresponding ML models. Recent reviews comparing the performance of DL models, classic ML models, and multi-domain fusion models in medical research have shown that in 65% of the studies, DL models outperformed ML models, while in 20% they performed worse, and in 15% they showed comparable performance [29]. Our comparative analysis confirmed that DL models are superior to traditional ML methods in predicting CLNM, which is consistent with most prior results. Studies on predicting occult lymph node metastasis in laryngeal squamous cell carcinoma and classifying colorectal cancer lymph node metastasis also support that DL models offer enhanced predictive performance, facilitating a better integration of AI algorithms with medical diagnostics [9,30,31].
For ML models, the SVM-T1WI + T2WI + Clinical model achieved the highest AUC (0.764) among the SVM models, the LR-(T1WI + T2WI, T1WI + T2WI + Clinical) models achieved the highest AUC (0.791) among the LR models, and the RF-(T1WI + T2WI, T1WI + T2WI + Clinical) models achieved the highest AUC (0.836) among the RF models. In the SVM models, the AUC for T1WI + T2WI + Clinical was slightly higher than that for T1WI + T2WI, whereas in the LR and RF models, the AUCs for T1WI + T2WI and T1WI + T2WI + Clinical were similar, which was likely because lesion diameter was the only clinical feature included. Thus, the improvement from the T1WI + T2WI + Clinical model over the T1WI + T2WI model was minimal. Moreover, our study found that the AUC of the RF-T1WI + T2WI + Clinical model was significantly higher than that of the SVM and LR models using the same data, which was consistent with previous findings [32]. This may be due to RF’s ability to build numerous decision trees from random subsets of training data and features, thereby enhancing model robustness. The RF model, considered an ensemble of decision trees, benefits from MRI image data by reducing information loss as the tree depth increases [31]. Overall, MRI images combined with clinical pathology data can serve as effective predictors of CLNM, but selecting the appropriate algorithm is crucial.
The integration of the proposed model into clinical practice requires addressing several practical challenges. Key barriers include regulatory compliance, training requirements for radiologists to effectively interpret model outputs, and the cultural acceptance of AI-driven tools within the medical community. To overcome these challenges, multi-center validation studies and targeted training programs are essential to build trust and facilitate adoption. These steps will enhance the model’s clinical applicability and ensure its effective integration into routine practice.
Despite the meticulous design and implementation of this study, there are some limitations. First, the small sample size, especially in the test set, may limit the generalizability of the predictive models. Secondly, all data in this study came from a single center, lacking external validation. Additionally, for multifocal PTC, we only included the largest lesion and extracted features from the primary tumor, ignoring information from lymph nodes, which may affect predictive performance. Finally, due to MRI resolution limitations, lesions smaller than 5 mm in diameter were excluded, potentially limiting the comprehensiveness of the conclusions.

5. Conclusions

We developed a fusion model based on the AMMCNet architecture that integrates MRI images and clinicopathological data to predict CLNM in PTC patients. The model significantly outperforms the ML models and has the potential to help clinicians make more accurate treatment decisions and avoid overtreatment in PTC patients.

Author Contributions

Conceptualization, X.W. and S.H.; methodology, X.W.; software, X.W., X.Y., H.F. and J.F.; investigation, X.W.; resources, H.Z.; data curation, X.W. and H.Z.; writing—original draft preparation, X.W.; writing—review and editing, P.W. and Y.N.; supervision, S.H.; project administration, S.H.; funding acquisition, S.H. and H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Wuxi Health Committee, grant numbers z202204 and HB2023041.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of the Affiliated Hospital of Jiangnan University (Approval Number: LS2020066).

Informed Consent Statement

Informed consent was waived because of the retrospective nature of the study.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

Author Puyeh Wu was employed by the company GE Healthcare (China). The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

CLNM, central lymph node metastasis; CNN, convolutional neural network; CT, computed tomography; DL, deep learning; ETE, extrathyroidal extension; FNAB, fine needle aspiration biopsy; ML, machine learning; MLP, multilayer perceptron; MRI, magnetic resonance imaging; PET, positron emission tomography; PTC, papillary thyroid cancer; T1WI, T1-weighted imaging; T2WI, T2-weighted imaging; TC, thyroid cancer; US, ultrasound.

References

  1. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 2021, 71, 209–249.
  2. Feng, J.W.; Liu, S.Q.; Qi, G.F.; Ye, J.; Hong, L.Z.; Wu, W.X.; Jiang, Y. Development and Validation of Clinical-Radiomics Nomogram for Preoperative Prediction of Central Lymph Node Metastasis in Papillary Thyroid Carcinoma. Acad. Radiol. 2024, 31, 2292–2305.
  3. Grimm, D. Cell and Molecular Biology of Thyroid Disorders 2.0. Int. J. Mol. Sci. 2021, 22, 1990.
  4. Xue, T.; Liu, C.; Liu, J.J.; Hao, Y.H.; Shi, Y.P.; Zhang, X.X.; Zhang, Y.J.; Zhao, Y.F.; Liu, L.P. Analysis of the Relevance of the Ultrasonographic Features of Papillary Thyroid Carcinoma and Cervical Lymph Node Metastasis on Conventional and Contrast-Enhanced Ultrasonography. Front. Oncol. 2021, 11, 794399.
  5. Wang, Z.; Qu, L.; Chen, Q.; Zhou, Y.; Duan, H.; Li, B.; Weng, Y.; Su, J.; Yi, W. Deep learning-based multifeature integration robustly predicts central lymph node metastasis in papillary thyroid cancer. BMC Cancer 2023, 23, 128.
  6. Zhou, Y.; Su, G.Y.; Hu, H.; Tao, X.W.; Ge, Y.Q.; Si, Y.; Shen, M.P.; Xu, X.Q.; Wu, F.Y. Radiomics from Primary Tumor on Dual-Energy CT Derived Iodine Maps can Predict Cervical Lymph Node Metastasis in Papillary Thyroid Cancer. Acad. Radiol. 2022, 29 (Suppl. S3), S222–S231.
  7. Zhao, H.; Huang, T.; Li, H. Risk factors for skip metastasis and lateral lymph node metastasis of papillary thyroid cancer. Surgery 2019, 166, 55–60.
  8. Haugen, B.R.; Alexander, E.K.; Bible, K.C.; Doherty, G.M.; Mandel, S.J.; Nikiforov, Y.E.; Pacini, F.; Randolph, G.W.; Sawka, A.M.; Schlumberger, M.; et al. 2015 American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer: The American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer. Thyroid 2016, 26, 1–133.
  9. Gao, Y.; Wang, W.; Yang, Y.; Xu, Z.; Lin, Y.; Lang, T.; Lei, S.; Xiao, Y.; Yang, W.; Huang, W.; et al. An integrated model incorporating deep learning, hand-crafted radiomics and clinical and US features to diagnose central lymph node metastasis in patients with papillary thyroid cancer. BMC Cancer 2024, 24, 69.
  10. Kim, E.; Park, J.S.; Son, K.R.; Kim, J.H.; Jeon, S.J.; Na, D.G. Preoperative diagnosis of cervical metastatic lymph nodes in papillary thyroid carcinoma: Comparison of ultrasound, computed tomography, and combined ultrasound with computed tomography. Thyroid 2008, 18, 411–418.
  11. Mulla, M.; Schulte, K.-M. Central cervical lymph node metastases in papillary thyroid cancer: A systematic review of imaging-guided and prophylactic removal of the central compartment. Clin. Endocrinol. 2012, 76, 131–136.
  12. Giordano, D.; Valcavi, R.; Thompson, G.B.; Pedroni, C.; Renna, L.; Gradoni, P.; Barbieri, V. Complications of central neck dissection in patients with papillary thyroid carcinoma: Results of a study on 1087 patients and review of the literature. Thyroid 2012, 22, 911–917.
  13. Lambin, P.; Rios-Velazquez, E.; Leijenaar, R.; Carvalho, S.; van Stiphout, R.G.P.M.; Granton, P.; Zegers, C.M.L.; Gillies, R.; Boellard, R.; Dekker, A.; et al. Radiomics: Extracting more information from medical images using advanced feature analysis. Eur. J. Cancer 2012, 48, 441–446.
  14. Mazurowski, M.A.; Buda, M.; Saha, A.; Bashir, M.R. Deep learning in radiology: An overview of the concepts and a survey of the state of the art with focus on MRI. J. Magn. Reson. Imaging 2019, 49, 939–954.
  15. Zhao, C.-K.; Ren, T.-T.; Yin, Y.-F.; Shi, H.; Wang, H.-X.; Zhou, B.-Y.; Wang, X.-R.; Li, X.; Zhang, Y.-F.; Liu, C.; et al. A Comparative Analysis of Two Machine Learning-Based Diagnostic Patterns with Thyroid Imaging Reporting and Data System for Thyroid Nodules: Diagnostic Performance and Unnecessary Biopsy Rate. Thyroid 2021, 31, 470–481.
  16. Zhang, B.; Tian, J.; Pei, S.; Chen, Y.; He, X.; Dong, Y.; Zhang, L.; Mo, X.; Huang, W.; Cong, S.; et al. Machine Learning-Assisted System for Thyroid Nodule Diagnosis. Thyroid 2019, 29, 858–867.
  17. Wang, X.; Agyekum, E.A.; Ren, Y.; Zhang, J.; Zhang, Q.; Sun, H.; Zhang, G.; Xu, F.; Bo, X.; Lv, W.; et al. A Radiomic Nomogram for the Ultrasound-Based Evaluation of Extrathyroidal Extension in Papillary Thyroid Carcinoma. Front. Oncol. 2021, 11, 625646.
  18. Kwon, M.R.; Shin, J.H.; Park, H.; Cho, H.; Hahn, S.Y.; Park, K.W. Radiomics Study of Thyroid Ultrasound for Predicting BRAF Mutation in Papillary Thyroid Carcinoma: Preliminary Results. Am. J. Neuroradiol. 2020, 41, 700–705.
  19. Wu, G.-G.; Lv, W.-Z.; Yin, R.; Xu, J.-W.; Yan, Y.-J.; Chen, R.-X.; Wang, J.-Y.; Zhang, B.; Cui, X.-W.; Dietrich, C.F. Deep Learning Based on ACR TI-RADS Can Improve the Differential Diagnosis of Thyroid Nodules. Front. Oncol. 2021, 11, 575166.
  20. Wang, C.; Yu, P.; Zhang, H.; Han, X.; Song, Z.; Zheng, G.; Wang, G.; Zheng, H.; Mao, N.; Song, X. Artificial intelligence-based prediction of cervical lymph node metastasis in papillary thyroid cancer with CT. Eur. Radiol. 2023, 33, 6828–6840.
  21. Li, J.; Wu, X.; Mao, N.; Zheng, G.; Zhang, H.; Mou, Y.; Jia, C.; Mi, J.; Song, X. Computed Tomography-Based Radiomics Model to Predict Central Cervical Lymph Node Metastases in Papillary Thyroid Carcinoma: A Multicenter Study. Front. Endocrinol. 2021, 12, 741698.
  22. Hu, W.; Wang, H.; Wei, R.; Wang, L.; Dai, Z.; Duan, S.; Ge, Y.; Wu, P.-Y.; Song, B. MRI-based radiomics analysis to predict preoperative lymph node metastasis in papillary thyroid carcinoma. Gland. Surg. 2020, 9, 1214–1226.
  23. Zhang, H.; Hu, S.; Wang, X.; He, J.; Liu, W.; Yu, C.; Sun, Z.; Ge, Y.; Duan, S. Prediction of Cervical Lymph Node Metastasis Using MRI Radiomics Approach in Papillary Thyroid Carcinoma: A Feasibility Study. Technol. Cancer Res. Treat. 2020, 19, 1533033820969451.
  24. Park, V.Y.; Han, K.; Seong, Y.K.; Park, M.H.; Kim, E.-K.; Moon, H.J.; Yoon, J.H.; Kwak, J.Y. Diagnosis of Thyroid Nodules: Performance of a Deep Learning Convolutional Neural Network Model vs. Radiologists. Sci. Rep. 2019, 9, 17843.
  25. Buda, M.; Wildman-Tobriner, B.; Hoang, J.K.; Thayer, D.; Tessler, F.N.; Middleton, W.D.; Mazurowski, M.A. Management of Thyroid Nodules Seen on US Images: Deep Learning May Match Performance of Radiologists. Radiology 2019, 292, 695–701.
  26. van Griethuysen, J.J.M.; Fedorov, A.; Parmar, C.; Hosny, A.; Aucoin, N.; Narayan, V.; Beets-Tan, R.G.H.; Fillion-Robin, J.-C.; Pieper, S.; Aerts, H.J.W.L. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res. 2017, 77, e104–e107.
  27. O’Connor, J.P.B.; Rose, C.J.; Waterton, J.C.; Carano, R.A.D.; Parker, G.J.M.; Jackson, A. Imaging intratumor heterogeneity: Role in therapy response, resistance, and clinical outcome. Clin. Cancer Res. 2015, 21, 249–257.
  28. Moon, S.H.; Kim, J.; Joung, J.-G.; Cha, H.; Park, W.-Y.; Ahn, J.S.; Ahn, M.-J.; Park, K.; Choi, J.Y.; Lee, K.-H.; et al. Correlations between metabolic texture features, genetic heterogeneity, and mutation burden in patients with lung cancer. Eur. J. Nucl. Med. Mol. Imaging 2019, 46, 446–454.
  29. Demircioğlu, A. Are deep models in radiomics performing better than generic models? A systematic review. Eur. Radiol. Exp. 2023, 7, 11.
  30. Wang, W.; Liang, H.; Zhang, Z.; Xu, C.; Wei, D.; Li, W.; Qian, Y.; Zhang, L.; Liu, J.; Lei, D. Comparing three-dimensional and two-dimensional deep-learning, radiomics, and fusion models for predicting occult lymph node metastasis in laryngeal squamous cell carcinoma based on CT imaging: A multicentre, retrospective, diagnostic study. eClinicalMedicine 2024, 67, 102385.
  31. Li, J.; Wang, P.; Zhou, Y.; Liang, H.; Luan, K. Different Machine Learning and Deep Learning Methods for the Classification of Colorectal Cancer Lymph Node Metastasis Images. Front. Bioeng. Biotechnol. 2020, 8, 620257.
  32. Huang, X.; Zhang, Y.; He, D.; Lai, L.; Chen, J.; Zhang, T.; Mao, H. Machine Learning-Based Shear Wave Elastography Elastic Index (SWEEI) in Predicting Cervical Lymph Node Metastasis of Papillary Thyroid Microcarcinoma: A Comparative Analysis of Five Practical Prediction Models. Cancer Manag. Res. 2022, 14, 2847–2858.
Figure 1. Inclusion and exclusion flowchart.
Figure 2. Segmentation of the ROI in axial T1 and T2 images. The red arrows in the MRI images indicate the location of the primary lesion.
Figure 3. Distribution of LASSO coefficients for T1 features, T2 features, and combined T1 + T2 features. (a,b) represent T1 features, (c,d) represent T2 features and (e,f) represent the combined T1 + T2 features.
Figure 4. Deep learning model workflow.
Figure 5. Histograms of best feature coefficients for T1, T2, and T1 + T2. (a) The best feature coefficients for T1; (b) the best feature coefficients for T2; (c) the best feature coefficients for T1 + T2.
Figure 6. ROC curves of the ML and DL models on the training and test sets. (a,b) SVM models on the training and test sets; (c,d) LR models on the training and test sets; (e,f) RF models on the training and test sets; (g,h) DL models on the training and test sets.
Figure 7. DCA curves of the ML and DL models on the test set. (a) SVM models; (b) LR models; (c) RF models; (d) DL models.
Table 1. Baseline characteristics of the CLNM-positive and CLNM-negative groups.

| Characteristics | CLNM (+) (n = 50) | CLNM (−) (n = 55) | p Value |
| --- | --- | --- | --- |
| Age, Mean ± SD | 44.62 ± 11.68 | 47.62 ± 11.57 | 0.190 |
| Diameter, M (Q1, Q3) | 1.15 (0.80, 1.65) | 1.00 (0.60, 1.50) | 0.037 * |
| Gender, n (%) | | | 0.105 |
| Male | 17 (34.00) | 11 (20.00) | |
| Female | 33 (66.00) | 44 (80.00) | |
| ETE, n (%) | | | 0.044 * |
| Yes | 28 (56.00) | 20 (36.36) | |
| No | 22 (44.00) | 35 (63.64) | |
| Multifocal, n (%) | | | 0.163 |
| Yes | 17 (34.00) | 12 (21.82) | |
| No | 33 (66.00) | 43 (78.18) | |
| Bilateral, n (%) | | | 0.150 |
| Yes | 14 (28.00) | 9 (16.36) | |
| No | 36 (72.00) | 46 (83.64) | |
| Calcification, n (%) | | | 0.679 |
| Yes | 1 (2.00) | 3 (5.45) | |
| No | 49 (98.00) | 52 (94.55) | |
| Benign lesions, n (%) | | | 0.654 |
| Yes | 26 (52.00) | 31 (56.36) | |
| No | 24 (48.00) | 24 (43.64) | |

PTC: papillary thyroid cancer; CLNM: central lymph node metastasis; ETE: extrathyroidal extension; *: p < 0.05.
Table 2. Univariate analysis and multivariate logistic regression analysis.

| Variable | OR (95% CI) | p Value |
| --- | --- | --- |
| Age | 0.937 (0.864–1.017) | 0.190 |
| Diameter | 1.137 (1.050–1.231) | 0.008 * |
| Gender | 1.083 (0.998–1.174) | 0.107 |
| ETE | 1.104 (1.018–1.196) | 0.044 * |
| Multifocality | 1.071 (0.987–1.162) | 0.166 |
| Bilateral | 1.073 (0.980–1.164) | 0.153 |
| Calcification | 0.956 (0.881–1.038) | 0.361 |
| Benign lesion | 0.978 (0.881–1.038) | 0.658 |

ETE: extrathyroidal extension; CI: confidence interval; OR: odds ratio; *: p < 0.05.
Table 3. Distribution of baseline characteristics between the training and test sets.

| Characteristics | Training Cohort (n = 84) | Test Cohort (n = 21) | p Value |
| --- | --- | --- | --- |
| Age, Mean ± SD | 46.38 ± 11.91 | 45.43 ± 10.84 | 0.740 |
| Diameter, M (Q1, Q3) | 1.10 (0.70–1.50) | 0.80 (0.60–1.80) | 0.782 |
| CLNM, n (%) | | | 0.329 |
| Positive | 42 (50.00) | 8 (38.10) | |
| Negative | 42 (50.00) | 13 (61.90) | |
| Gender, n (%) | | | 0.741 |
| Male | 23 (27.38) | 5 (23.81) | |
| Female | 61 (72.62) | 16 (76.19) | |
| ETE, n (%) | | | 0.493 |
| Yes | 37 (44.05) | 11 (52.38) | |
| No | 47 (55.95) | 10 (47.62) | |
| Multifocal, n (%) | | | 0.913 |
| Yes | 23 (27.38) | 6 (28.57) | |
| No | 61 (72.62) | 15 (71.43) | |
| Bilateral, n (%) | | | 1.000 |
| Yes | 18 (21.43) | 5 (23.81) | |
| No | 66 (78.57) | 16 (76.19) | |
| Calcification, n (%) | | | 0.581 |
| Yes | 4 (4.76) | 0 (0.00) | |
| No | 80 (95.24) | 21 (100.00) | |
| Benign lesions, n (%) | | | 0.433 |
| Yes | 44 (52.38) | 13 (61.90) | |
| No | 40 (47.62) | 8 (38.10) | |

CLNM: central lymph node metastasis; ETE: extrathyroidal extension.
Table 4. Best radiomic features of T1, T2 and T1 + T2.

| Sequence | Feature Name |
| --- | --- |
| Best T1 features | original_gldm_DependenceNonUniformityNormalized |
| | log-sigma-2-0-mm-3D_firstorder_Skewness |
| | log-sigma-2-0-mm-3D_glcm_ClusterShade |
| | log-sigma-2-0-mm-3D_gldm_LargeDependenceLowGrayLevelEmphasis |
| | log-sigma-4-0-mm-3D_firstorder_10Percentile |
| | log-sigma-4-0-mm-3D_glszm_SmallAreaEmphasis |
| | log-sigma-5-0-mm-3D_glcm_Idn |
| | wavelet-LH_glszm_GrayLevelNonUniformity |
| | wavelet-HH_firstorder_Skewness |
| | wavelet-HH_glcm_ClusterShade |
| Best T2 features | log-sigma-2-0-mm-3D_glcm_Imc1 |
| | log-sigma-3-0-mm-3D_gldm_DependenceNonUniformityNormalized |
| | log-sigma-4-0-mm-3D_gldm_GrayLevelNonUniformity |
| | log-sigma-5-0-mm-3D_glrlm_RunVariance |
| | log-sigma-5-0-mm-3D_ngtdm_Contrast |
| | wavelet-LH_glcm_Imc1 |
| | wavelet-HH_glcm_Imc1 |
| | wavelet-HH_glcm_Imc2 |
| Best T1 + T2 features | log-sigma-2-0-mm-3D_glcm_ClusterShade |
| | wavelet-HH_glcm_ClusterShade |
| | log-sigma-2-0-mm-3D_firstorder_Skewness |
| | log-sigma-4-0-mm-3D_firstorder_10Percentile |
| | log-sigma-5-0-mm-3D_glrlm_RunVariance |
| | log-sigma-2-0-mm-3D_gldm_LowGrayLevelEmphasis |
| | wavelet-HH_ngtdm_Busyness |
| | log-sigma-3-0-mm-3D_gldm_DependenceNonUniformityNormalized |
| | log-sigma-2-0-mm-3D_glcm_Imc1 |
| | log-sigma-4-0-mm-3D_glrlm_RunVariance |
| | log-sigma-5-0-mm-3D_ngtdm_Contrast |
| | wavelet-LH_glcm_Imc1 |
| | wavelet-HH_glcm_MCC |
| | log-sigma-4-0-mm-3D_gldm_DependenceNonUniformityNormalized |
| | original_shape_Flatness |
| | log-sigma-5-0-mm-3D_glszm_ZoneVariance |
Table 5. Performance of SVM models on training and test sets.

| SVM Models | Set | AUC (95% CI) | ACC | SEN | SPE | PPV | NPV |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | Training | 0.716 (0.599–0.821) | 0.655 | 0.425 | 0.864 | 0.739 | 0.623 |
| T2 | Training | 0.624 (0.505–0.744) | 0.571 | 0.825 | 0.341 | 0.532 | 0.682 |
| T1 + T2 | Training | 0.777 (0.663–0.876) | 0.679 | 0.800 | 0.568 | 0.628 | 0.758 |
| T1 + Clinical | Training | 0.665 (0.553–0.779) | 0.631 | 0.425 | 0.818 | 0.680 | 0.610 |
| T2 + Clinical | Training | 0.610 (0.494–0.730) | 0.607 | 0.875 | 0.364 | 0.556 | 0.762 |
| T1 + T2 + Clinical | Training | 0.770 (0.664–0.868) | 0.690 | 0.775 | 0.614 | 0.646 | 0.750 |
| T1 | Test | 0.664 (0.398–0.900) | 0.714 | 0.700 | 0.727 | 0.700 | 0.727 |
| T2 | Test | 0.600 (0.324–0.852) | 0.476 | 0.500 | 0.455 | 0.455 | 0.500 |
| T1 + T2 | Test | 0.727 (0.479–0.935) | 0.714 | 0.700 | 0.727 | 0.700 | 0.727 |
| T1 + Clinical | Test | 0.664 (0.418–0.885) | 0.667 | 0.700 | 0.636 | 0.636 | 0.700 |
| T2 + Clinical | Test | 0.655 (0.391–0.889) | 0.571 | 0.600 | 0.545 | 0.546 | 0.600 |
| T1 + T2 + Clinical | Test | 0.764 (0.510–0.971) | 0.714 | 0.700 | 0.727 | 0.700 | 0.727 |

CI: confidence interval; AUC: area under curve; ACC: accuracy; SEN: sensitivity; SPE: specificity; PPV: positive predictive value; NPV: negative predictive value.
Table 6. Performance of LR models on training and test sets.

| LR Models | Set | AUC (95% CI) | ACC | SEN | SPE | PPV | NPV |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | Training | 0.820 (0.724–0.905) | 0.762 | 0.675 | 0.841 | 0.794 | 0.740 |
| T2 | Training | 0.725 (0.609–0.829) | 0.667 | 0.700 | 0.636 | 0.636 | 0.700 |
| T1 + T2 | Training | 0.799 (0.707–0.886) | 0.655 | 0.900 | 0.432 | 0.590 | 0.826 |
| T1 + Clinical | Training | 0.816 (0.704–0.905) | 0.738 | 0.675 | 0.795 | 0.750 | 0.729 |
| T2 + Clinical | Training | 0.730 (0.617–0.831) | 0.631 | 0.725 | 0.545 | 0.592 | 0.686 |
| T1 + T2 + Clinical | Training | 0.805 (0.704–0.892) | 0.690 | 0.900 | 0.500 | 0.621 | 0.846 |
| T1 | Test | 0.709 (0.472–0.926) | 0.714 | 0.700 | 0.727 | 0.700 | 0.727 |
| T2 | Test | 0.718 (0.469–0.945) | 0.714 | 0.700 | 0.727 | 0.700 | 0.727 |
| T1 + T2 | Test | 0.791 (0.548–0.963) | 0.762 | 0.800 | 0.727 | 0.727 | 0.800 |
| T1 + Clinical | Test | 0.664 (0.417–0.904) | 0.667 | 0.700 | 0.636 | 0.636 | 0.700 |
| T2 + Clinical | Test | 0.727 (0.455–0.959) | 0.714 | 0.700 | 0.727 | 0.700 | 0.727 |
| T1 + T2 + Clinical | Test | 0.791 (0.577–0.962) | 0.762 | 0.800 | 0.727 | 0.727 | 0.800 |

CI: confidence interval; AUC: area under curve; ACC: accuracy; SEN: sensitivity; SPE: specificity; PPV: positive predictive value; NPV: negative predictive value.
Table 7. Performance of RF models on training and test sets.

| RF Models | Set | AUC (95% CI) | ACC | SEN | SPE | PPV | NPV |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | Training | 0.995 (0.985–1.000) | 0.952 | 0.950 | 0.955 | 0.950 | 0.955 |
| T2 | Training | 0.972 (0.936–0.997) | 0.917 | 0.950 | 0.886 | 0.884 | 0.951 |
| T1 + T2 | Training | 0.854 (0.773–0.929) | 0.726 | 0.875 | 0.591 | 0.660 | 0.839 |
| T1 + Clinical | Training | 0.994 (0.981–1.000) | 0.952 | 0.925 | 0.977 | 0.974 | 0.935 |
| T2 + Clinical | Training | 0.978 (0.949–0.998) | 0.940 | 0.925 | 0.955 | 0.949 | 0.933 |
| T1 + T2 + Clinical | Training | 0.842 (0.754–0.918) | 0.714 | 0.850 | 0.591 | 0.654 | 0.813 |
| T1 | Test | 0.600 (0.357–0.847) | 0.571 | 0.600 | 0.545 | 0.546 | 0.600 |
| T2 | Test | 0.600 (0.333–0.866) | 0.667 | 0.700 | 0.636 | 0.636 | 0.700 |
| T1 + T2 | Test | 0.836 (0.618–1.000) | 0.857 | 0.900 | 0.818 | 0.818 | 0.900 |
| T1 + Clinical | Test | 0.609 (0.327–0.846) | 0.619 | 0.600 | 0.636 | 0.600 | 0.636 |
| T2 + Clinical | Test | 0.673 (0.422–0.900) | 0.571 | 0.500 | 0.636 | 0.556 | 0.583 |
| T1 + T2 + Clinical | Test | 0.836 (0.611–1.000) | 0.810 | 0.800 | 0.818 | 0.800 | 0.818 |

CI: confidence interval; AUC: area under curve; ACC: accuracy; SEN: sensitivity; SPE: specificity; PPV: positive predictive value; NPV: negative predictive value.
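The threshold-based metrics defined in the footnote above (ACC, SEN, SPE, PPV, NPV) all follow from a 2 × 2 confusion matrix. As a hedged worked example, the counts below (9 true positives, 1 false negative, 9 true negatives, 2 false positives; 21 cases in total) are an assumption: they were chosen because they reproduce the rounded values in the RF T1 + T2 test row, but the article does not report the underlying counts. The function name `diagnostic_metrics` is likewise hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Threshold-based diagnostic metrics from confusion-matrix counts."""
    return {
        "ACC": (tp + tn) / (tp + fn + tn + fp),  # accuracy
        "SEN": tp / (tp + fn),                   # sensitivity (true positive rate)
        "SPE": tn / (tn + fp),                   # specificity (true negative rate)
        "PPV": tp / (tp + fp),                   # positive predictive value
        "NPV": tn / (tn + fn),                   # negative predictive value
    }

# Assumed counts, back-calculated to match the reported rates (not from the article).
metrics = diagnostic_metrics(tp=9, fn=1, tn=9, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# ACC: 0.857
# SEN: 0.900
# SPE: 0.818
# PPV: 0.818
# NPV: 0.900
```

With about ten cases per class, a single reclassified case moves SEN or SPE by roughly 0.1, which helps explain why many test-set rows share identical values.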
Table 8. Performance of DL models on training and test sets.

| DL Models | Set | AUC (95% CI) | ACC | SEN | SPE | PPV | NPV |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | Training | 0.895 (0.826–0.963) | 0.845 | 0.825 | 0.864 | 0.846 | 0.844 |
| T2 | Training | 0.856 (0.767–0.945) | 0.881 | 0.825 | 0.932 | 0.917 | 0.854 |
| T1 + T2 | Training | 0.977 (0.950–1.000) | 0.940 | 0.975 | 0.909 | 0.907 | 0.976 |
| T1 + Clinical | Training | 0.816 (0.723–0.910) | 0.786 | 0.875 | 0.705 | 0.729 | 0.861 |
| T2 + Clinical | Training | 0.957 (0.920–0.994) | 0.881 | 0.900 | 0.864 | 0.857 | 0.905 |
| Fusion | Training | 0.980 (0.956–1.000) | 0.940 | 0.950 | 0.932 | 0.927 | 0.953 |
| T1 | Test | 0.718 (0.490–0.946) | 0.714 | 0.700 | 0.727 | 0.700 | 0.727 |
| T2 | Test | 0.745 (0.514–0.977) | 0.762 | 0.700 | 0.818 | 0.778 | 0.750 |
| T1 + T2 | Test | 0.827 (0.613–1.000) | 0.857 | 0.900 | 0.818 | 0.818 | 0.900 |
| T1 + Clinical | Test | 0.745 (0.522–0.969) | 0.762 | 0.800 | 0.727 | 0.727 | 0.800 |
| T2 + Clinical | Test | 0.800 (0.593–1.000) | 0.810 | 0.800 | 0.818 | 0.800 | 0.818 |
| Fusion | Test | 0.891 (0.745–1.000) | 0.857 | 0.800 | 0.909 | 0.889 | 0.833 |

CI: confidence interval; AUC: area under curve; ACC: accuracy; SEN: sensitivity; SPE: specificity; PPV: positive predictive value; NPV: negative predictive value.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, X.; Zhang, H.; Fan, H.; Yang, X.; Fan, J.; Wu, P.; Ni, Y.; Hu, S. Multimodal MRI Deep Learning for Predicting Central Lymph Node Metastasis in Papillary Thyroid Cancer. Cancers 2024, 16, 4042. https://doi.org/10.3390/cancers16234042
