Abstract
Vestibular schwannomas (VS) are the most common tumor of the skull base with available treatment options that carry a risk of iatrogenic injury to the facial nerve, which can significantly impact patients’ quality of life. As facial nerve outcomes remain challenging to prognosticate, we endeavored to utilize machine learning to decipher predictive factors relevant to facial nerve outcomes following microsurgical resection of VS. A database of patient-, tumor- and surgery-specific features was constructed via retrospective chart review of 242 consecutive patients who underwent microsurgical resection of VS over a 7-year study period. This database was then used to train non-linear supervised machine learning classifiers to predict facial nerve preservation, defined as House-Brackmann (HB) I vs. facial nerve injury, defined as HB II–VI, as determined at 6-month outpatient follow-up. A random forest algorithm demonstrated 90.5% accuracy, 90% sensitivity and 90% specificity in facial nerve injury prognostication. A random variable (rv) was generated by randomly sampling a Gaussian distribution and used as a benchmark to compare the predictiveness of other features. This analysis revealed age, body mass index (BMI), case length and the tumor dimension representing tumor growth towards the brainstem as prognosticators of facial nerve injury. When validated via prospective assessment of facial nerve injury risk, this model demonstrated 84% accuracy. Here, we describe the development of a machine learning algorithm to predict the likelihood of facial nerve injury following microsurgical resection of VS. In addition to serving as a clinically applicable tool, this highlights the potential of machine learning to reveal non-linear relationships between variables which may have clinical value in prognostication of outcomes for high-risk surgical procedures.
Introduction
Vestibular schwannomas (VS; formerly acoustic neuroma) are the most common tumor of the skull base with nearly 2500 new cases diagnosed in the US each year1 and accounting for 8% of all intracranial tumors2. VS continue to present a clinical conundrum in that their benign pathology affords a slow growth pattern with low likelihood of metastasis; however, local compression of cranial nerve VIII (the vestibulocochlear nerve) and the brainstem can result in hearing loss, dizziness, vertigo and in the worst cases, sudden death3. Furthermore, microsurgical resection carries an inherent risk of iatrogenic injury to each of these structures with similar clinical consequences. Microsurgery carries an additional risk of injury to the nearby cranial nerve VII (facial nerve) which may result in significant morbidity and impairment in quality of life. As such, the likelihood of damage to each of these structures from treatment must be weighed against the likelihood of developing complications from the natural history of tumor progression.
Historically, treatment decisions have been based on tumor size and growth patterns over time. Treatment is often considered when serial growth is observed on interval imaging, or patients develop neurological symptoms that correlate with tumor compression. Microsurgery remains the mainstay of treatment for large VS4. For lesions > 2.5 cm, vestibulocochlear nerve compression beyond salvageability is often encountered and patients are preemptively counseled that hearing preservation is unlikely5. However, facial nerve preservation remains an important goal of surgery. While larger tumors are generally associated with more difficult facial nerve dissection, few other factors that portend a higher risk of facial nerve dysfunction have been identified, and thus prognostication at the individual patient level remains relatively poor6. Even in situations where anatomic preservation of the nerve is achieved, stretching or other trauma from difficult dissection can lead to post-operative facial weakness. While machine learning algorithms have recently been developed in the domain of VS to aid in decisions regarding timing of treatment7 and likelihood of hearing preservation8, no study has yet applied this technology to discern the likelihood of facial nerve dysfunction, nor to develop a deeper understanding of the clinically relevant factors which may contribute to poorer facial nerve outcomes. We leveraged emerging machine learning approaches combined with the VS experience at two high-volume VS centers to develop an algorithm for prediction of facial nerve dysfunction in patients undergoing microsurgical resection of VS, based on patient, surgery, and tumor characteristics.
Methods
Database collection
This study was reviewed by the University of Pennsylvania Institutional Review Board, who determined that it met criteria for exemption from full ethical approval and subject informed consent. All methods were carried out in accordance with relevant guidelines and regulations. Data from the University of Pittsburgh were shared in accordance with executed Data Usage Agreements between the University of Pittsburgh and the University of Pennsylvania. We conducted a retrospective chart review of patients who had undergone retrosigmoid or translabyrinthine craniotomies for VS over a seven-year study period from 2014 to 2021 at three hospitals: the Hospital of the University of Pennsylvania, Pennsylvania Hospital, and the University of Pittsburgh Medical Center Presbyterian Hospital. We excluded patients who underwent retrosigmoid or translabyrinthine craniotomies for other pathologies, including lower cranial nerve schwannomas, meningiomas, chordomas, epidermoid cysts and brain metastasis. Two patients expired prior to the 6-month follow-up period and are thus not represented in this analysis. Data reviewed included patient demographic information, surgical reports, and pre- and post-operative magnetic resonance images.
Statistical methods
The primary outcome assessed was facial nerve function at 6-month follow-up. Facial nerve function was assessed on the basis of physician ratings of facial function, measured using the House-Brackmann (HB) scale at 6-month post-operative follow-up visits. This was represented as a binary outcome variable, post-operative preserved facial nerve function (HB grade I) vs. post-operative facial nerve dysfunction (HB grades II–VI). Independent variables included patient-, tumor- and surgery-related characteristics, as described below.
Measurements of tumor dimensions were made relative to the porus acusticus and posterior petrous bone (Supplementary Fig. 1)9. These dimensions were selected due to their relationships to surgical corridors and in keeping with the goal of reproducibility in replicative efforts. Such measurements have also been shown to correlate well with volumetric analyses10. Measurements were made by two raters, and agreement was assessed by intraclass correlation coefficient (ICC) accounting for 2-way random effects11.
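As a rough illustration of the agreement statistic, the sketch below computes a two-way random-effects, absolute-agreement, single-rater ICC (often written ICC(2,1)) from a subjects-by-raters table. The source states only that a two-way random-effects ICC was used, so the specific ICC form, the function name `icc_2_1`, and the example measurements are assumptions made for illustration.

```python
from statistics import mean


def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-rater ICC.

    `ratings` is a list of rows, one per subject; each row holds one
    measurement per rater. The choice of ICC(2,1) is an assumption.
    """
    n = len(ratings)       # number of subjects (tumors)
    k = len(ratings[0])    # number of raters
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical measurements (mm) from two raters; not study data.
example = [[12.0, 12.5], [20.1, 19.8], [8.4, 8.9], [15.0, 15.2], [25.3, 24.9]]
icc_value = icc_2_1(example)
```

Values close to 1 indicate that most of the variance comes from differences between tumors rather than between raters.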
Normality of continuous variables was assessed using D’Agostino-Pearson’s test12, finding that measurement C and case length were normally distributed, and thus were compared between HB I and HB II–VI groups with independent samples t-test. In contrast, age, BMI, measurements A, B, and D were not normally distributed and thus statistical significance of comparisons between facial nerve outcome groups was assessed using a Mann–Whitney U test. Categorical variables (sex, laterality, tumor size represented as a binary measurement of ≥ 2.5 cm vs. < 2.5 cm greatest tumor dimension, and presence/absence of residual tumor) were evaluated for associations to the outcome using Chi-squared tests. All statistical tests were evaluated at a significance level of alpha = 0.05.
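For the categorical comparisons, a Chi-squared test on a 2 × 2 table can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `chi2_2x2`, the absence of a continuity correction, and the counts in the example are assumptions and do not come from the study.

```python
import math

ALPHA = 0.05  # significance level stated in the text


def chi2_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table (1 d.f.).

    No continuity correction is applied (an assumption of this sketch).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows are two levels of a categorical variable,
# columns are HB I vs. HB II-VI. Not study data.
chi2, p_value = chi2_2x2([[30, 20], [25, 25]])
significant = p_value < ALPHA
```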
Machine learning classifier selection and training
Studies of machine learning proceed through certain regimented stages known as the machine learning lifecycle13. Although variations may exist based on the specific study and goals, in general, the lifecycle starts with data collection and pre-processing before proceeding through gathering of baseline descriptive statistical analysis (described above), classifier selection, model training, hyperparameter tuning, model testing and ultimately deployment with the subsequent collection of additional training examples for validation during deployment re-starting the cycle at data collection (Supplementary Fig. 2).
To guide classifier selection, the data were first visualized by class distribution on each feature axis using a pairplot (Supplementary Fig. 3). This demonstrated two important characteristics of the data: the class imbalance was likely significant enough to influence classifier performance and the data were not linearly separable on any two-dimensional feature axis plane. Given the relatively small size of the dataset, we applied the synthetic minority oversampling technique (SMOTE)14 to overcome class imbalance: this provided new training examples that would be useful in classifier training while equalizing the class distribution. Model training then proceeded with selection of a classifier that was suitable for the classification task while taking into account the constraints of the data. Because the data were not linearly separable, we selected non-linear classifiers, including the random forest, radial basis function (RBF) kernel support vector machine (SVM), and artificial neural network13 (Supplementary Fig. 2). Among these, the random forest classifier was selected for further development due to its superior accuracy in performing the classification task on the training data. The data were split for model training (90%) and subsequent testing (10%). While model tuning was attempted via hyperparameter optimization, the initial random forest model with hyperparameters based on the authors’ prior experience with similar classification tasks and patient datasets demonstrated the highest accuracy.
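The core idea of SMOTE can be sketched in a few lines: each synthetic example is placed at a random point on the segment joining a minority-class sample and one of its nearest minority-class neighbours. The sketch below is a simplified standard-library illustration, not the implementation used in the study; the function name `smote`, the neighbour count `k`, the seed, and the toy data are all assumptions.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create `n_new` synthetic minority samples by interpolation.

    Simplified sketch of SMOTE: pick a minority point, pick one of its
    k nearest minority neighbours, and interpolate between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy two-feature data, illustration only.
majority = [[float(i), float(i % 3)] for i in range(8)]
minority = [[10.0, 10.0], [11.0, 12.0], [12.0, 11.0]]
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

After oversampling, the minority class has as many examples as the majority class, which is the equalized class distribution described above.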
The validation dataset (n = 32 patients) consisted entirely of patients who underwent surgery at the University of Pennsylvania in the final year of the study, as this group had facial nerve outcomes assessed after initial algorithm development and thus were not included in the initial training and testing data sets. The same patient, tumor and surgery characteristics were collected for the 32 validation patients and the random forest algorithm was utilized to make predictions about which patients would have complete facial nerve preservation vs. those who would have any facial nerve dysfunction. Predicted outcomes were recorded and compared to actual 6-month facial nerve outcomes for this group of patients.
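Comparing predicted with observed outcomes reduces to counting the four cells of a confusion matrix. The sketch below shows how accuracy, sensitivity and specificity follow from those counts, assuming facial nerve dysfunction (HB II–VI) is coded as the positive class `1` and preservation (HB I) as `0`; the coding, the function name `binary_metrics`, and the example labels are hypothetical.

```python
def binary_metrics(actual, predicted, positive=1):
    """Accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)

    def ratio(numerator, denominator):
        # Undefined when a class is absent from the data.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }


# Hypothetical labels for ten patients; not study data.
actual = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(actual, predicted)
```

With an imbalanced outcome, accuracy alone can look favourable even when few injuries are detected, which is why sensitivity and specificity are reported alongside it.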
Results
Two-hundred and forty-two consecutive patients were identified who underwent microsurgical resection of VS over the specified time period. Of these, 206 (85%) had preserved facial nerve function (HB I), and 36 (15%) had any facial nerve dysfunction (HB II–VI). Summary statistics and tests of association for underlying differences in patient-, tumor-, and surgery-specific characteristics between outcome groups are shown in Table 1. Among the factors evaluated, none demonstrated a statistically significant difference between the HB I and HB II–VI groups when evaluated on the basis of linear comparisons of measures of centrality (i.e., means and medians). The ICC for tumor measurements was between 90 and 99% for all measurements (Supplementary Fig. 1, Supplementary Table 1).
When visualized in two dimensions, the two outcome classes were not linearly separable along any pair of the acquired feature axes (Supplementary Fig. 3). As such, non-linear supervised machine learning classifiers were tested as described in Methods (see also Supplementary Fig. 2). The random forest classifier performed well with an accuracy of 90.5%15. Given the goal of applying the classifier as a clinical tool, sensitivity and specificity were assessed on the test data, and were found to be 90% and 90%, respectively. The receiver-operating characteristic (ROC) curve is shown in Fig. 1A. A random sampling from a Gaussian distribution was generated as a random variable and used as a baseline to further evaluate which features were relevant in the random forest predictions: the resulting feature importances were computed and plotted (Fig. 1B). Relative to this baseline, the random forest classifier indicated a relatively greater importance of BMI, case length, age, and measurement B, representing the extent of brainstem compression, in facial nerve function prognostication. When tested on the validation data set, the model demonstrated 84% accuracy in predicting facial nerve function at 6 months post-operatively.
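The random-variable benchmark can be sketched as two steps: append a Gaussian noise column to the feature matrix before training, then keep only the features whose importance exceeds that of the noise column. The code below illustrates the bookkeeping only and does not train a model; the function names, the standard normal parameters, and the placeholder importance scores are assumptions, not values from the study.

```python
import random


def add_random_feature(rows, seed=0):
    """Append one Gaussian noise value (the 'rv' column) to every row."""
    rng = random.Random(seed)
    return [row + [rng.gauss(0.0, 1.0)] for row in rows]


def features_above_baseline(importances, baseline="rv"):
    """Names of features scoring higher than the noise column, best first."""
    threshold = importances[baseline]
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked
            if name != baseline and score > threshold]


# Toy feature rows; the noise column would be added before model training.
rows_with_rv = add_random_feature([[61.0, 27.4], [48.0, 31.2], [70.0, 24.9]])

# Placeholder importance scores with generic feature names, illustration only.
placeholder_importances = {
    "feature_a": 0.40,
    "feature_b": 0.10,
    "rv": 0.20,
    "feature_c": 0.30,
}
informative = features_above_baseline(placeholder_importances)
```

A feature that cannot outrank pure noise is unlikely to carry predictive signal for the trained model, which is the rationale for using the random variable as a cut-off.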
Discussion
Facial nerve injury is a morbid complication of treatment for VS, with downstream effects ranging from social stigmata, patient depression and reduced quality of life16,17, to corneal abrasions and ulcers from incomplete eye closure and loss of corneal sensation18. Other than tumor size, relatively little is understood about factors that may influence facial nerve outcomes in microsurgery for VS. The clinical impact of facial nerve injury and importance of facial nerve preservation is highlighted by the extensive literature exploring predictors of facial nerve injury19,20,21,22,23. We leveraged our multi-institutional experience at two centers with high volumes of VS patients and applied machine learning techniques to identify novel predictors of facial nerve injury in patients treated with microsurgery.
Machine learning technologies have recently undergone a resurgence alongside the development of computational tools for handling and storing the large amounts of data required for their meaningful and broad scale utilization13,24. The recognition that such tools can be used to glean novel trends from data that are not readily apparent from common descriptive statistical approaches makes their application within the clinical domain a valuable and ongoing endeavor25. Such a phenomenon can be seen in the present study where tests of association, comparing measures of centrality between outcome groups, did not identify any factors which significantly differed between patients with and without preserved facial function. In contrast, random forest feature importance analysis discerned four features—BMI, case length, age and the tumor dimension representing growth towards the brainstem (measurement B)—as being relevant in predicting 6-month facial nerve status. While further studies must be carried out to fully characterize the mechanistic role of these factors in facial nerve outcome, this demonstrates the utility of applying novel data science techniques to uncover non-linear interactions between variables which may have real-world, clinical relevance.
Tumor dimensions
As previously noted, tumor measurements utilized in our study were selected due to their relationships to surgical corridors, as well as having been shown to correlate well with tumor size by volumetric analysis in previous literature10. We found high ICC for all measurements, which was comparable to other reports in the literature on similar VS measurement tasks26,27. Although historically, an overall larger tumor size has been demonstrated to portend worse facial nerve function after microsurgical resection19,20,28,29,30, results of the present study identified the tumor dimension representing growth within the cerebellopontine angle between the mid-axis of the tumor and the brainstem as most predictive of facial nerve outcome. Our findings are consistent with prior literature, while providing further insight into possible mechanisms by which tumor size may influence facial nerve injury. A relatively larger tumor dimension within the cerebellopontine angle, between the brainstem and porus acusticus is postulated to result in more thinning and splaying of the facial nerve. This causes direct mechanical injury and makes the facial nerve more difficult to distinguish from tumor capsule and surrounding adherent arachnoid, placing the facial nerve at greater risk of iatrogenic injury31. Thus, our study builds on prior literature reporting greater tumor size as a predictor of facial nerve injury following vestibular schwannoma microsurgery, by suggesting that the tumor dimension representing growth within the cerebellopontine angle from the mid-axis of the tumor towards the brainstem has the greatest implication on facial nerve outcome. We did not identify any difference between our facial nerve preservation and facial nerve dysfunction groups when comparing this dimension. 
It is worth noting that we observed a relatively higher rate of Koos grade III and IV tumors compared to other published series, suggesting that this series may be skewed towards larger tumors overall. This may partially explain our inability to decipher a difference between facial nerve preservation and facial nerve injury groups based on tumor size. We anticipate that future studies including larger cohorts of patients might capture a relationship between facial nerve susceptibility to injury as this tumor dimension increases.
Age
Older patient age has been previously shown to be predictive of facial nerve dysfunction, similar to our own findings20,29, though this remains controversial. While some studies have found no significant relationship between post-operative facial nerve function and age32, our study and others have identified a trend towards increasing age influencing unfavorable facial nerve outcomes following vestibular schwannoma microsurgery33. Others reporting on this finding have hypothesized on the influence of frailty, burden of comorbidities, decreased neurologic reserve resulting in reduced facial nerve rehabilitation potential33, and the confounding influence of age itself on facial nerve grading given that skin laxity and thinning may contribute to worse grading and/or worsened manifestations of facial nerve paralysis in elderly patients34. We further hypothesize that the basis of this relationship might be less favorable tissue dissection planes in patients of advanced age, placing older patients at greater risk of iatrogenic facial nerve injury. Although further detailed analysis of the role of age in facial nerve outcome on patients undergoing vestibular schwannoma microsurgery is beyond the scope of the current study, further study would certainly be valuable to confirm and better characterize the nature of this relationship. Our study further demonstrated additional unique features predictive of facial nerve outcomes which have not been previously identified. Our hypotheses regarding the role of BMI and case length are discussed further below.
BMI
Interestingly, our model identified BMI and operative case length as being highly predictive of facial nerve outcome at 6 months post-operatively. To the best of our knowledge, these associations have not been clearly delineated in previous studies. One study examined facial nerve injury in the context of post-operative complications and the need for readmission or re-operation, finding no significant association to BMI35. However, as the authors note, facial nerve injury often occurs without the requirement for reoperation and readmission, thus is likely underrepresented in their analysis. Another study evaluated the influence of BMI on mean HB score pre-operatively (1.1 non-obese vs. 1.0 obese, p = 0.16) and post-operatively (1.9 non-obese vs. 1.7 obese, p = 0.32) finding no difference between obese and non-obese groups36. However, the timing of facial nerve function assessment is not clearly specified in this study and when facial function is modelled as a categorical variable (rather than continuous, summarized with mean HB scores), obese patients were more likely than non-obese patients to have HB scores equal to or greater than III (9.2% non-obese vs. 17.7% obese). The observed association between BMI and facial nerve dysfunction in our study may be seen as hypothesis-generating, and should be explored in future studies. It is possible that difficult surgical ergonomics in high-BMI patients make tumor dissection off of the facial nerve more difficult, placing patients at higher risk of dysfunction37,38,39. For example, in higher BMI patients, relatively higher mass of the neck and shoulder may further narrow an already small operative working corridor, which in addition to requiring less ergonomic positioning for tumor access, limits the dissection vectors and angles, and reduces range of motion and visibility. The increased utilization of endoscopes40 and exoscopes41 in lateral skull base surgery may eventually mitigate some of these constraints.
Case length
Operative duration is identified as a key factor associated with facial nerve outcome in microsurgical resection of vestibular schwannomas in the present study. To our knowledge, this is the first such description of this association; however, it is consistent with previous studies in which prolonged operative duration has been shown to be associated with a higher rate of complications42. The observed association between increased operative length and a higher likelihood of facial nerve dysfunction may in part reflect the known association between tumor size and facial nerve outcomes, as larger tumors have longer average operative durations. However, given that larger overall tumor size and individual tumor measurements in three dimensions (parallel to the posterior petrous bone, between central axis of tumor and porus acusticus, and from porus acusticus to distalmost extent of tumor growth within the IAC) were not found to be predictive of facial nerve dysfunction, other factors which may increase case length should be considered and investigated in future studies as the underlying mechanism of this association. Factors such as tumor hypervascularity43, adherence to the facial nerve perineurium, and the direction of facial nerve displacement may be reflected in differences in operative length across patients, and thus contribute to the observed differential risk of facial nerve dysfunction as it relates to case length20. These factors may serve as a surrogate for dissection complexity. Lastly, it is important to recognize that this algorithm, like any machine learning/artificial intelligence tool, is limited by its inputs. As such, there may be other confounding variables that influence facial nerve injury risk which were not captured in our data or analysis.
Further study will be critical to better understand the myriad factors which may influence the role of case length on facial nerve outcome in vestibular schwannoma microsurgery.
A major strength of this study is the inclusion of patient cohorts from three hospitals across two health systems, increasing the generalizability of the resulting model. The model demonstrates an expected performance decay from 90.5 to 84% when assessed on unseen data from one of the included institutions. This level of performance decay both demonstrates the low likelihood of overfitting of this model and the relative reliability of the model in the real world (clinical) context. While the current model demonstrates good accuracy while avoiding overfitting, we recognize that performance will continue to improve in the deployment phase as further data is collected at external sites and through future prospective validation with patient data from the participating institutions (Supplementary Fig. 2). While we appreciate the tremendous benefit of multi-center data collection to enhance reproducibility, generalizability and clinical translation of our algorithm, we also recognize that as we increase the number of participating centers and expand to include institutions outside of our region, hospital-related factors (setting, level of care, equipment, etc.) and surgeon-related factors (patient selection, preferred surgical approach, years of experience, etc.), will need to be considered and evaluated in this stage of deployment44.
A limitation of the present study is an overall small proportion of patients with facial nerve dysfunction, which likely limited the statistical significance of associations which may have clinical relevance, as well as our ability to further stratify patients into different grades of facial function (i.e. HB I–VI). As vestibular schwannoma is a relatively rare disease entity, expanding our database with each currently participating institution will occur at a rate of roughly 30–60 patients per year, thus increasing the time to build a dataset robust enough to meaningfully improve the model metrics and generalizability. However, we aim to overcome this limitation through dissemination of our results and the current iteration of the algorithm: we aim to expand this work to include additional institutions both nationally and internationally, with the goals of improving statistical power and further increasing the generalizability of this work. As additional validation is performed, we anticipate that the machine learning lifecycle will re-start, including further iterations of model evaluation and tuning to further improve performance.
As previously noted, the current iteration of this algorithm was developed based on manual tumor measurements that have been shown to have strong reproducibility and correlation with volumetric analysis throughout the vestibular schwannoma literature. However, deployment could be accelerated through automated tumor segmentation. Several such promising tools have recently been developed for vestibular schwannoma; however, in all cases the authors acknowledge that these will require further validation before implementation45,46,47,48. This approach has shown significant promise in other medical contexts, particularly in developing strategies for automating chest X-ray review during the COVID-19 pandemic49,50, and in the identification of concerning vs. benign gastrointestinal polyps51,52. Lastly, as data science techniques are increasingly applied in medicine, no discussion of their implementation in this context is complete without considering the protection of patient privacy and confidentiality. The algorithm we present here is run locally and completely offline. However, cloud-based automation offers several advantages that must be weighed against the potential for data leakage. Strategies for obviating security concerns while maintaining the flexibility, reliability, and accelerated deployment afforded by these tools are under development. A full discussion of such methods is beyond the scope of this paper, but can be further explored in recent works by Mei et al.53 and Wu et al.54, among others.
It is our goal that this algorithm will ultimately be utilized as a clinically valuable tool for stratifying an individual patient’s risk of facial nerve injury, aiding in pre-operative counseling about treatment approach (watchful waiting vs. radiosurgery vs. microsurgical resection) and timing. Importantly, the model was evaluated via accuracy, sensitivity and specificity given the common utilization of these as metrics of test performance in the clinical setting. In this specific context, we interpret the 90% accuracy to be excellent compared to the 85% accuracy which has been referenced as a benchmark of acceptable performance15—we further anticipate improved accuracy and generalizability performance (less performance decay), with the addition of validation examples during deployment. In addition, the sensitivity and specificity of 90% and 90% represent that the model performs equally well at predicting which patients are likely to have complete facial nerve preservation as it does at predicting which patients are likely to have facial nerve dysfunction. We anticipate that further validation through collaboration with additional centers which treat high volumes of vestibular schwannomas will continue to improve the model’s performance.
Recognizing that clinicians and patients with little to no computer programming background may find it cumbersome to implement the algorithm, we plan to develop a graphical user interface to facilitate ease of use in both exploratory and clinical settings. This concept has been applied in other areas of medicine to facilitate a user-friendly implementation of artificial intelligence in the clinical environment55,56.
Future directions
Traditionally, tumor size has been the single most important factor in counseling patients regarding their risk of facial nerve injury. Importantly, our findings suggest that additional patient-, tumor- and surgery-related factors might influence the likelihood of facial nerve injury in vestibular schwannoma microsurgery. The finding that the tumor dimension representing the mid-axis of the tumor to the brainstem is important in deciphering which patients are likely to experience facial nerve injury builds on existing literature which has found tumor size to be a critical determinant of facial nerve outcome by offering more granularity to the description of the potential role of tumor size. In addition, our finding that increased age is of relative importance in predicting facial nerve outcome adds to existing literature finding the same. Lastly, the findings that elevated BMI and longer case length are of relative importance in predicting the likelihood of facial nerve dysfunction following vestibular schwannoma microsurgery are novel to this study and hypothesis-generating. For all of the described factors, future validation in independent cohorts is a worthwhile endeavor. In addition, further exploration of variables not represented in this study, but which might influence facial nerve outcome in vestibular schwannoma microsurgery, may continue to build on the findings presented here towards improving patient outcomes.
Conclusions
Here, we have described the development of a multi-institutionally derived machine learning algorithm to predict the likelihood of facial nerve injury following microsurgical resection of VS. Our model demonstrated a high degree of accuracy, and was able to identify novel predictors of facial nerve dysfunction following microsurgical resection of VS. The model will be further developed as a clinical tool for predictions of facial nerve outcome. With the inclusion of additional national and international institutions to improve generalizability, our ultimate goal is to utilize this tool for counseling patients about surgical risk, and aid in surgical decision-making. More broadly, while further evaluation is necessary to fully understand the mechanistic implications of the features identified, this analysis has demonstrated the utility of machine learning in identifying clinically relevant factors which may otherwise evade elucidation via linear statistical methods, such as comparisons of measures of centrality.
Data availability
In accordance with the University of Pennsylvania Institutional Review Board requirements for study exemption, the research data supporting this project may not be publicly shared. De-identified data may be shared upon reasonable request following execution of a Data Usage Agreement. Please reach out to corresponding author, SMHA, with any inquiries.
References
Acoustic Neuroma. NORD. https://rarediseases.org/rare-diseases/acoustic-neuroma/ (accessed 29 Nov 2021).
Vestibular Schwannoma (Acoustic Neuroma) and Neurofibromatosis. NIDCD. https://www.nidcd.nih.gov/health/vestibular-schwannoma-acoustic-neuroma-and-neurofibromatosis (accessed 29 Nov 2021).
Mohammadi, A. & Jufas, N. Sudden death due to vestibular schwannoma: Caution in emergent management. Otol. Neurotol. 37(5), 564–567. https://doi.org/10.1097/mao.0000000000001004 (2016).
Lin, E. P. & Crane, B. T. The management and imaging of vestibular schwannomas. Am. J. Neuroradiol. 38(11), 2034. https://doi.org/10.3174/ajnr.A5213 (2017).
Ferroli, P., Bosio, L. & Broggi, M. Facial nerve sparing surgery for large vestibular schwannomas. Acta Neurochir. (Wien). 159(7), 1213–1218. https://doi.org/10.1007/s00701-017-3216-y (2017).
Roland, J. T. Jr. et al. Cranial nerve preservation in surgery for large acoustic neuromas. Skull Base. 14(2), 85–91. https://doi.org/10.1055/s-2004-828699 (2004).
Profant, O. et al. Decision making on vestibular schwannoma treatment: Predictions based on machine-learning analysis. Sci. Rep. 11(1), 18376. https://doi.org/10.1038/s41598-021-97819-x (2021).
Cha, D., Shin, S. H., Kim, S. H., Choi, J. Y. & Moon, I. S. Machine learning approach for prediction of hearing preservation in vestibular schwannoma surgery. Sci. Rep. 10(1), 7136. https://doi.org/10.1038/s41598-020-64175-1 (2020).
Slattery, W. H. 3rd., Fisher, L. M., Yoon, G., Sorensen, G. & Lev, M. Magnetic resonance imaging scanner reliability for measuring changes in vestibular schwannoma size. Otol. Neurotol. 24(4), 666–670. https://doi.org/10.1097/00129492-200307000-00022 (2003) (discussion 670–1).
Choi, Y. et al. Maximum diameter versus volumetric assessment for the response evaluation of vestibular schwannomas receiving stereotactic radiotherapy. Radiat. Oncol. J. 36(2), 114–121. https://doi.org/10.3857/roj.2018.00031 (2018).
Koo, T. K. & Li, M. Y. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J. Chiropr. Med. 15(2), 155–163. https://doi.org/10.1016/j.jcm.2016.02.012 (2016).
D’Agostino, R. & Pearson, E. S. Tests for departure from normality. Biometrika. 60(3), 613–622. https://doi.org/10.2307/2335012 (1973).
Raschka, S. & Mirjalili, V. Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn and TensorFlow 2. 3rd edn (Packt Publishing Ltd., 2019).
Chawla, N. V., Bowyer, K. W., Hall, L. O. & Kegelmeyer, W. P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Int. Res. 16(1), 321–357 (2002).
Wilson, R. C., Shenhav, A., Straccia, M. & Cohen, J. D. The eighty five percent rule for optimal learning. Nat. Commun. 10(1), 4646. https://doi.org/10.1038/s41467-019-12552-4 (2019).
Dobel, C., Miltner, W. H., Witte, O. W., Volk, G. F. & Guntinas-Lichius, O. Emotional impact of facial palsy. Laryngorhinootologie. 92(1), 9–23. https://doi.org/10.1055/s-0032-1327624 (2013) (Emotionale Auswirkungen einer Fazialisparese).
Saadi, R., Shokri, T., Schaefer, E., Hollenbeak, C. & Lighthall, J. G. Depression rates after facial paralysis. Ann. Plast. Surg. 83(2), 190–194. https://doi.org/10.1097/sap.0000000000001908 (2019).
Joseph, S. S. et al. Evaluation of patients with facial palsy and ophthalmic sequelae: A 23-year retrospective review. Ophthalmic Epidemiol. 24(5), 341–345. https://doi.org/10.1080/09286586.2017.1294186 (2017).
Ren, Y., MacDonald, B. V., Tawfik, K. O., Schwartz, M. S. & Friedman, R. A. Clinical predictors of facial nerve outcomes after surgical resection of vestibular schwannoma. Otolaryngol. Head Neck Surg. 164(5), 1085–1093. https://doi.org/10.1177/0194599820961389 (2021).
Sun, Y., Yang, J., Li, T., Gao, K. & Tong, X. Nomogram for predicting facial nerve outcomes after surgical resection of vestibular schwannoma. Front. Neurol. 12, 817071. https://doi.org/10.3389/fneur.2021.817071 (2021).
Tawfik, K. O., Alexander, T. H., Saliba, J., Mastrodimos, B. & Cueva, R. A. Predicting long-term facial nerve outcomes after resection of vestibular schwannoma. Otol. Neurotol. 41(10), e1328–e1332. https://doi.org/10.1097/mao.0000000000002883 (2020).
Troude, L. et al. Predictive factors of early postoperative and long-term facial nerve function after large vestibular schwannoma surgery. World Neurosurg. 127, e599–e608. https://doi.org/10.1016/j.wneu.2019.03.218 (2019).
Axon, P. R. & Ramsden, R. T. Intraoperative electromyography for predicting facial function in vestibular schwannoma surgery. Laryngoscope. 109(6), 922–926. https://doi.org/10.1097/00005537-199906000-00015 (1999).
Rajkomar, A., Dean, J. & Kohane, I. Machine learning in medicine. N. Engl. J. Med. 380(14), 1347–1358. https://doi.org/10.1056/NEJMra1814259 (2019).
Karargyris, A. et al. Federated benchmarking of medical artificial intelligence with MedPerf. Nat. Mach. Intell. 5(7), 799–810. https://doi.org/10.1038/s42256-023-00652-2 (2023).
Erickson, N. J. et al. Koos classification of vestibular schwannomas: A reliability study. Neurosurgery. 85(3), 409–414. https://doi.org/10.1093/neuros/nyy409 (2019).
Lawson McLean, A. C., McLean, A. L. & Rosahl, S. K. Evaluating vestibular schwannoma size and volume on magnetic resonance imaging: An inter- and intra-rater agreement study. Clin. Neurol. Neurosurg. 145, 68–73. https://doi.org/10.1016/j.clineuro.2016.04.010 (2016).
Fenton, J. E., Chin, R. Y., Fagan, P. A., Sterkers, O. & Sterkers, J. M. Predictive factors of long-term facial nerve function after vestibular schwannoma surgery. Otol. Neurotol. 23(3), 388–392. https://doi.org/10.1097/00129492-200205000-00027 (2002).
Rivas, A. et al. A model for early prediction of facial nerve recovery after vestibular schwannoma surgery. Otol. Neurotol. 32(5), 826–833. https://doi.org/10.1097/MAO.0b013e31821b0afd (2011).
Moffat, D. A., Parker, R. A., Hardy, D. G. & Macfarlane, R. Factors affecting final facial nerve outcome following vestibular schwannoma surgery. J. Laryngol. Otol. 128(5), 406–415. https://doi.org/10.1017/s0022215114000541 (2014).
Raslan, A. M., Liu, J. K., McMenomey, S. O. & Delashaw, J. B. Jr. Staged resection of large vestibular schwannomas. J. Neurosurg. 116(5), 1126–1133. https://doi.org/10.3171/2012.1.Jns111402 (2012).
Bloch, O. et al. Factors associated with preservation of facial nerve function after surgical resection of vestibular schwannoma. J. Neuro-Oncol. 102(2), 281–286. https://doi.org/10.1007/s11060-010-0315-5 (2011).
Helal, A. et al. Differential impact of advanced age on clinical outcomes after vestibular schwannoma resection in the very elderly: Cohort study. Oper. Neurosurg. (Hagerstown). 21(3), 104–110. https://doi.org/10.1093/ons/opab170 (2021).
Macielak, R. J. et al. The effect of age on facial nerve recovery after vestibular schwannoma resection. Otol. Neurotol. 44(7), 725–729. https://doi.org/10.1097/mao.0000000000003937 (2023).
Murphy, M. E. et al. Morbid obesity increases risk of morbidity and reoperation in resection of benign cranial nerve neoplasms. Clin. Neurol. Neurosurg. 148, 105–109. https://doi.org/10.1016/j.clineuro.2016.06.020 (2016).
Lipschitz, N. et al. Obesity is not associated with postoperative complications after vestibular schwannoma surgery in a large single institution series. Otol. Neurotol. 40(10), 1373–1377. https://doi.org/10.1097/mao.0000000000002397 (2019).
Moss, E. L. et al. Impact of obesity on surgeon ergonomics in robotic and straight-stick laparoscopic surgery. J. Minim. Invasive Gynecol. 27(5), 1063–1069. https://doi.org/10.1016/j.jmig.2019.07.009 (2020).
Sers, R., Forrester, S., Zecca, M., Ward, S. & Moss, E. The impact of patient body mass index on surgeon posture during simulated laparoscopy. medRxiv. https://doi.org/10.1101/2020.11.24.20237123 (2020).
Sers, R., Forrester, S., Zecca, M., Ward, S. & Moss, E. Objective assessment of surgeon kinematics during simulated laparoscopic surgery: A preliminary evaluation of the effect of high body mass index models. Int. J. Comput. Assist. Radiol. Surg. 17(1), 75–83. https://doi.org/10.1007/s11548-021-02455-5 (2022).
Setty, P. et al. Endoscopic resection of vestibular schwannomas. J. Neurol. Surg. B Skull Base. 76(3), 230–238. https://doi.org/10.1055/s-0034-1543974 (2015).
Veldeman, M. et al. Three-dimensional exoscopic versus microscopic resection of vestibular schwannomas: A comparative series. Oper. Neurosurg. (Hagerstown). 24(5), 507–513. https://doi.org/10.1227/ons.0000000000000602 (2023).
Cheng, H. et al. Prolonged operative duration is associated with complications: A systematic review and meta-analysis. J. Surg. Res. 229, 134–144. https://doi.org/10.1016/j.jss.2018.03.022 (2018).
Teranishi, Y., Kohno, M., Sora, S., Sato, H. & Nagata, O. Hypervascular vestibular schwannomas: Clinical characteristics, angiographical classification, and surgical considerations. Oper. Neurosurg. 15(3), 251–261 (2018).
Jo, D. The interpretation bias and trap of multicenter clinical research. Korean J. Pain. 33(3), 199–200. https://doi.org/10.3344/kjp.2020.33.3.199 (2020).
Neve, O. M. et al. Fully automated 3D vestibular schwannoma segmentation with and without gadolinium-based contrast material: A multicenter, multivendor study. Radiol. Artif. Intell. 4(4), e210300. https://doi.org/10.1148/ryai.210300 (2022).
Shapey, J. et al. Data from: Segmentation of vestibular schwannoma from magnetic resonance imaging: An open annotated dataset and baseline algorithm. Sci. Data https://doi.org/10.7937/TCIA.9YTJ-5Q73 (2021).
Shapey, J. et al. An artificial intelligence framework for automatic segmentation and volumetry of vestibular schwannomas from contrast-enhanced T1-weighted and high-resolution T2-weighted MRI. J. Neurosurg. JNS. 134(1), 171–179. https://doi.org/10.3171/2019.9.JNS191949 (2021).
Wang, H., Qu, T., Bernstein, K., Barbee, D. & Kondziolka, D. Automatic segmentation of vestibular schwannomas from T1-weighted MRI with a deep neural network. Radiat. Oncol. 18(1), 78. https://doi.org/10.1186/s13014-023-02263-y (2023).
Su, H. et al. Multilevel threshold image segmentation for COVID-19 chest radiography: A framework using horizontal and vertical multiverse optimization. Comput. Biol. Med. 146, 105618. https://doi.org/10.1016/j.compbiomed.2022.105618 (2022).
Qi, A. et al. Directional mutation and crossover boosted ant colony optimization with application to COVID-19 X-ray image segmentation. Comput. Biol. Med. 148, 105810. https://doi.org/10.1016/j.compbiomed.2022.105810 (2022).
Hu, K. et al. Colorectal polyp region extraction using saliency detection network with neutrosophic enhancement. Comput. Biol. Med. 147, 105760. https://doi.org/10.1016/j.compbiomed.2022.105760 (2022).
Jiang, X. et al. BiFTransNet: A unified and simultaneous segmentation network for gastrointestinal images of CT & MRI. Comput. Biol. Med. 165, 107326. https://doi.org/10.1016/j.compbiomed.2023.107326 (2023).
Mei, Z. et al. Secure multi-dimensional data retrieval with access control and range query in the cloud. Inf. Syst. 122, 102343. https://doi.org/10.1016/j.is.2024.102343 (2024).
Wu, Z. et al. An effective method for the protection of user health topic privacy for health information services. World Wide Web. 26(6), 3837–3859. https://doi.org/10.1007/s11280-023-01208-5 (2023).
She, Y. et al. Development and validation of a deep learning model for non-small cell lung cancer survival. JAMA Netw. Open. 3(6), e205842. https://doi.org/10.1001/jamanetworkopen.2020.5842 (2020).
Ye, R. Z. et al. DeepImageTranslator: A free, user-friendly graphical interface for image translation using deep-learning and its applications in 3D CT image analysis. SLAS Technol. 27(1), 76–84. https://doi.org/10.1016/j.slast.2021.10.014 (2022).
Funding
This work was supported by a grant from the National Institutes of Health, USA (SMHA, NIH grant T32NS091006-07) and a Dean’s Master’s Scholarship from the University of Pennsylvania Department of Bioengineering (SMHA).
Author information
Contributions
S.M.H.A.: conceived and designed the study, collected data, contributed data or analysis tools, performed data analysis, wrote the manuscript. R.B.: collected data, contributed data or analysis tools, reviewed the manuscript. A.E.Q.: contributed data or analysis tools, reviewed the manuscript. H.A.: collected data, contributed data or analysis tools, reviewed the manuscript. E.M.S.: contributed data or analysis tools, reviewed the manuscript. D.C.: contributed data or analysis tools, reviewed the manuscript. T.H.: reviewed the manuscript. J.B.: reviewed the manuscript. M.J.R.: reviewed the manuscript. D.C.B.: reviewed the manuscript. C.J.: reviewed the manuscript. G.Z.: contributed data or analysis tools, reviewed the manuscript. P.G.: contributed data or analysis tools, reviewed the manuscript. S.E.B.: conceived and designed the study, reviewed the manuscript. Y.E.C.: contributed data or analysis tools, reviewed the manuscript. J.Y.K.L.: contributed data or analysis tools, reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Heman-Ackah, S.M., Blue, R., Quimby, A.E. et al. A multi-institutional machine learning algorithm for prognosticating facial nerve injury following microsurgical resection of vestibular schwannoma. Sci Rep 14, 12963 (2024). https://doi.org/10.1038/s41598-024-63161-1
DOI: https://doi.org/10.1038/s41598-024-63161-1