Abstract
Background: Feature extraction and signature identification are two critical steps in understanding diverse biological processes. Signatures are defined as groups of molecular features that are sufficient to identify a certain genotype or phenotype. In particular, Non-negative Matrix Factorization (NMF) has been used to identify signatures in complex genomic datasets. However, running even a basic NMF analysis is challenging: the learning curve is steep and computing times are long. Furthermore, the usability of these algorithms is limited by the scarcity of resources for interpreting their results. This creates a pressing need for tools that mitigate these obstacles.
Results: In this study we developed ButchR and ShinyButchR, a fast and user-friendly toolkit to decompose datasets (slicing genomics) and learn signatures using NMF. The package can be freely installed from GitHub at https://github.com/wurst-theke/ButchRr. We used ButchR to identify a new regulatory subtype in neuroblastoma, which showed mesenchymal characteristics and was phenotypically associated with multipotent Schwann cell precursors. Additionally, we created a new workflow to infer regulatory relationships between genes and their _cis_-regulatory elements for individual cells, followed by inference of regulatory signatures.
Conclusions: ButchR/ShinyButchR is a useful toolkit for analyzing multiple types of data and inferring signatures that capture relevant biological information. This toolkit is a valuable new resource for the scientific community, and it can be used to understand complex biological processes.
Document type: | Dissertation |
---|---|
First reviewer: | Brors, Prof. Dr. Benedikt |
Place of publication: | Heidelberg |
Date of examination: | 21 September 2021 |
Date created: | 21 Oct. 2021 09:38 |
Year of publication: | 2021 |
Institutes/Facilities: | Faculty of Biosciences > Dean's Office of the Faculty of Biosciences |
DDC subject group: | 004 Computer science
500 Natural sciences and mathematics |
Controlled keywords: | non-negative matrix factorization, bioinformatics, cancer <medicine>, neuroblastoma, genomics |