
Learning the Parts of Omics: Inference of Molecular Signatures with Non-negative Matrix Factorization

Quintero Moreno, Andrés Felipe

PDF, English - main document

Abstract

Background: Feature extraction and signature identification are two critical steps in understanding diverse biological processes. Signatures are defined as groups of molecular features that are sufficient to identify a particular genotype or phenotype. Non-negative Matrix Factorization (NMF), in particular, has been used to identify signatures in complex genomic datasets. However, running even a basic NMF analysis is challenging: the learning curve is steep and computing times are long. Furthermore, the usability of these algorithms is limited by the scarcity of resources for interpreting their results. This creates a pressing need for tools that mitigate these obstacles.

Results: In this study we developed ButchR and ShinyButchR, a fast and user-friendly toolkit to decompose datasets (slicing genomics) and learn signatures using NMF. The package can be freely installed from GitHub at https://github.com/wurst-theke/ButchR. We used ButchR to identify a new regulatory subtype in neuroblastoma, which showed mesenchymal characteristics and was phenotypically associated with multipotent Schwann cell precursors. Additionally, we created a new workflow to infer regulatory relationships between genes and their _cis_-regulatory elements in individual cells, followed by inference of regulatory signatures.

Conclusions: ButchR/ShinyButchR is a useful toolkit for analyzing multiple types of data and inferring signatures that capture relevant biological information. It is a valuable new resource for the scientific community and can be used to understand complex biological processes.
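
Illustration: The decomposition referred to in the abstract can be sketched in a few lines. NMF approximates a non-negative matrix V (conventionally features by samples) as the product of two non-negative matrices, W (features by signatures) and H (signatures by samples). The sketch below uses the classical multiplicative update rules for the Frobenius objective on a toy matrix. It is not ButchR's implementation; the function names (`nmf`, `frobenius_error`), the toy data, and the parameter defaults are assumptions made for illustration only.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - W H."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Hypothetical helper: factorize V ~ W H with multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Random strictly positive starting point.
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [
            [h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
            for k in range(rank)
        ]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [
            [w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
            for i in range(n_rows)
        ]
    return w, h


# Toy matrix (assumed data): rows 0-1 are active in samples 0-1,
# rows 2-3 are active in samples 2-3.
v = [
    [5.0, 5.0, 0.0, 0.0],
    [4.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 2.0, 2.0],
]
w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what allows the columns of W to be read as additive signatures and the rows of H as their per-sample contributions.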

Document type: Dissertation
Supervisor: Brors, Prof. Dr. Benedikt
Place of Publication: Heidelberg
Date of thesis defense: 21 September 2021
Date Deposited: 21 Oct 2021 09:38
Date: 2021
Faculties / Institutes: The Faculty of Bio Sciences > Dean's Office of the Faculty of Bio Sciences
DDC-classification: 004 Data processing Computer science
500 Natural sciences and mathematics
Controlled Keywords: non-negative matrix factorization, bioinformatics, cancer <medicine>, neuroblastoma, genomics