Quantitative Biology > Genomics

arXiv:2211.03553 (q-bio)

[Submitted on 7 Nov 2022 (v1), last revised 16 Feb 2023 (this version, v4)]

Title: Learning Causal Representations of Single Cells via Sparse Mechanism Shift Modeling

Authors: Romain Lopez, Nataša Tagasovska, Stephen Ra, Kyunghyun Cho, Jonathan K. Pritchard, Aviv Regev

Abstract: Latent variable models such as the Variational Auto-Encoder (VAE) have become a go-to tool for analyzing biological data, especially in the field of single-cell genomics. One remaining challenge is the interpretability of latent variables as biological processes that define a cell's identity. Outside of biological applications, this problem is commonly referred to as learning disentangled representations. Although several disentanglement-promoting variants of the VAE have been introduced and applied to single-cell genomics data, this task has been shown to be infeasible from independent and identically distributed measurements without additional structure. Instead, recent methods propose to leverage non-stationary data, as well as the sparse mechanism shift assumption, in order to learn disentangled representations with causal semantics. Here, we extend the application of these methodological advances to the analysis of single-cell genomics data with genetic or chemical perturbations. More precisely, we propose a deep generative model of single-cell gene expression data in which each perturbation is treated as a stochastic intervention targeting an unknown, but sparse, subset of latent variables. We benchmark these methods on simulated single-cell data to evaluate their performance at latent unit recovery, causal target identification, and out-of-domain generalization. Finally, we apply those approaches to two real-world large-scale gene perturbation data sets and find that models that exploit the sparse mechanism shift hypothesis surpass contemporary methods on a transfer learning task. We implement our new model and benchmarks using the scvi-tools library, and release it as open-source software at this https URL.
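
To make the sparse mechanism shift idea concrete, the toy simulation below may help. It is only an illustrative sketch and not the model released with the paper: the latent dimensionality, the Gaussian latents, the mean-shift form of the intervention, the threshold, and the names `sample_latents`, `mean_per_dim`, `target_mask`, and `shift` are all assumptions made for this example. It draws latent vectors for control cells and for cells under one hypothetical perturbation that shifts a single latent dimension, then recovers the targeted dimension by comparing per-dimension means.

```python
import random


def sample_latents(n_cells, n_latent, target_mask, shift, rng):
    """Draw unit-variance Gaussian latents; masked dimensions get a mean shift."""
    cells = []
    for _ in range(n_cells):
        z = [
            rng.gauss(shift[k] if target_mask[k] else 0.0, 1.0)
            for k in range(n_latent)
        ]
        cells.append(z)
    return cells


def mean_per_dim(cells):
    """Average each latent dimension over all cells."""
    n = len(cells)
    return [sum(z[k] for z in cells) / n for k in range(len(cells[0]))]


rng = random.Random(0)
n_latent = 5

# Hypothetical perturbation: it targets only latent dimension 2 (sparse shift).
target_mask = [False, False, True, False, False]
shift = [0.0, 0.0, 3.0, 0.0, 0.0]

# Control cells receive no intervention; perturbed cells receive the sparse shift.
control = sample_latents(2000, n_latent, [False] * n_latent, shift, rng)
perturbed = sample_latents(2000, n_latent, target_mask, shift, rng)

# A dimension is flagged as a target if its mean moved by more than 1.0
# (an arbitrary threshold chosen for this toy example).
diff = [p - c for p, c in zip(mean_per_dim(perturbed), mean_per_dim(control))]
recovered = [abs(d) > 1.0 for d in diff]
print(recovered)
```

In this sketch the latents are observed directly, which makes target recovery trivial. In the setting the abstract describes, the latent variables and the intervention targets are both unknown and must be inferred from gene expression data.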

Comments:	Accepted at CLeaR (Causal Learning and Reasoning) 2023
Subjects:	Genomics (q-bio.GN); Machine Learning (cs.LG)
Cite as:	arXiv:2211.03553 [q-bio.GN]
	(or arXiv:2211.03553v4 [q-bio.GN] for this version)
	https://doi.org/10.48550/arXiv.2211.03553

Submission history

From: Natasa Tagasovska
[v1] Mon, 7 Nov 2022 15:47:40 UTC (971 KB)
[v2] Tue, 8 Nov 2022 12:44:03 UTC (966 KB)
[v3] Wed, 9 Nov 2022 22:04:16 UTC (966 KB)
[v4] Thu, 16 Feb 2023 22:31:44 UTC (1,507 KB)
