Abstract
Omics profiles, depending on the underlying technology, comprise measurements of several hundred to several thousand molecules in a biological sample or a cell. This study builds on the concept of “omics imagification”, the process of transforming a vector of these numerical measurements into an image that has a one-to-one relationship with the corresponding sample. The proposed imagification process transforms a high-dimensional vector of molecular measurements into a two-dimensional RGB image, both to enable a holistic molecular representation of a biological sample and to improve the classification of biological phenotypes using automated image recognition methods from computer vision. A transformed image encodes the 2D coordinates of molecules in a neighbour-embedded space together with their molecular abundance and gene intensity. The proposed method was applied to single-cell RNA sequencing (scRNA-seq) data to “imagify” the gene expression profiles of individual cells. Our results show that a simple convolutional neural network trained on single-cell transcriptomics images accurately classifies diverse cell types, outperforming the best-performing scRNA-seq classifiers such as support vector machine and random forest.
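As a rough illustration of the imagification idea (not the Fotomics implementation, which is available in the authors' repository), the following minimal sketch places each gene at a fixed pixel and paints one cell's expression values onto that layout. Several things here are assumptions made for illustration only: the function names `gene_coordinates` and `imagify` are hypothetical, a two-component PCA computed with NumPy stands in for the neighbour embedding, and the sketch produces a single intensity channel because the abstract does not specify how the three RGB channels are filled.

```python
import numpy as np


def gene_coordinates(expr, size):
    """Assign every gene (column of `expr`, shape cells x genes) a pixel
    on a size x size grid. Assumption: a 2-component PCA of the genes is
    used as a simple stand-in for a neighbour embedding."""
    genes = expr.T                                  # genes x cells
    centred = genes - genes.mean(axis=0)            # centre each cell feature
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    coords = u[:, :2] * s[:2]                       # genes x 2 embedding
    low = coords.min(axis=0)
    span = coords.max(axis=0) - low
    span[span == 0] = 1.0                           # avoid division by zero
    unit = (coords - low) / span                    # rescale to [0, 1]
    return np.minimum((unit * size).astype(int), size - 1)


def imagify(cell, pixels, size):
    """Paint one cell's expression vector onto the shared gene layout.
    Genes that land on the same pixel are averaged."""
    total = np.zeros((size, size))
    count = np.zeros((size, size))
    rows, cols = pixels[:, 1], pixels[:, 0]
    np.add.at(total, (rows, cols), cell)            # unbuffered accumulation
    np.add.at(count, (rows, cols), 1)
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    expr = rng.random((20, 200))                    # toy data: 20 cells, 200 genes
    pixels = gene_coordinates(expr, size=32)        # layout shared by all cells
    images = np.stack([imagify(c, pixels, 32) for c in expr])
    print(images.shape)                             # one 32 x 32 image per cell
```

Because the gene layout is computed once and reused, every cell is drawn on the same canvas, which is what lets a convolutional network compare images across cells. Note that averaging genes that collide on a pixel loses information, so this toy version does not guarantee the one-to-one relationship between sample and image described above.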
Data and code availability
All the datasets used in this study are public and accessible from Gene Expression Omnibus (GEO). The entire code base, including the Python implementation of the proposed method and the compared techniques, is available at https://github.com/VafaeeLab/Fotomics-Imagification
Author information
Contributions
FV and SMZ conceived and led the study and guided the method development. SMZ and DL developed the Fotomics method and the corresponding Python package. SMZ, DL and FV conducted the analyses and produced the results. FV and SMZ wrote the manuscript. FV produced images. VC and AA provided input on method evaluation. All authors reviewed the manuscript and approved it.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Zandavi, S.M., Liu, D., Chung, V. et al. Fotomics: fourier transform-based omics imagification for deep learning-based cell-identity mapping using single-cell omics profiles. Artif Intell Rev 56, 7263–7278 (2023). https://doi.org/10.1007/s10462-022-10357-4
DOI: https://doi.org/10.1007/s10462-022-10357-4