Abstract
Single-cell RNA sequencing (scRNA-seq) has rapidly become a powerful technique for revealing gene expression information at the cellular level. In scRNA-seq data analysis, cell clustering is a key step in downstream analysis because it can identify cell types and discover new cell subtypes. However, the high dimensionality, sparsity, and high noise of scRNA-seq datasets present significant challenges for clustering analysis. In this study, a model named scBAGA, based on bipartite graph integration clustering and a graph attention autoencoder, is proposed. First, the scRNA-seq dataset is preprocessed using network enhancement (NE) and principal component analysis (PCA) for denoising and feature selection. Next, a graph attention autoencoder is employed for dimension reduction to obtain low-dimensional embeddings. Finally, bipartite graph integration clustering is used to derive the final clustering results based on the relationship between cells and low-dimensional embeddings. scBAGA was compared with other advanced methods on several scRNA-seq datasets using multiple clustering metrics, and the experimental results show that it outperformed them. This indicates that the model can serve as a reliable tool for clustering cells.
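The abstract gives no implementation details for the final step, so the following is only a minimal sketch of one common way to do bipartite graph ensemble clustering, not scBAGA's actual code. It assumes that several base k-means clusterings are run on the embeddings, that a bipartite graph links each cell to the base clusters it belongs to, and that the graph is partitioned spectrally. The names `kmeans` and `bipartite_ensemble`, the base cluster counts, and the synthetic embeddings `Z` (a stand-in for the graph attention autoencoder output) are all illustrative assumptions.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means with farthest-point initialisation."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min(
            np.linalg.norm(X[:, None, :] - np.array(centers)[None, :, :], axis=2),
            axis=1,
        )
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def bipartite_ensemble(Z, n_clusters, base_ks, seed=0):
    """Consensus clustering via a cell-to-base-cluster bipartite graph."""
    rng = np.random.default_rng(seed)
    columns = []
    for k in base_ks:
        labels = kmeans(Z, k, rng)
        # One graph node (column) per non-empty base cluster.
        for c in np.unique(labels):
            columns.append((labels == c).astype(float))
    B = np.stack(columns, axis=1)  # shape: cells x base clusters

    # Degree-normalise the biadjacency matrix, then embed the cells with
    # its leading left singular vectors (spectral co-clustering style).
    row_deg = B.sum(axis=1)
    col_deg = B.sum(axis=0)
    Bn = B / np.sqrt(row_deg)[:, None] / np.sqrt(col_deg)[None, :]
    U, _, _ = np.linalg.svd(Bn, full_matrices=False)
    emb = U[:, :n_clusters]
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return kmeans(emb, n_clusters, rng)


# Synthetic stand-in for low-dimensional embeddings: three separated groups.
data_rng = np.random.default_rng(42)
true_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
truth = np.repeat(np.arange(3), 30)
Z = true_centers[truth] + 0.5 * data_rng.normal(size=(90, 2))

labels = bipartite_ensemble(Z, n_clusters=3, base_ks=[3, 4, 5])
```

Varying the number of base clusters is one simple way to get diverse base partitions; a real pipeline might instead vary the embedding, the algorithm, or the random seed.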
Acknowledgement
This work was supported by the National Natural Science Foundation of China (No. 62002189), the Natural Science Foundation of Shandong Province, China (No. ZR2020QF038), the Ability Improvement Project of Science and Technology SMEs in Shandong Province (2023TSGC0279), the Youth Innovation Team of Colleges and Universities in Shandong Province (2023KJ329) and the Qilu University of Technology (Shandong Academy of Sciences) Talent Scientific Research Project (No. 2023RCKY128).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yuan, L. et al. (2024). Cluster Analysis of scRNA-Seq Data Combining Bioinformatics with Graph Attention Autoencoders and Ensemble Clustering. In: Huang, DS., Pan, Y., Zhang, Q. (eds) Advanced Intelligent Computing in Bioinformatics. ICIC 2024. Lecture Notes in Computer Science, vol 14882. Springer, Singapore. https://doi.org/10.1007/978-981-97-5692-6_6
DOI: https://doi.org/10.1007/978-981-97-5692-6_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5691-9
Online ISBN: 978-981-97-5692-6
eBook Packages: Computer Science, Computer Science (R0)