research-article

scCAN: Clustering With Adaptive Neighbor-Based Imputation Method for Single-Cell RNA-Seq Data

Published: 29 January 2024

Abstract

Single-cell RNA sequencing (scRNA-seq) is widely used to study cellular heterogeneity across samples. However, owing to technical limitations, dropout events often produce false zero values in the gene expression matrix. In this paper, we propose scCAN, a new imputation method based on clustering with adaptive neighbors, to estimate the expression values hidden by dropout zeros. The method iteratively updates cell-cell similarity information by simultaneously learning the similarity relationships and the clustering structure while imposing new rank constraints on the Laplacian matrix of the similarity matrix, which improves the imputation of dropout zeros. To evaluate the method, we applied it to four simulated and eight real scRNA-seq datasets and assessed downstream analyses, including cell clustering, gene expression recovery, and cell trajectory reconstruction. scCAN improves the performance of these downstream analyses and outperforms the other imputation methods compared.
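
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It uses the closed-form adaptive-neighbor weights from the clustering-with-adaptive-neighbors formulation of Nie et al. (2014), skips the iterative rank-constrained optimization that the abstract describes, and simply fills every zero with a similarity-weighted average over neighboring cells, which is an assumption made here for illustration. The function names, the toy expression matrix, and the choice k = 2 are all hypothetical. The last helper shows why a rank constraint on the Laplacian relates to clustering: the number of (near-)zero Laplacian eigenvalues equals the number of connected components of the similarity graph.

```python
import numpy as np


def adaptive_neighbor_similarity(X, k):
    """Row-stochastic similarity with k adaptive neighbors per cell.

    Closed form from Nie et al. (2014); requires at least k + 2 cells.
    """
    n = X.shape[0]
    # Squared Euclidean distance between every pair of cells (rows of X).
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(D, np.inf)  # a cell is never its own neighbor
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[: k + 1]  # k neighbors plus the (k+1)-th
        d = D[i, idx]
        denom = k * d[k] - d[:k].sum()
        if denom > 0:
            S[i, idx[:k]] = (d[k] - d[:k]) / denom
        else:  # all k + 1 distances tie: fall back to uniform weights
            S[i, idx[:k]] = 1.0 / k
    return S


def impute_zeros(X, S):
    """Replace zeros by the similarity-weighted mean of neighboring cells.

    Assumption for illustration only: every zero is treated as a dropout.
    """
    return np.where(X == 0, S @ X, X)


def count_components(S, tol=1e-8):
    """Count connected components via near-zero Laplacian eigenvalues."""
    A = (S + S.T) / 2  # symmetrize the similarity graph
    L = np.diag(A.sum(axis=1)) - A  # unnormalized graph Laplacian
    return int(np.sum(np.linalg.eigvalsh(L) < tol))


# Toy matrix: 8 cells x 4 genes, two cell groups.
# Cell 3 has a dropout in gene 0, cell 7 has a dropout in gene 2.
X = np.array(
    [
        [10, 8, 0, 1],
        [11, 9, 1, 0],
        [9, 10, 0, 0],
        [0, 9, 1, 1],
        [0, 1, 10, 8],
        [1, 0, 11, 9],
        [0, 0, 9, 10],
        [1, 1, 0, 9],
    ],
    dtype=float,
)

S = adaptive_neighbor_similarity(X, k=2)
X_hat = impute_zeros(X, S)

print(count_components(S))      # 2: one component per cell group
print(f"{X_hat[3, 0]:.2f}")     # 9.34: recovered from cells 2 and 0
```

In this toy case each cell's two neighbors fall inside its own group, so the graph already splits into two components without any constraint. On real data that is not guaranteed, which is the situation the rank constraint in the abstract is meant to address.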


Published In

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 21, Issue 1
Jan.-Feb. 2024
214 pages

Publisher

IEEE Computer Society Press, Washington, DC, United States
