“Super Gene Set” Causal Relationship Discovery from Functional Genomics Data
Pages 1991 - 1998
Abstract
In this article, we present a computational framework for identifying “causal relationships” among super gene sets. By “causal relationships,” we mean both stimulatory and inhibitory regulatory relationships, whether they act through direct or indirect mechanisms. By super gene sets, we mean “pathways, annotated lists, and gene signatures,” or PAGs. To identify causal relationships among PAGs, we extend previous work on identifying PAG-to-PAG regulatory relationships by further requiring these relationships to be significantly enriched with gene-to-gene co-expression pairs across the two PAGs involved. We achieve this by developing a quantitative metric based on PAG-to-PAG Co-expressions (PPC), which we use to infer the likelihood that a PAG-to-PAG relationship under examination is causal, either stimulatory or inhibitory. Because true causal relationships are unknown, we approximate the overall performance of causal inference by how well the inferred causal PAG-to-PAG relationships recall known r-type PAG-to-PAG relationships, using a functional genomics benchmark dataset from the GEO database. We report an area-under-curve (AUC) performance of 0.81 for both precision and recall. By applying our framework to a myeloid-derived suppressor cell (MDSC) dataset, we further demonstrate that it helps build multi-scale biomolecular systems models, offering new insights into regulatory and causal links for downstream biological interpretation.
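The abstract does not give the PPC formula, so the following is only an illustrative sketch of the general idea of testing whether gene pairs spanning two gene sets are enriched for co-expression. Every name here (`cross_set_coexpression`, `permutation_p_value`), the absolute Pearson correlation cutoff of 0.7, and the permutation test are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def cross_set_coexpression(expr, genes_a, genes_b, threshold=0.7):
    """Fraction of cross-set gene pairs with |Pearson r| >= threshold.

    expr maps a gene name to its expression values across the same
    samples in the same order (hypothetical input layout).
    """
    pairs = [(a, b) for a in genes_a for b in genes_b if a != b]
    if not pairs:
        return 0.0
    hits = 0
    for a, b in pairs:
        r = np.corrcoef(expr[a], expr[b])[0, 1]
        # Absolute value: both positive and negative co-expression count.
        if abs(r) >= threshold:
            hits += 1
    return hits / len(pairs)


def permutation_p_value(expr, genes_a, genes_b, n_perm=200,
                        threshold=0.7, seed=0):
    """Empirical p-value against random gene sets of the same sizes.

    This is a generic permutation test chosen for the sketch; it is an
    assumption, not the significance test used in the article.
    """
    rng = np.random.default_rng(seed)
    observed = cross_set_coexpression(expr, genes_a, genes_b, threshold)
    all_genes = sorted(expr)
    size_a, size_b = len(genes_a), len(genes_b)
    at_least = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(all_genes))[: size_a + size_b]
        sample = [all_genes[i] for i in idx]
        rand_a, rand_b = sample[:size_a], sample[size_a:]
        if cross_set_coexpression(expr, rand_a, rand_b, threshold) >= observed:
            at_least += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (at_least + 1) / (n_perm + 1)
```

A small empirical p-value would suggest that the two sets share more strongly co-expressed gene pairs than random sets of the same sizes. Deciding whether a relationship is stimulatory or inhibitory would need additional information, such as the sign of the correlations, which this sketch deliberately ignores.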
Information
Published In
Copyright © 2018.
Publisher
IEEE Computer Society Press
Washington, DC, United States
Publication History
Published: 01 November 2018
Published in TCBB Volume 15, Issue 6
Qualifiers
- Research-article