Searching high-order SNP combinations for complex diseases based on energy distribution difference

Published: 01 May 2015

Abstract

Single nucleotide polymorphisms (SNPs), a dominant type of genetic variant, have been used successfully to identify defective genes that cause human single-gene diseases. However, most common human diseases are complex diseases caused by gene-gene and gene-environment interactions. Many SNP-SNP interaction analysis methods have been introduced, but they are not powerful enough to discover interactions involving more than three SNPs. This paper proposes a novel method that analyzes all SNPs simultaneously. Unlike existing methods, it regards an individual's genotype data on a list of SNPs as a point carrying one unit of energy in a multi-dimensional space, and seeks a new coordinate system in which the difference in energy distribution between cases and controls is maximized. Based on the new coordinate system, the method identifies multi-SNP combinatorial patterns that differ between cases and controls. Experiments on simulated data show that the method is efficient. Tests on real data for age-related macular degeneration (AMD) show that it finds more significant multi-SNP combinatorial patterns than existing methods.
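The abstract does not spell out the algorithm, so the following is only a rough sketch of one way the idea could be read, not the authors' method. It assumes genotypes are coded numerically (for example 0/1/2), that "one unit of energy" means scaling each individual's genotype vector to unit Euclidean norm, and that the new coordinate system is taken from the eigenvectors of the difference between the case and control mean-energy matrices. The function names `unit_energy` and `energy_difference_axes` are hypothetical.

```python
import numpy as np


def unit_energy(genotypes):
    """Scale each individual's genotype vector to unit Euclidean norm.

    Assumption: "one unit of energy" is read as squared norm equal to 1.
    """
    g = np.asarray(genotypes, dtype=float)
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # all-zero rows carry no energy; keep them at zero
    return g / norms


def energy_difference_axes(cases, controls):
    """Return axes ordered by how strongly mean energy differs between groups.

    The mean energy of a group along a unit direction w is w^T M w, where
    M = X^T X / n. The eigenvectors of (M_cases - M_controls) give an
    orthonormal coordinate system, and each eigenvalue is the energy
    difference (cases minus controls) along its axis.
    """
    a = unit_energy(cases)
    b = unit_energy(controls)
    m_cases = a.T @ a / a.shape[0]
    m_controls = b.T @ b / b.shape[0]
    eigvals, eigvecs = np.linalg.eigh(m_cases - m_controls)
    order = np.argsort(-np.abs(eigvals))  # largest absolute difference first
    return eigvals[order], eigvecs[:, order]


if __name__ == "__main__":
    # Toy data: rows are individuals, columns are SNPs (hypothetical values).
    toy_cases = [[1, 1, 0], [1, 1, 0]]
    toy_controls = [[0, 0, 1], [0, 0, 1]]
    values, axes = energy_difference_axes(toy_cases, toy_controls)
    print(values)
    print(axes)
```

Under these assumptions, SNPs with large loadings on the leading axes would be candidates for a combinatorial pattern; how the paper actually derives and tests such patterns is not stated in the abstract.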

Cited By

  • (2019) Genome-Wide Analysis of MDR and XDR Tuberculosis from Belarus. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 16:4 (1398-1408). DOI: 10.1109/TCBB.2017.2720669. Online publication date: 1-Jul-2019
  • (2016) A compressed sensing based two-stage method for detecting epistatic interactions. International Journal of Data Mining and Bioinformatics, 14:4 (354-372). DOI: 10.1504/IJDMB.2016.075821. Online publication date: 1-Apr-2016

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 01 May 2015
Accepted: 06 October 2014
Revised: 31 August 2014
Received: 14 May 2014
Published in TCBB Volume 12, Issue 3

Author Tags

  1. SNP combinations
  2. SNP-SNP interactions
  3. age-related macular degeneration
  4. case-control study
