Selection of representative SNP sets for genome-wide association studies: a metaheuristic approach

Gürkan Üstünkar¹,
Süreyya Özöğür-Akyüz²,
Gerhard W. Weber³,
Christoph M. Friedrich⁴ &
Yeşim Aydın Son⁵

Abstract

Since the completion of the Human Genome Project in 2003, it has become possible to associate genetic variations in the human genome with common and complex diseases. The current challenge is to use genomic data efficiently and to develop tools that improve our understanding of the etiology of complex diseases. Many of the algorithms needed for this task were originally developed in management science and operations research (OR). One application is to select, from the whole set of Single Nucleotide Polymorphism (SNP) biomarkers, a subset that is informative and small enough for subsequent association studies. In this paper, we present an OR application for representative SNP selection that implements our novel Simulated Annealing (SA) based feature-selection algorithm. We hope that our work will facilitate reliable identification of SNPs involved in the etiology of complex diseases and ultimately support timely identification of genomic disease biomarkers, the development of personalized-medicine approaches, and targeted drug discovery.
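
The abstract names simulated annealing as the basis of the selection algorithm, but this preview gives no details of it. As a rough illustration only, the following Python sketch shows generic simulated annealing over fixed-size subsets. It is not the authors' algorithm: the function `anneal_subset`, its parameters, the swap-one-item neighborhood, the geometric cooling schedule, and the toy objective `toy_score` are all assumptions made for this example.

```python
import math
import random


def anneal_subset(n_items, k, score, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Generic simulated annealing over size-k subsets of range(n_items).

    `score` is a hypothetical user-supplied objective (higher is better).
    Requires 0 < k < n_items so that a swap move is always possible.
    """
    rng = random.Random(seed)
    current = set(rng.sample(range(n_items), k))
    current_score = score(current)
    best, best_score = set(current), current_score
    temp = t0
    for _ in range(steps):
        # Neighbor move: swap one selected item for one unselected item.
        out_item = rng.choice(sorted(current))
        in_item = rng.choice(sorted(set(range(n_items)) - current))
        candidate = (current - {out_item}) | {in_item}
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = set(current), current_score
        temp *= cooling  # geometric cooling (an assumption of this sketch)
    return best, best_score


def toy_score(subset, n_items=12):
    """Toy objective: how many positions lie within distance 1 of a chosen item."""
    return len({j for i in subset for j in (i - 1, i, i + 1) if 0 <= j < n_items})


best, best_score = anneal_subset(n_items=12, k=4, score=toy_score)
print(sorted(best), best_score)
```

In a SNP-selection setting, the items would be SNPs and `score` would measure how well the chosen subset represents the remaining ones; which measure to use is a modeling decision that this sketch leaves open.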

Author information

Authors and Affiliations

Department of Information Systems, Informatics Institute, Middle East Technical University, Ankara, Turkey
Gürkan Üstünkar
Department of Mathematics and Computer Science, Bahçeşehir University, Istanbul, Turkey
Süreyya Özöğür-Akyüz
Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey
Gerhard W. Weber
Department of Computer Science, University of Applied Sciences and Arts, Dortmund, Germany
Christoph M. Friedrich
Department of Health Informatics, Informatics Institute, Middle East Technical University, Ankara, Turkey
Yeşim Aydın Son

Corresponding author

Correspondence to Gürkan Üstünkar.

Additional information

For the Alzheimer’s Disease Neuroimaging Initiative: Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.ucla.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.ucla.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

About this article

Cite this article

Üstünkar, G., Özöğür-Akyüz, S., Weber, G.W. et al. Selection of representative SNP sets for genome-wide association studies: a metaheuristic approach. Optim Lett 6, 1207–1218 (2012). https://doi.org/10.1007/s11590-011-0419-7

Received: 01 October 2010
Accepted: 18 October 2011
Published: 01 November 2011
Issue Date: August 2012
DOI: https://doi.org/10.1007/s11590-011-0419-7
