On Comprehensive Mass Spectrometry Data Analysis for Proteome Profiling of Human Blood Samples

Sameer Manchanda¹,
Mikaela Meyer²,
Qianqian Li³,
Kai Liang³,
Yan Li³ &
Nan Kong ORCID: orcid.org/0000-0002-4047-3414⁴

Abstract

To guarantee meaningful interpretation of data in basic and translational medicine, it is critical to ensure the quality of biological samples. Mass spectrometers have become promising instruments for acquiring proteomic information that is known to be associated with sample quality. However, a universally applicable mass spectrometry data analysis platform for quality assessment is still greatly needed. We present a comprehensive pattern recognition study to facilitate the development of such a platform. The study involves feature extraction, binary classification, and feature ranking. We develop classifiers that achieve classification accuracy higher than 90% in distinguishing human serum samples stored for different amounts of time. We also derive fingerprint patterns of serum peptides that can be conveniently used for temporal classification.
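
The three steps named in the abstract (feature extraction, binary classification, feature ranking) can be illustrated with a small sketch. This preview does not state which extraction method, classifiers, or ranking criteria the study used, so everything below is an assumption made for illustration: equal-width m/z binning as the feature extractor, a nearest-centroid rule as the classifier, a standardized mean difference as the ranking score, and synthetic spectra in which a hypothetical storage effect adds signal between 2000 and 2400 Da.

```python
import random
from statistics import mean, pstdev

def bin_spectrum(mz, intensity, lo=800.0, hi=4000.0, n_bins=8):
    """Feature extraction (assumed method): sum intensity in equal-width m/z bins."""
    width = (hi - lo) / n_bins
    features = [0.0] * n_bins
    for m, i in zip(mz, intensity):
        if lo <= m < hi:
            features[min(int((m - lo) / width), n_bins - 1)] += i
    return features

def centroid(rows):
    """Per-feature mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]

def classify(x, c0, c1):
    """Binary nearest-centroid rule: 0 if x is closer to c0, else 1."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return 0 if d0 <= d1 else 1

def rank_features(rows0, rows1):
    """Feature ranking: order bins by standardized difference in class means."""
    scores = []
    for j, (col0, col1) in enumerate(zip(zip(*rows0), zip(*rows1))):
        spread = (pstdev(col0) + pstdev(col1)) or 1.0
        scores.append((abs(mean(col0) - mean(col1)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)]

def synthetic_spectrum(rng, stored):
    """Synthetic spectrum on a 4 Da grid; NOT real serum data."""
    mz = [800.0 + 4.0 * k for k in range(800)]
    intensity = [rng.uniform(0.9, 1.1) for _ in mz]
    if stored:
        # Hypothetical storage effect: extra signal between 2000 and 2400 Da.
        intensity = [i + 1.0 if 2000.0 <= m < 2400.0 else i
                     for m, i in zip(mz, intensity)]
    return mz, intensity

rng = random.Random(0)
fresh = [bin_spectrum(*synthetic_spectrum(rng, False)) for _ in range(10)]
stored = [bin_spectrum(*synthetic_spectrum(rng, True)) for _ in range(10)]

# Train on the first five spectra of each class, test on the rest.
c_fresh, c_stored = centroid(fresh[:5]), centroid(stored[:5])
test_rows = [(x, 0) for x in fresh[5:]] + [(x, 1) for x in stored[5:]]
accuracy = mean(1.0 if classify(x, c_fresh, c_stored) == y else 0.0
                for x, y in test_rows)
ranking = rank_features(fresh[:5], stored[:5])
print("held-out accuracy:", accuracy)
print("most discriminative bin index:", ranking[0])
```

On real spectra the classes would not separate this cleanly. The point of the sketch is only the order of operations: turn each spectrum into a fixed-length feature vector, fit a two-class rule on part of the data, evaluate it on held-out samples, and then ask which features drive the separation.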

Funding

This study received financial support from NSF grant DMS#1246818 and an industry grant from the Chinese Academy of Sciences Holding Co., Ltd.

Author information

Authors and Affiliations

Department of Computer Science, Purdue University, West Lafayette, USA
Sameer Manchanda
Department of Statistics and Mathematics, Purdue University, West Lafayette, IN, USA
Mikaela Meyer
Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
Qianqian Li, Kai Liang & Yan Li
Weldon School of Biomedical Engineering, Purdue University, 206 S. Martin Jischke Dr, West Lafayette, IN, 47906, USA
Nan Kong

Corresponding author

Correspondence to Nan Kong.

Appendix

Before storage, the samples were left at room temperature for 1 h to allow coagulation and then centrifuged at 1400×g for 15 min at 4 °C. Serum was aspirated so as to avoid fluid in the buffy-coat layer and collected in polypropylene tubes. After aliquoting, the samples were stored under one of two conditions: room temperature or 4 °C. For both cohorts, mass spectrometry data for each sample were collected on the day the sample was taken and again 1, 2, 5, and 10 days later.

For each measurement, a 1-μL sample was processed on a mesoporous silicon wafer that had been prepared by pre-baking in an oven at 120 °C. The processed sample was spotted on the MALDI target plate and allowed to air-dry. Afterwards, 1 μL of matrix in 50% acetonitrile containing 0.1% trifluoroacetic acid (TFA) was spotted on the dried sample spot, and the sample was allowed to co-crystallize.

Mass spectra were obtained with a SHIMADZU AXIMA Resonance MALDI-IT-TOF equipped with a nitrogen laser emitting at 337 nm. The instrument had an adjustable mass range of 800 to 4000 Da, and positive ions were detected in reflective mode. The optimized accelerating voltage was 50 kV. The spectra from 500 laser shots were usually averaged to give the final sample spectrum.
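
The averaging step described above can be sketched as follows. The appendix does not say how the instrument software combines shots, so this is a minimal sketch under two assumptions: every shot is recorded on the same m/z axis, and the final spectrum is the point-by-point arithmetic mean. The function names and the toy values are hypothetical; only the 800 to 4000 Da range comes from the text.

```python
def average_shots(shots):
    """Average intensity point by point across repeated laser-shot spectra.

    Assumes every shot shares the same m/z axis (an assumption, not
    something the appendix states).
    """
    n_points = len(shots[0])
    if any(len(s) != n_points for s in shots):
        raise ValueError("all shots must share the same m/z axis")
    return [sum(col) / len(shots) for col in zip(*shots)]

def restrict_mass_range(mz, intensity, lo=800.0, hi=4000.0):
    """Keep only points inside the stated mass range (in Da)."""
    kept = [(m, i) for m, i in zip(mz, intensity) if lo <= m <= hi]
    return [m for m, _ in kept], [i for _, i in kept]

# Toy example with two shots instead of 500.
mz_axis = [500.0, 800.0, 2400.0, 4000.0, 4500.0]
shots = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0, 7.0],
]
avg = average_shots(shots)
mz_kept, avg_kept = restrict_mass_range(mz_axis, avg)
print(mz_kept)   # [800.0, 2400.0, 4000.0]
print(avg_kept)  # [3.0, 4.0, 5.0]
```

Averaging many shots reduces shot-to-shot noise, at the cost of hiding any drift that occurs across the acquisition.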

About this article

Cite this article

Manchanda, S., Meyer, M., Li, Q. et al. On Comprehensive Mass Spectrometry Data Analysis for Proteome Profiling of Human Blood Samples. J Healthc Inform Res 2, 305–318 (2018). https://doi.org/10.1007/s41666-018-0022-0

Received: 12 May 2017
Revised: 15 April 2018
Accepted: 20 April 2018
Published: 22 May 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s41666-018-0022-0
