Abstract
From 38,899 H1N1 virus protein sequences collected worldwide between 1902 and 2013, 1805 H1N1 virus sequences with both HA and NA proteins are selected by matching viruses that occurred at the same time and place. A new feature-vector representation for protein sequences is proposed, based on the physicochemical properties of amino acids and coarse-graining theory. The 20 amino acids are divided into 4 classes, and the classes are combined pairwise to construct a 16-dimensional feature vector for each of the HA and NA protein sequences. The whole protein sequence is then represented by a 32-dimensional feature vector that combines the HA and NA feature vectors, and the optimal clustering of the H1N1 influenza viruses is obtained by structural clustering. The relationship between HA and NA protein structures and outbreaks of the H1N1 virus is analyzed by selecting representative elements and constructing an evolutionary tree. The results show that the new feature-vector representation is reasonable, and the large amount of data confirms that HA and NA protein sequences play a direct and important role in outbreaks of H1N1 influenza virus.
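The construction of the 16- and 32-dimensional vectors can be illustrated with a minimal sketch. The abstract does not say which 4 classes are used or exactly how the classes are connected, so two assumptions are made here: the grouping in `CLASSES` is a common polarity and charge partition chosen for illustration, and each of the 16 dimensions is taken to be the relative frequency of one ordered pair of classes at adjacent sequence positions. The names `CLASSES`, `class_pair_vector`, and `combined_vector` are hypothetical and do not come from the paper.

```python
from itertools import product

# Hypothetical 4-class grouping by polarity and charge (an assumption;
# the abstract does not list the classes the authors used).
CLASSES = {
    "N": "AVLIPFWMG",  # nonpolar
    "P": "STCYNQ",     # polar, uncharged
    "A": "DE",         # acidic
    "B": "KRH",        # basic
}
AA_TO_CLASS = {aa: label for label, aas in CLASSES.items() for aa in aas}
PAIRS = ["".join(p) for p in product(CLASSES, repeat=2)]  # 16 ordered class pairs


def class_pair_vector(sequence):
    """Return the 16 relative frequencies of adjacent class pairs."""
    # Coarse-grain: map residues to class labels. Unknown symbols are
    # dropped, so their neighbours are treated as adjacent.
    labels = [AA_TO_CLASS[aa] for aa in sequence.upper() if aa in AA_TO_CLASS]
    counts = dict.fromkeys(PAIRS, 0)
    for first, second in zip(labels, labels[1:]):
        counts[first + second] += 1
    total = sum(counts.values())
    return [counts[p] / total if total else 0.0 for p in PAIRS]


def combined_vector(ha_sequence, na_sequence):
    """Concatenate the HA and NA vectors into one 32-dimensional vector."""
    return class_pair_vector(ha_sequence) + class_pair_vector(na_sequence)
```

Calling `combined_vector` on an HA string and an NA string yields a list of 32 floats, the first 16 describing HA and the last 16 describing NA. Normalizing by the number of pairs keeps sequences of different lengths comparable, which is a design choice of this sketch rather than something stated in the abstract.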
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 11371174), the Fundamental Research Funds for the Central Universities of China (Grant No. JUSRP51317B), the International Technology Collaboration Research Program of China (Grant No. 2011DFA70500), and the Colleges and Universities in Jiangsu Province Plans to Graduates Research and Innovation (Grant No. 1145210232141170).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Li, WW., Li, Y., Tang, XQ. (2015). A New Representation Method of H1N1 Influenza Virus and Its Application. In: Huang, DS., Jo, KH., Hussain, A. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science, vol 9226. Springer, Cham. https://doi.org/10.1007/978-3-319-22186-1_33
DOI: https://doi.org/10.1007/978-3-319-22186-1_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22185-4
Online ISBN: 978-3-319-22186-1
eBook Packages: Computer Science, Computer Science (R0)