In Silico Based Whole Genome Phylogenetic Analysis of Novel Coronavirus (SARS-CoV-2)
Raghunath Satpathy
Gangadhar Meher University
All content following this page was uploaded by Raghunath Satpathy on 01 September 2020.
Fig. 1. Phylogenetic tree constructed by the UPGMA method, with the genomes of Wuhan seafood market
pneumonia virus isolate Wuhan-Hu-1 (complete genome) and Bat coronavirus RaTG13 highlighted.
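The UPGMA clustering named in the caption of Fig. 1 can be illustrated with a minimal sketch. It assumes a precomputed pairwise distance matrix. The taxon labels A to D and the distance values are invented for illustration and are not taken from the study, and the function name upgma is hypothetical rather than the tool used to draw the figure.

```python
def upgma(labels, dist):
    """Cluster taxa by UPGMA (average linkage).

    labels: list of taxon names
    dist:   symmetric matrix (list of lists) of pairwise distances
    Returns the tree as nested tuples and the list of merge heights.
    """
    # Active clusters: id -> (subtree, number of leaves)
    clusters = {i: (labels[i], 1) for i in range(len(labels))}
    # Distances between active clusters, keyed by the pair of cluster ids
    d = {frozenset((i, j)): dist[i][j]
         for i in range(len(labels)) for j in range(i + 1, len(labels))}
    next_id = len(labels)
    heights = []
    while len(clusters) > 1:
        pair = min(d, key=d.get)            # closest pair of clusters
        a, b = sorted(pair)
        heights.append(d[pair] / 2.0)       # node height is half the distance
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        for k in clusters:
            d_ak = d.pop(frozenset((a, k)))
            d_bk = d.pop(frozenset((b, k)))
            # Size-weighted average keeps every leaf equally weighted
            d[frozenset((next_id, k))] = (n_a * d_ak + n_b * d_bk) / (n_a + n_b)
        del d[pair]
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, heights


# Toy distance matrix (invented values, not data from the paper)
toy_labels = ["A", "B", "C", "D"]
toy_dist = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
toy_tree, toy_heights = upgma(toy_labels, toy_dist)
print(toy_tree, toy_heights)
```

In this toy case, A and B are joined first because they are the closest pair, which mirrors how closely related genomes end up as sister taxa in a UPGMA tree.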
Satpathy International Journal on Emerging Technologies 11(3): 1157-1163(2020) 1158
Table 1: Comparison of the genome composition of the two genomes.
S. No.  Sequence details                                                         % T(U)  % C   % A   % G   Total base pairs
1.      Bat coronavirus RaTG13 complete genome                                   32.0    18.4  29.9  19.5  29855
2.      Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1 complete genome  32.0    18.3  29.9  19.6  29903
3.      Average of all the 100 sequences considered for the study                31.4    19.1  29.2  20.2  29594
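The composition values in Table 1 are per-base percentages of the genome length. A minimal sketch of such a calculation follows. The function name base_composition and the toy sequence are hypothetical, and excluding ambiguous bases (such as N) from the percentages is an assumption, since the source does not state how they were treated.

```python
from collections import Counter


def base_composition(sequence: str) -> dict:
    """Return % T(U), % C, % A, % G (one decimal place) and the sequence length."""
    # Remove whitespace and treat RNA uracil as thymine
    seq = "".join(sequence.split()).upper().replace("U", "T")
    counts = Counter(seq)
    # Assumption: percentages are taken over unambiguous bases only
    total = sum(counts[base] for base in "TCAG")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    result = {base: round(100.0 * counts[base] / total, 1) for base in "TCAG"}
    result["length"] = len(seq)
    return result


# Toy sequence for illustration only (not a real genome fragment)
print(base_composition("ATGCUUAACG"))
```

For a full genome, the sequence string would be read from a FASTA file with the header line removed before calling the function.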
The Bat coronavirus RaTG13 complete genome shares 99% query coverage and 96.12% sequence identity with the query sequence in the BLASTN output. Similarly, the Bat SARS-like coronavirus isolates form a common cluster, with 94-95% query coverage and 88-89% sequence identity. All the Bat SARS coronavirus HKU types show a common origin in the genome analysis and share 88-89% query coverage and 80-82% sequence identity with the novel coronavirus Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1 (Fig. 1). The evolutionary origin of the novel coronavirus 2019 and the Bat coronavirus RaTG13 genome was obtained in the present phylogenetic analysis and is highlighted in red in Fig. 1. Further, comparative genome analysis of these two genomes was performed using the GEvo server (https://genomevolution.org/coge/GEvo.pl); the program uses the BLASTZ algorithm to generate the alignment, and the results obtained with default parameters are shown in Fig. 2 [17]. Fig. 2 indicates the differences in the available gene structures and orders of the two genomes.
A recent study established that the Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1 contains a variable number of open reading frames (ORFs). About two-thirds of this genomic RNA is located in the first ORF (ORF1a/b), which translates into two polyproteins, pp1a and pp1ab, and encodes 16 non-structural proteins (NSPs), while the remaining ORFs encode accessory and structural proteins. The remaining part of the virus genome encodes four important structural proteins, viz. the small envelope (E) protein, spike (S) glycoprotein, nucleocapsid (N) protein, and matrix (M) protein, along with other accessory proteins that interfere with the host innate immune response. However, the Bat coronavirus RaTG13 contains proteins such as the orf1ab polyprotein, gene S (coding for the spike glycoprotein), non-structural proteins such as NS3, NS6, NS7a, NS7b, and NS8, the envelope protein, the membrane protein, and the nucleocapsid protein [18, 20]. Despite the variability in the genomic composition of the two selected genomes, their gene orders were found to be similar. Further annotation of the two genomes was carried out with the VISTA server (http://genome.lbl.gov/vista/wgvista/about.shtml) whole-genome alignment pipeline, to compute the conserved non-coding sequence (CNS) regions between the two selected genomes (Table 2).
Fig. 2. Alignment of the two genomes; ORFs are represented by arrows.
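The query coverage and sequence identity figures quoted from the BLASTN output can be illustrated for a single gapped pairwise alignment. The sketch below is a simplification: BLAST itself reports coverage over all aligned segments of a hit, whereas this hypothetical helper identity_and_coverage handles one alignment only. The aligned strings are toy data, not sequences from the study.

```python
def identity_and_coverage(aligned_query: str, aligned_subject: str, query_length: int):
    """Return (percent identity, percent query coverage) for one gapped alignment.

    aligned_query / aligned_subject: equal-length strings with '-' for gaps
    query_length: length of the full, unaligned query sequence
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query or query_length <= 0:
        raise ValueError("alignment and query must be non-empty")
    columns = len(aligned_query)
    # Identical, non-gap columns count as matches
    matches = sum(1 for q, s in zip(aligned_query, aligned_subject)
                  if q == s and q != "-")
    # Query bases that take part in the alignment
    query_bases = sum(1 for q in aligned_query if q != "-")
    identity = 100.0 * matches / columns
    coverage = 100.0 * query_bases / query_length
    return identity, coverage


# Toy alignment: 10 columns, 8 matches, 9 of 12 query bases aligned
print(identity_and_coverage("ACGT-ACGTA", "ACGTTACCTA", query_length=12))
```

Coverage and identity answer different questions: the first says how much of the query took part in the alignment, the second how similar the aligned part is, which is why both values are reported for each hit.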
Table 2: Conserved non-coding sequences between the two genomes obtained from the wgVISTA server.
S. No.  Bat coronavirus RaTG13      Wuhan seafood market pneumonia virus   No. of base
        complete genome             isolate Wuhan-Hu-1 complete genome     (bp)
        (Accession: MN_908947.3)    (Accession: NC_045512.2)
        From      To                From      To
1.      1         250               16        265                          250
2.      251       21537             266       21555                        21290
3.      21545     25354             21563     25384                        3822
4.      25363     26190             25393     26220                        828
5.      26215     26442             26245     26472                        228
6.      26493     27158             26523     27191                        669
7.      27169     27354             27202     27387                        186
8.      27360     27853             27394     27887                        494
9.      27860     28225             27894     28259                        366
10.     28240     29499             28240     29533                        1260
11.     29524     29640             29558     29674                        117
12.     29641     29855             29675     29890                        216
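The base counts in Table 2 relate to the From and To coordinates as inclusive span lengths. The short sketch below assumes 1-based, inclusive coordinates; the source does not state the convention, and the helper name span_length is hypothetical. It is applied to rows 1 and 4 of the table as examples.

```python
def span_length(start: int, end: int) -> int:
    """Length of a segment given 1-based, inclusive coordinates (assumed convention)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Rows 1 and 4 of Table 2:
# (S. No., first genome From, To, second genome From, To, reported bp)
rows = [
    (1, 1, 250, 16, 265, 250),
    (4, 25363, 26190, 25393, 26220, 828),
]
for number, from_1, to_1, from_2, to_2, reported in rows:
    print(number, span_length(from_1, to_1), span_length(from_2, to_2), reported)
```

Where the two genomes differ by insertions or deletions within a segment, the two span lengths need not be equal, so the comparison is only a rough consistency check.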
How to cite this article: Satpathy, R. (2020). In Silico based Whole Genome Phylogenetic Analysis of Novel
Coronavirus (SARS-CoV-2). International Journal on Emerging Technologies, 11(3): 1157–1163.