BIOINFORMATICS-basic

What is Bioinformatics/Computational Biology?
• Bioinformatics: collection and storage of biological information
• Computational biology: development of algorithms and statistical models to analyze biological data
• The terms bioinformatics and computational biology will be used interchangeably
What is Bioinformatics?

Source: http://ccb.wustl.edu/
Why is bioinformatics in demand?
• Very few people are adequately trained in both biology and computer science
• Genome sequencing, microarrays, etc. lead to large amounts of data to be analyzed
• Leads to important discoveries
• Saves time and money


What skills are needed?
• Well-grounded in one of the following areas:
◦ Computer science
◦ Molecular biology
◦ Statistics
• Working knowledge and appreciation of the others!


Brief history of bioinformatics: Databases
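• The first biological sequence database grew out of Margaret Dayhoff's Atlas of
Protein Sequence and Structure (first published in 1965), which later became the
Protein Identification Resource (PIR, established in 1984)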
• Dayhoff and co-workers organized the proteins into families
and superfamilies based on degree of sequence similarity
• The idea of sequence alignment was introduced, along with special
tables (substitution matrices) that reflect the frequency of changes
observed in the sequences of a group of closely related proteins
• Currently there are several large protein databases: SwissProt,
PIR International, etc.
• The first DNA database was established in 1979. Currently
there are several powerful databases: GenBank, EMBL, DDBJ,
etc.
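The substitution tables mentioned above start from a simple idea: count how often each pair of residues faces each other in alignments of closely related proteins. The following is a minimal sketch of that counting step only, not Dayhoff's full procedure. The function name `substitution_counts` and the toy sequences are hypothetical, chosen for illustration, and the input is assumed to be already aligned, with "-" marking a gap.

```python
from collections import Counter


def substitution_counts(aligned_pairs):
    """Count residue pairs observed at the same position in aligned sequences.

    aligned_pairs: iterable of (seq1, seq2) strings of equal length,
    where '-' marks a gap. Pairs are unordered, so (E, Q) and (Q, E)
    are counted together.
    """
    counts = Counter()
    for s, t in aligned_pairs:
        if len(s) != len(t):
            raise ValueError("aligned sequences must have equal length")
        for x, y in zip(s, t):
            if x == "-" or y == "-":
                continue  # gaps carry no substitution information
            counts[tuple(sorted((x, y)))] += 1
    return counts


# Toy, made-up alignments (illustrative only).
pairs = [("ACDE", "ACDQ"), ("AC-E", "ACDE")]
table = substitution_counts(pairs)
```

A real substitution matrix would go further, for example by normalizing these counts against background residue frequencies and converting them to log-odds scores.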
Brief history of bioinformatics: other important steps
• Development of sequence retrieval methods (1970-80s)
• Development of principles of sequence alignment (1980s)
• Prediction of RNA secondary structure (1980s)
• Prediction of protein secondary and 3D structure (1980-90s)
• The FASTA and BLAST methods for DB search (1980-90s)
• Prediction of genes (1990s)
• Studies of complete genome sequences (late 1990s-2000s)
Where Can You Learn More?
• ISCB: http://www.iscb.org/
• NCBI: http://ncbi.nlm.nih.gov/
• http://www.bioinformatics.org/
• Journals (Journal of Computational Biology, Bioinformatics, BMC Bioinformatics, …)
• Conferences (ISMB, RECOMB, PSB, InCoB, …)
Data Mining & Bioinformatics: Why?
• Many biological processes are not well understood
• Biological knowledge is highly complex, imprecise, descriptive, and experimental
• Biological data is abundant and information-rich
◦ Genomics & proteomics data (sequences), microarrays and protein arrays, protein database (PDB), bio-testing data
◦ Huge data banks, rich literature, openly accessible
◦ Among the largest and richest scientific data sets in the world
• Mining: gain biological insight (data/information → knowledge)
◦ Mining for correlations, linkages between disease and gene sequences, protein networks, classification, clustering, outliers, ...
◦ Find correlations among linkages in literature and heterogeneous databases
Algorithms Used in Bioinformatics
• Comparing sequences: compare large numbers of long sequences, allowing insertions, deletions, and mutations of symbols
• Constructing evolutionary (phylogenetic) trees: compare sequences of different organisms and build trees based on their degree of similarity (evolution)
• Detecting patterns in sequences
◦ Search for genes in DNA or subcomponents of a sequence of amino acids
• Determining 3D structures from sequences
◦ E.g., infer RNA shape from its sequence and protein shape from its amino acid sequence
• Inferring cell regulation:
◦ Cell modeling from experimental (say, microarray) data
• Determining protein function and metabolic pathways: interpret human annotations for protein function and develop graph databases that can be queried
• Assembling DNA fragments (provided by sequencing machines)
• Using scripting languages: scripts on the Web to analyze data and applications
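As an illustration of the first item, comparing sequences while allowing insertions, deletions, and mutations, here is a minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch approach). The slides do not name a specific algorithm, so this choice is an assumption, and so are the function name `global_alignment_score` and the scoring values (match +1, mismatch -1, gap -2).

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of strings a and b.

    Scoring values are illustrative assumptions: each matching pair
    adds `match`, each mutation adds `mismatch`, and each inserted or
    deleted symbol adds `gap`.
    """
    # prev[j] = best score for aligning the processed prefix of a with b[:j].
    # Initially the prefix of a is empty, so b[:j] is aligned to j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diag = prev[j - 1] + pair   # align a[i-1] with b[j-1]
            up = prev[j] + gap          # a[i-1] aligned to a gap (deletion)
            left = curr[j - 1] + gap    # b[j-1] aligned to a gap (insertion)
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


# Toy DNA strings (illustrative only): one deletion separates them.
score = global_alignment_score("ACGT", "AGT")
```

Keeping only two rows of the table is enough to get the score. Recovering the alignment itself would require storing the full table, or traceback pointers, and walking back from the last cell.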
