Bioinformatics is the study of biological data and the creation of
algorithms and statistical methods to analyze it.
Three subdisciplines of bioinformatics:
- Creating algorithms and statistical methods to sort data
- Analyzing and interpreting the sorted data
- Applying what was learned in the sorting process
Bioinformatics traces its roots to Gregor Mendel and continues today
with the mapping of organisms' genomes.
Bioinformatics includes protein structure information, DNA information,
gene information, patient information, and clinical trial information, as
well as information about the metabolic pathways of many species.
Bioinformatics can be used to simulate cell environments, which can help
custom-design drugs and personalize medicine. It can also be used to create
a better model for drug testing and replace the mouse. We have collected a
great deal of information about the human genome and the genomes of other
organisms; however, there is too much to analyze manually, so we must create
algorithms and programs that analyze the databases for us, or at least
organize them into digestible portions. In agriculture, bioinformatics can
map the genomes of plants and of the diseases that plague them, which helps
create plants (GMOs) that are resistant to those diseases and also to
drought.
Bioinformatics tools must be developed around:
- DNA sequence, which determines protein sequence
- Protein sequence, which determines protein structure
- Protein structure, which determines protein function
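The first link in the chain above (DNA sequence to protein sequence) is simple enough to sketch in code. The following is a minimal illustration, assuming the standard genetic code and a coding strand read from its first base; the function name `translate` and the use of `X` for unrecognized codons are choices made for this example only. The later links (sequence to structure, structure to function) are much harder problems and are not attempted here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence into one-letter amino acid codes.

    Reading starts at the first base and stops at the first stop codon.
    Codons containing unrecognized letters are reported as 'X'
    (an assumption of this sketch).
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTGGTAA"))  # prints MAW
```

Here `ATG`, `GCC`, and `TGG` code for methionine, alanine, and tryptophan, and `TAA` is a stop codon, so translation ends there.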
Computational biologists use the tools and systems that bioinformaticians
create to solve biological problems, developing algorithms and theories for
how to apply them.
A bioinformaticist knows how to create the systems, how to use them, and how
to make them easier for others to use; a bioinformatician only knows how to
create the systems.
There are primary databases, which contain data, and secondary databases,
which contain analyses of that data.
Data systems that already exist:
- Nucleotide sequence databases
- Collections of research publications and medical journals
- 3D structures of proteins
- Organism classification: biodiversity is largely unmapped, with no set
  database for it; the records are all on paper
- Enzymes and their functions
Important technical abilities:
- Java
- Perl: string manipulation
- Regular expression matching
- File parsing
- Data format interconversion
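As a small illustration of regular expression matching applied to sequence data, the sketch below searches a DNA string for the EcoRI recognition site `GAATTC`. It is written in Python rather than Perl, and the helper name `find_sites` and the sample sequence are made up for this example.

```python
import re


def find_sites(sequence, pattern):
    """Return the 0-based start position of each non-overlapping match."""
    return [match.start() for match in re.finditer(pattern, sequence.upper())]


ECORI_SITE = "GAATTC"  # recognition sequence of the EcoRI restriction enzyme

print(find_sites("ttgaattcggGAATTCa", ECORI_SITE))  # prints [2, 10]
```

Note that `re.finditer` reports only non-overlapping matches, which is fine for this site but would miss overlapping occurrences of a self-overlapping pattern.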
FASTA format: a text format that represents nucleotide or peptide sequences,
with each nucleotide or amino acid written as a single-letter code. The
sequence is a series of lines that should not exceed 120 characters and are
usually 80 characters long.
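File parsing and line wrapping for FASTA can be sketched as follows. This is a minimal example, assuming the usual FASTA convention that each record begins with a description line starting with `>` (a detail not mentioned in the notes above). The function names `parse_fasta` and `format_fasta` and the sample records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(records, width=80):
    """Write records as FASTA text, wrapping sequence lines at `width`."""
    lines = []
    for header, sequence in records:
        lines.append(">" + header)
        lines.extend(
            sequence[i:i + width] for i in range(0, len(sequence), width)
        )
    return "\n".join(lines) + "\n"


example = ">seq1 made-up example\nATGGCC\nTGGTAA\n>seq2\nMKV\n"
records = parse_fasta(example)
print(records)
print(format_fasta(records, width=5))
```

The default `width=80` follows the usual line length noted above; a smaller width is used in the demonstration only to make the wrapping visible.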
Possible topic for project:
- The structure of pathogens and its connection to the structure of
  functioning proteins in the human body, or the structure of pathogens
  against other pathogens.