To study phylogenetics, you compare DNA or protein sequences and draw the relationships between them in the form of a tree.

Mutation leads to speciation




To conduct experiments involving phylogenetics you must have good sampling. Your samples need to be homologous, independent of one another, and variants of the original specimen on which the tree is based. Lastly, you need sequence alignments and statistical support for their arrangement on the tree.
There are two families of tree-building methods:
Distance methods
- UPGMA
- Neighbour Joining: sequences are compared using a BLOSUM or PAM matrix, and the distance between species is then rated and scaled based on their matrix scores.
- Good things: distance methods are computationally fast, and a single best tree is found in the end.
- Bad things: sometimes there isn't a single best tree, so reporting only one can hide alternatives that fit the data equally well.
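As a rough illustration of how a distance method works, here is a minimal UPGMA sketch. It assumes the simplest possible distance (the proportion of differing positions) rather than a BLOSUM or PAM score, and the four sequences A to D are made-up toy data, not real samples.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Repeatedly merge the two clusters with the smallest average distance.

    Returns the tree as nested tuples of sequence names.
    """
    # Pairwise distances between the original sequences.
    dist = {frozenset(p): p_distance(seqs[p[0]], seqs[p[1]])
            for p in combinations(seqs, 2)}
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in seqs]

    def average(c1, c2):
        # UPGMA distance: mean distance over all cross-cluster member pairs.
        total = sum(dist[frozenset((a, b))] for a in c1[1] for b in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: average(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical toy alignment.
sequences = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACGGAT",
    "D": "TCGAACGGAT",
}
tree = upgma(sequences)
print(tree)
```

With this toy data A pairs with B and C pairs with D. Note that the function returns exactly one tree even when two merges are tied, which is the "single best tree" limitation in practice.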
Character-based (discrete) methods
- Maximum parsimony
- Maximum likelihood: evaluates how likely the observed sequences are under each candidate tree, given a model of how mutations occur, then uses statistical analysis to find the tree with the highest likelihood and takes that as the best estimate of the correct tree. There are 4 bases, so in an unbiased model each base has a 0.25 probability of appearing at a given position. Positions are treated as independent, so you raise 0.25 to the power of the number of nucleotides in the sequence you are analyzing. Say you have a sequence 20 bases long: the probability of that specific sequence is 0.25^20, or about 9.1 x 10^-13. After that, the method calculates the chance of each change occurring over time along the branches of the tree (process portion).
- Advantages: it produces clear results, you can statistically analyze the results you receive, and it also gives you the other likely trees it found.
- Disadvantages: it is computationally intensive and becomes impractical for large datasets.
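The arithmetic above can be checked with a short sketch. The first part computes 0.25^20 directly. The second part is an assumption on my side: it uses the Jukes-Cantor model (not named in these notes) as one example of an unbiased model for the "process portion", and picks the distance between two made-up sequences that gives the highest likelihood.

```python
import math


def jc69_prob(same, d):
    """Jukes-Cantor probability for one site after d expected
    substitutions per site: base unchanged (same=True) or
    changed to one specific other base (same=False)."""
    decay = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences separated by distance d."""
    total = 0.0
    for x, y in zip(seq_a, seq_b):
        # 0.25 is the chance of the starting base in the unbiased model.
        total += math.log(0.25 * jc69_prob(x == y, d))
    return total


# Probability of one specific 20-nucleotide sequence: 0.25 per site.
p_sequence = 0.25 ** 20
print(f"{p_sequence:.3e}")

# Hypothetical toy sequences differing at 1 of 10 positions.
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTAT"

# Try many candidate distances and keep the most likely one.
candidates = [k / 1000 for k in range(1, 1001)]
best_d = max(candidates, key=lambda d: log_likelihood(seq_a, seq_b, d))
print(best_d)
```

Logs are summed instead of multiplying raw probabilities because the products become extremely small very quickly, as the 0.25^20 value already shows. A real maximum likelihood program does this search over whole trees and branch lengths, which is where the computational cost comes from.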
There is something called bootstrapping, which measures how well the data support each grouping: you resample the columns of your alignment with replacement many times, rebuild the tree from each resampled alignment, and then calculate how many times certain species are grouped together.
For example, suppose A and B have been grouped together in 100% of the trees (this number is arbitrary) and C and D in 75%. It is then very likely that A and B diverged from a more recent common ancestor, and likewise C and D. At 70-90% or above, the relationship is very probable. Anything less means it is a less probable relationship.
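A minimal sketch of the bootstrap count, under some loud assumptions: instead of rebuilding a full tree for each replicate, it uses a stand-in rule (two sequences count as "grouped" when each is the other's nearest neighbour, ties included), and the sequences are made-up toy data. The function names are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def grouped(seqs, a, b):
    """Stand-in for tree building: True if a and b are each other's
    nearest neighbour (ties count as grouped)."""
    d_ab = p_distance(seqs[a], seqs[b])
    others = [name for name in seqs if name not in (a, b)]
    return all(d_ab <= p_distance(seqs[a], seqs[o]) and
               d_ab <= p_distance(seqs[b], seqs[o]) for o in others)


def bootstrap_support(seqs, a, b, replicates=500, seed=0):
    """Fraction of resampled alignments in which a and b are grouped."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    length = len(seqs[a])
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        columns = [rng.randrange(length) for _ in range(length)]
        resampled = {name: "".join(s[c] for c in columns)
                     for name, s in seqs.items()}
        hits += grouped(resampled, a, b)
    return hits / replicates


# Hypothetical toy alignment.
sequences = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACGGAT",
    "D": "TCGAACGGAT",
}
support_ab = bootstrap_support(sequences, "A", "B")
support_cd = bootstrap_support(sequences, "C", "D")
print(support_ab, support_cd)
```

The two numbers printed are the kind of percentages written on the branches of a tree. With an alignment this short they swing a lot between runs with different seeds, which is one reason real analyses use long alignments and many replicates.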
