The simultaneous alignment of many nucleotide or amino acid sequences
is now an essential tool in molecular biology. Multiple alignments
are used to find diagnostic patterns to characterize protein families;
to detect or demonstrate homology between new sequences and existing
families of sequences; to help predict the secondary and tertiary structures
of the new sequences; to suggest oligonucleotide primers for PCR; and
as an essential prelude to molecular evolutionary analysis.
One of the most popular programs for performing multiple sequence
alignments is clustalw. EMBOSS has an interface to clustal called emma.
clustal (and thus emma) creates a multiple sequence alignment from
a group of related sequences using progressive pairwise alignments.
It can also produce a dendrogram showing the clustering relationships
used to create the alignment. The alignment procedure begins with the
pairwise alignment of the two most similar sequences, producing a cluster
of two aligned sequences. This cluster can then be aligned to the next
most related sequence or cluster of aligned sequences. Two clusters
of sequences can be aligned by a simple extension of the pairwise alignment
of two individual sequences. The final alignment is achieved by a series
of progressive, pairwise alignments that include increasingly dissimilar
sequences and clusters, until all sequences have been included in the
final pairwise alignment. When gaps are inserted into a sequence to
produce an alignment, they are inserted at the same position in all
the sequences of the cluster. Each pairwise alignment uses the method
of Needleman and Wunsch extended for use with clusters of aligned sequences.
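The pairwise step described above can be sketched in a few lines of Python. This is only a minimal illustration of Needleman and Wunsch style global alignment for two plain sequences, not the code used by clustal or emma. The scoring values (match, mismatch, gap) and the function names are assumptions chosen for the example; real programs use substitution matrices and separate gap opening and extension penalties. The small helper at the end shows the rule that a gap added to a cluster goes into the same column of every sequence in that cluster.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b.

    Returns (aligned_a, aligned_b, score). The scoring scheme is a toy
    assumption: a fixed match/mismatch score and a linear gap penalty.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for placing a[i-1] opposite b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def insert_gap_column(cluster, position):
    """Insert a gap at the same column in every sequence of an aligned cluster."""
    return [seq[:position] + "-" + seq[position:] for seq in cluster]


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("ACGT", "AGT")
    print(top)
    print(bottom)
    print("score:", total)
    print(insert_gap_column([top, bottom], 2))
```

Because the alignment is global, both returned strings always have the same length, which is the property noted for the emma output in the exercise below.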
Exercise 10 - emma and prettyplot
We have obtained a number of beta globin sequences for you and placed them all in a single text file.
Use emma to align the sequences. Change the "Output Sequence
Format" to "GCG MSF".
The output file displays the best areas of similarity among the sequences. This process has aligned sequences from humans, zebrafish, cows and chickens. The sequences are very similar,
but there are some differences - note the gaps that have been
inserted. Also note that since this is a global alignment algorithm,
gaps have been inserted to make all the sequences the same length.
Differences in alignment can be very difficult to see in this
format.
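If you want to check these two observations outside the web interface, the idea can be sketched in Python. This is an illustrative sketch only: it assumes you have already copied the aligned rows into a dictionary, and the short sequences shown are made-up placeholders, not the real beta globin alignment. It treats both "-" and "." as gap characters, since MSF files commonly use ".".

```python
GAP_CHARS = "-."  # assumption: characters that mark a gap in the alignment


def alignment_summary(rows):
    """Return (width, gap_columns) for a dict of name -> aligned sequence.

    Raises ValueError if the rows differ in length, which should never
    happen for the output of a global multiple alignment.
    """
    lengths = {len(seq) for seq in rows.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    width = lengths.pop()
    # A column counts as a gap column if any sequence has a gap there.
    gap_columns = [
        col
        for col in range(width)
        if any(seq[col] in GAP_CHARS for seq in rows.values())
    ]
    return width, gap_columns


if __name__ == "__main__":
    # Placeholder rows for illustration only.
    toy = {
        "human": "MVHLTPEEK",
        "chicken": "MVHWTAEEK",
        "zebrafish": "MVEWT--ER",
    }
    width, gaps = alignment_summary(toy)
    print("alignment width:", width)
    print("columns containing gaps (0-based):", gaps)
```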
The program prettyplot can enhance visualization of your
results by displaying the aligned sequences on top of one another. To use prettyplot, we need to get the sequence data from emma.
To do this, click the "outseq" link in the right pane.
You should see only the sequences.
Save this page as a text file (use Notepad). Now go back, click on prettyplot, select the file you
just saved as input, and run prettyplot.
A graphic display will appear on your screen detailing your
alignment. Identical residues are shown in red, and similar residues
in green. This type of display can give you a first impression
of the regions of conservation.
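The distinction between identical and similar columns can be sketched in Python as follows. This is a simplified illustration and not how prettyplot itself decides on colours: the similarity groups below are a hypothetical, hand-picked grouping of amino acids by rough chemical character, whereas alignment display programs generally rely on a scoring matrix and a consensus threshold.

```python
# Hypothetical similarity groups, chosen only for this example.
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # small hydroxyl
    set("NQ"),    # amide
]
GAP_CHARS = "-."


def classify_column(residues):
    """Label one alignment column: 'gap', 'identical', 'similar' or 'different'."""
    residues = [r.upper() for r in residues]
    if any(r in GAP_CHARS for r in residues):
        return "gap"
    distinct = set(residues)
    if len(distinct) == 1:
        return "identical"
    # Similar means every residue in the column falls in one group.
    if any(distinct <= group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


def classify_alignment(rows):
    """Classify every column of a list of equal-length aligned sequences."""
    return [classify_column(column) for column in zip(*rows)]


if __name__ == "__main__":
    # Placeholder rows for illustration only.
    toy = ["MKLA", "MRIG", "MKV-"]
    for index, label in enumerate(classify_alignment(toy)):
        print(index, label)
```

In a display like the one described above, columns labelled "identical" would correspond to the red residues and columns labelled "similar" to the green ones.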
Page last modified September 29, 2008