Editing Mitochondrial DNA sequences

Select the sequence list containing the raw sequence data from the mitochondrial DNA control region. Double-click on the list to open it in a new window. In the General tab to the right of the sequence view, choose to display Colors according to Quality. This will highlight each base call according to the quality of the sequence at that base: the darker the blue, the lower the quality.

When zoomed out you won't see the individual bases or chromatogram peaks, but a graph will be visible that gives an indication of sequence quality. If you scroll down the sequences you'll see that the quality decreases dramatically at the end of each sequence. Zoom in to at least 50% to see what the chromatograms look like in good-quality versus poor-quality regions. One of the sequences (CLG3) contains no sequence, indicating that the sequencing reaction failed, so delete it from the list. Sequence SRE1 has only a short stretch of good-quality sequence before it becomes unreadable, so delete it as well. Save the edited sequence list and close the window.
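The visual check above can also be approximated outside Geneious Prime. The following is a minimal Python sketch, not part of the Geneious workflow, that flags reads with no usable stretch of good-quality bases. It assumes each read's quality values are available as a list of Phred scores; the function names and the thresholds (min_quality=20, min_good_run=100) are hypothetical and chosen only for illustration.

```python
def longest_good_run(qualities, min_quality=20):
    """Length of the longest run of consecutive bases with quality >= min_quality."""
    best = current = 0
    for q in qualities:
        current = current + 1 if q >= min_quality else 0
        best = max(best, current)
    return best


def flag_failed_reads(reads, min_good_run=100):
    """Return the names of reads whose longest good-quality run is too short.

    reads maps a read name to its list of Phred quality scores.
    Both thresholds are illustrative assumptions, not Geneious settings.
    """
    return [
        name
        for name, qualities in reads.items()
        if longest_good_run(qualities) < min_good_run
    ]


# Toy data: an empty read, a read that degrades early, and a good read.
example_reads = {
    "empty_read": [],
    "short_read": [30] * 10 + [5] * 50,
    "good_read": [30] * 200,
}
print(flag_failed_reads(example_reads))
```

Reads flagged this way are only candidates for removal. Inspecting the chromatograms, as described above, remains the reliable check.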

Trim the poor-quality bases off the ends of the sequences by clicking Annotate and Predict→Trim Ends. Choose "Remove new trimmed regions from sequences" and set the Error probability limit to 0.01, as shown in the screenshot below. Click OK, then click Save once the trimming is finished.
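To get a feel for what an error probability limit of 0.01 means, the sketch below implements a modified Mott-style quality trim in Python. It is an illustration under stated assumptions rather than Geneious Prime's actual code: it assumes Phred quality scores, converts each to an error probability with p = 10^(-Q/10), and keeps the segment that maximises the sum of (limit - p). The function names are hypothetical.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def mott_trim(qualities, limit=0.01):
    """Return (start, end) half-open indices of the region to keep.

    Each base scores (limit - error probability). The kept region is the
    contiguous segment with the highest total score. If no base scores
    above zero, (0, 0) is returned, meaning the whole read is trimmed.
    """
    best_score = 0.0
    best_start = best_end = 0
    running = 0.0
    start = 0
    for i, q in enumerate(qualities):
        running += limit - phred_to_error_probability(q)
        if running <= 0:
            # A segment ending here is not worth keeping: restart after it.
            running = 0.0
            start = i + 1
        elif running > best_score:
            best_score = running
            best_start, best_end = start, i + 1
    return best_start, best_end


# Toy read: poor quality at both ends, good quality in the middle.
qualities = [5, 5, 30, 40, 40, 30, 5, 5]
print(mott_trim(qualities))  # (2, 6)
```

With a limit of 0.01, a base needs a Phred score above 20 to contribute positively, so lowering the limit trims more aggressively and raising it trims less.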



From here it is more efficient to finish cleaning up and editing the sequences once they are aligned. Select the sequence list again and click Align/Assemble→Multiple Align. Select the MUSCLE alignment algorithm and run it with the default settings.

Double-click on the alignment to open it and zoom in to about 50% so you can see the base calls and chromatograms. You may need to check Show Graphs in the Graphs tab in order to see the chromatograms. Scroll along to the bases at the 3' end and you'll see that the base calls become weak after the GGGGGGGGAAGGGGGGGGG motif (see screenshot below). In many of the sequences the region following this motif has already been trimmed off. Trim the remaining sequences by clicking Allow Editing, selecting the bases from base 563 onwards on the consensus sequence, and pressing the Delete key. Editing the consensus sequence applies the change to all the sequences in the alignment. You should also delete the first 20 bases at the start of the alignment so that all the sequences are the same length, as this region has already been trimmed off in a number of the sequences.
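The effect of these two deletions can be sketched in Python as a column slice applied to every row of the alignment. This is an illustration only, assuming the alignment is held as a dictionary of equal-length aligned strings and that positions are 1-based alignment columns. The function name and the toy sequence names are hypothetical. Deleting from base 563 onwards and then the first 20 bases corresponds to keeping columns 21 to 562.

```python
def trim_alignment(alignment, first_kept, last_kept):
    """Keep alignment columns first_kept..last_kept (1-based, inclusive).

    alignment maps a sequence name to its aligned sequence (gaps as '-').
    The same columns are removed from every row, mirroring an edit made
    on the consensus sequence.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return {
        name: seq[first_kept - 1:last_kept]
        for name, seq in alignment.items()
    }


# Toy alignment of 10 columns: keep columns 3 to 8.
toy_alignment = {
    "seq_a": "AACCGGTTAA",
    "seq_b": "--CCGGTT--",
}
print(trim_alignment(toy_alignment, 3, 8))
# For the tutorial data the equivalent call would be
# trim_alignment(alignment, 21, 562), leaving 542 columns.
```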



Click Save and choose Yes when asked if you want to apply the changes to the original sequences. Note that it is sometimes preferable not to apply the changes to the original sequences, for example if you want to retain the original raw data.

This alignment can now be used to build a phylogenetic tree of these sequences using the Tree function in Geneious Prime. For more information on building and viewing phylogenetic trees in Geneious Prime, see this page.


Exercise 2: Handling bidirectional nuclear sequence data