We will now run DESeq2 to calculate differential expression between the two sample conditions. With the reference sequence selected, go to Annotate and Predict → Compare Expression Levels. Choose DESeq2 as the Method.
Because we previously added a meta-data field for sample condition on our reads, Geneious has detected this and offered to automatically assign conditions using this field. You should see that all the "Plk" tracks are assigned to condition A, and all the "Vnt" tracks are assigned to condition B. If you do not have a field on your sequences for assigning sample conditions, you can set this manually.
Note that to assign conditions automatically by field, any field can be used as long as it contains only 2 different values. Geneious will detect valid fields and display them as options for automatic assignment.
The DESeq2 options should look like this:
Click OK to run the analysis.
The first time you run DESeq2, Geneious will download and install R and all the required packages. This will add a few extra minutes onto the analysis time. After the analysis is finished, you will see an extra track on your reference sequence called "Diff Expression, Sample condition, planktonic vs Squid-Associated". Note that because we used a meta data field for assigning sequences, the track has automatically been named according to the values in this field.
DESeq2 uses the raw read count data for differential expression analysis. For a full description of the method, please refer to the DESeq2 website and paper. To view the R script that Geneious runs click the "R Script" link in the DESeq2 setup window.
Ensure you have saved your document, then switch to the Annotations table to view the differential expression results in tabular format. Under Track, choose to display only the Differential Expression track. You should see columns in the table for Differential Expression Log2 ratio, Expression Confidence, and p value. If you do not see these columns, add them by clicking the Columns button. The Log2 ratio column shows the fold change in expression between the sample condition A and sample condition B - if this value is positive, the gene is up-regulated in sample condition B (in this case, the squid associated samples), and if negative it is down-regulated. Click on the header for this column to sort the table by the log2 ratio. You will see that many of the top up-regulated genes are lux genes. These are within the lux operon, which plays a key role in controlling bioluminescence.
The adjusted p-value tells you whether the differential expression observed is statistically significant. This value is adjusted to account for multiple tests.
The Differential Expression Absolute Confidence is negative base 10 log of the adjusted p value, so is another way of expressing the significance. For example, a confidence value above 2 equates to a p value of less than 0.01. The Differential Expression Confidence column shows the same information, but the confidence values are adjusted to be positive for genes that are up-regulated in sample condition B compared to sample condition A, or negative for down-regulated genes. By default, Geneious uses this value to color the annotations: from blue for down-regulated genes, through to white for genes that are not differentially expressed, through to red for genes that are up-regulated.
Additional graphical tools for visualising your results are available in tabs above the sequence viewer.
The PCA (Principal Component Analysis) Plot is a method for displaying the amount of variance in the data and can be used to check whether the replicates cluster together as a form of quality control. A "good" PCA plot should show that samples from the same sample condition cluster together and that the clusters should be reasonably well-separated along the X axis. In this example you'll see that the Plk (planktonic) replicates are close together and are separated along the X axis from the Vnt (squid-associated) replicates. However one of the Vnt replicates, Vnt1, does not cluster with the other two. This may be the result of natural biological variation, but if you see this sort of pattern in your data you should also investigate the potential for experimental errors and possibly even discard the sample from the analysis.
In Geneious R11 onwards, results can also be displayed in a Volcano Plot. This shows the fold change (Log2 Ratio) plotted against the significance (-Log10 adjusted p value). The "outliers" on this graph represent the most highly differentially expressed loci. When you mouse over each dot on the graph you will see the gene name - click on the dot to display the gene name on the graph.
The dots on the Volcano plot are linked with both the annotations table and the annotations in the sequence viewer. Try switching back to the annotations table, sort it by Differential Expression Absolute Confidence and then select the top 10 genes. Now switch back to the Volcano Plot and you will see those genes are now labelled on the plot. The Volcano Plot also has several Advanced view options, as shown in the screenshot below. To access these, click the Advanced button above the viewer. Here you can add dotted lines showing expression ratio and p-value thresholds, and color dots that fall outside those lines.