Terminology
- Log2
- log2(x) is the power that 2 must be raised to in order to get x: 2^? = x
- log2(16) = 4, because 2^4 = 16
Tools
- R Packages
- Limma-voom
- DESeq2
- edgeR
Tests
- Kolmogorov–Smirnov Test (KS test)
- Nonparametric test that quantifies the distance between a sample's distribution and a reference distribution (or between the distributions of two samples).
- It does not compare the sample means directly. It measures the largest gap between the cumulative distributions, so it reacts to differences in the shape or position of the distributions.
- Using the KS test here is unrealistic because it is underpowered compared to other, more clever statistical models.
- Log2 Fold Change
- How different the mean expression of a sample is from the mean expression of the control sample.
- The fold change is the mean of the sample divided by the mean of the control. The log2 fold change is log2 of that ratio.
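As a concrete illustration of the KS distance described in the Tests list above, here is a minimal sketch in plain Python. It assumes the two-sample form of the test (sample vs. control) and computes only the KS statistic, not a p-value. The function names and the toy data are made up for illustration.

```python
from bisect import bisect_right


def ecdf(values):
    """Return a function giving the fraction of values that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_statistic(sample, control):
    """Largest vertical gap between the two empirical CDFs."""
    f_sample, f_control = ecdf(sample), ecdf(control)
    # Both step functions only change at observed values,
    # so checking the pooled observations is enough.
    points = set(sample) | set(control)
    return max(abs(f_sample(x) - f_control(x)) for x in points)


# Hypothetical toy data, not taken from any real experiment.
control = [1.0, 2.0, 3.0, 4.0]
shifted = [3.5, 4.5, 5.5, 6.5]

print(ks_statistic(control, control))  # 0.0: identical samples
print(ks_statistic(shifted, control))  # 0.75: the CDFs are far apart
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap at all). Turning it into a p-value requires the KS distribution, which this sketch leaves out.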
Why Log2 Fold Change?
Taking log2 makes the effect size symmetric around 0, so the size of a change in expression reads the same whether expression went up or down. In the example below, both genes have a twofold change in expression, just in opposite directions.
- Gene 1
- mean(sample) = 4
- mean(control) = 2
- fold change = 4/2 = 2
- log2(2) = 1
- Gene 2
- mean(sample) = 2
- mean(control) = 4
- fold change = 2/4 = 1/2
- log2(1/2) = -1
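The two-gene example above can be reproduced with a short sketch. The individual expression values are hypothetical, chosen only so that the means come out to 4 and 2 as in the notes. The function name is also just for illustration.

```python
import math
from statistics import mean


def log2_fold_change(sample, control):
    """log2 of (mean of sample / mean of control)."""
    return math.log2(mean(sample) / mean(control))


# Hypothetical replicate values whose means are 4 and 2.
gene_1 = log2_fold_change(sample=[3, 4, 5], control=[1, 2, 3])
gene_2 = log2_fold_change(sample=[1, 2, 3], control=[3, 4, 5])

print(gene_1)  # 1.0  (expression doubled)
print(gene_2)  # -1.0 (expression halved)
```

Without the log, the same two genes would have fold changes of 2 and 0.5, which are not the same distance from the "no change" value of 1.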
Divide by Zero
To avoid a divide-by-zero or log2(0) error, we add a small number (a pseudocount epsilon, for example 1, 0.1, or 0.01) to both the numerator and the denominator of the fold change. This dilutes the fold change slightly because it pulls the ratio toward 1, and log2(1) = 0.
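A minimal sketch of the epsilon adjustment, assuming the means have already been computed. The function name and the default epsilon of 1 are illustrative choices, not a recommendation from any particular package.

```python
import math


def log2_fold_change(sample_mean, control_mean, epsilon=1.0):
    """log2 fold change with epsilon added to numerator and denominator."""
    return math.log2((sample_mean + epsilon) / (control_mean + epsilon))


# A control mean of 0 no longer raises ZeroDivisionError.
print(log2_fold_change(4, 0))

# The true log2 fold change of 4 vs. 2 is 1; epsilon shrinks it toward 0,
# and a larger epsilon shrinks it more.
print(log2_fold_change(4, 2, epsilon=1.0))
print(log2_fold_change(4, 2, epsilon=0.01))
```

The shrinkage is strongest for genes with low mean expression, where epsilon is large relative to the means themselves.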
Test Corrections
- Multiple-Test Correction
- Checking a p-value for every test gives you many chances for a coincidental hit (a false positive, i.e. a type I error).
- You must adjust your p-values or your threshold to account for the number of tests performed.
- Bonferroni Correction
- Multiply each p-value by the number of tests performed (capping the result at 1).
- Or, equivalently, divide your threshold by the number of tests performed.
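- Benjamini-Hochberg Correction
- Controls the false discovery rate (the expected fraction of false positives among the results called significant) instead of the chance of any false positive at all, so it is less strict than Bonferroni.
- Sort the p-values from smallest to largest, find the largest rank i whose p-value is at most (i / number of tests) * threshold, and call every test up to that rank significant.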
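The two corrections named above can be sketched in plain Python as adjusted p-values that are then compared against the usual threshold. This is an illustrative implementation of the standard procedures, not the code of any particular package. The function names and the p-values are made up.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that an adjusted value
    # never exceeds the adjusted value of a larger p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests.
p_values = [0.005, 0.02, 0.03, 0.5]
threshold = 0.05

bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)

print(sum(p < threshold for p in bonf))  # 1 test survives Bonferroni
print(sum(p < threshold for p in bh))    # 3 tests survive Benjamini-Hochberg
```

On the same inputs Benjamini-Hochberg keeps more tests than Bonferroni, because it tolerates a controlled fraction of false positives among the hits rather than trying to avoid any false positive.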