n denotes the sample size.
Measures of Center and Spread
- Range
- The difference between the highest and lowest values.
- Median
- The middle value when the values are sorted.
- If there is an even number of values, take the average of the middle two values.
- Mode
- The most frequent value or values.
- Mean
- The average of all the values: their sum divided by n.
- Variance
- Sample variance estimates population variance.
- Sample Variance s²
- Known, a statistic.
- Population Variance σ²
- Unknown, a parameter.
- We estimate it.
- Standard Deviation
- Equals the square root of the variance.
- Sample standard deviation estimates population standard deviation.
- A small std. dev. means the data values are grouped close to the mean.
- The bigger the std. dev., the more spread out the data is.
- Sample Standard Deviation s
- Known, a statistic.
- Population Standard Deviation σ
- Unknown, a parameter.
- We estimate it.
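The relationships above can be written as formulas. These are the standard textbook definitions rather than something spelled out in these notes; the symbol x_i (the i-th data value) is notation assumed here, and n is the sample size.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2
\qquad
s = \sqrt{s^2}
```

Dividing by n - 1 rather than n is what makes s² a sample estimate of the population variance σ².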
Don't forget units!
- Sample Average x̄
- The average of all of the sample data.
- Sample Variance s²
- Units are squared, so the variance is in a different context than the original data.
- Sample Standard Deviation s
- Roughly speaking, each data point is about X units away from the mean of Y units.
Standard deviation uses the same units as the original data.
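The standard deviation is roughly the typical distance from a data point to the mean. Strictly, it is the square root of the average squared distance, not the plain average distance. It can be thought of as how much a typical data point differs from the mean.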
An outlier is a data value that is very different from the rest of the data and lies far from the center. If there are extreme values in the data, the median is a better measure of the center than the mean. The mean is not a resistant measure; the median and mode are resistant measures.
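A small sketch of what "resistant" means, using Python's standard `statistics` module. The data values are made up for illustration and do not come from these notes.

```python
from statistics import mean, median, mode

# Hypothetical sample, then the same sample with one extreme value added.
data = [2, 3, 3, 4, 5]
with_outlier = data + [100]

# The mean is pulled strongly toward the outlier.
mean_before = mean(data)
mean_after = mean(with_outlier)

# The median and mode barely move, or do not move at all.
median_before = median(data)
median_after = median(with_outlier)
mode_before = mode(data)
mode_after = mode(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
print("mode:  ", mode_before, "->", mode_after)
```

In this made-up sample the mean jumps from 3.4 to 19.5, while the median only shifts from 3 to 3.5 and the mode stays at 3.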
Excel Functions
- Sample Average x̄
- =AVERAGE(highlighted data values here)
- Median
- =MEDIAN(highlighted data values here)
- Mode
- =MODE.MULT(highlighted data values here)
- Sample Variance s²
- =VAR.S(highlighted data values here)
- Sample Standard Deviation s
- =STDEV.S(highlighted data values here)
- Range
- =MAX(highlighted data values here)-MIN(highlighted data values here)
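For checking a spreadsheet result outside Excel, the same quantities can be sketched with Python's standard `statistics` module (Python 3.8 or later for `multimode`). The data list is a hypothetical sample, and the pairing of each call with an Excel function is based on both using the sample (n - 1) formulas.

```python
import statistics

# Hypothetical sample; replace with your own data values.
data = [2, 4, 4, 5, 7, 8]

sample_mean = statistics.mean(data)        # like =AVERAGE(...)
sample_median = statistics.median(data)    # like =MEDIAN(...)
sample_modes = statistics.multimode(data)  # like =MODE.MULT(...), returns a list
sample_var = statistics.variance(data)     # like =VAR.S(...), divides by n - 1
sample_sd = statistics.stdev(data)         # like =STDEV.S(...), square root of the above
data_range = max(data) - min(data)         # like =MAX(...)-MIN(...)

print("mean:", sample_mean)
print("median:", sample_median)
print("modes:", sample_modes)
print("variance:", sample_var)
print("std. dev.:", round(sample_sd, 4))
print("range:", data_range)
```

One difference to keep in mind: when no value repeats, `multimode` returns every value in the list, whereas Excel's mode functions report an error.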