Data
Standard Deviation vs Variance: When to Use Which
Variance and standard deviation both measure how spread out a dataset is. They’re mathematically twins — variance is just standard deviation squared. So why do we have both, and when do you reach for which one?
The Formulas
Variance (population): σ² = Σ(xᵢ − μ)² ÷ N
Standard deviation (population): σ = √σ²
For a sample, divide by n − 1 instead of N (Bessel’s correction):
Sample variance: s² = Σ(xᵢ − x̄)² ÷ (n − 1)
Compute either in our Standard Deviation Calculator.
Why Two Numbers For The Same Thing?
Variance is mathematically convenient — it’s additive for independent variables, plays nicely in regression, and is the workhorse of ANOVA and most inferential statistics.
Standard deviation is interpretable — it’s in the same units as your data. If you measured customer ages, σ is in years. Variance is in “years squared,” which means nothing to anyone outside a stats class.
The Rule of Thumb
- Reporting and communication: use standard deviation.
- Analysis and modeling: use variance.
Sample vs Population: Which N?
If you have every data point in your population (e.g., every employee at a 50-person company), divide by N. If you have a sample drawn from a larger population (almost every real-world case in business analytics), divide by n − 1. Spreadsheet functions: STDEV.P/VAR.P for population, STDEV.S/VAR.S for sample.
The Empirical Rule (For Roughly Normal Data)
- ~68% of data falls within ±1 SD of the mean
- ~95% within ±2 SD
- ~99.7% within ±3 SD
This is what makes SD useful as a “typical deviation” measure.
Worked Example
Five customer order values: $40, $42, $50, $55, $63. Mean: $50.
- Squared deviations: 100, 64, 0, 25, 169 → sum 358
- Sample variance: 358 ÷ 4 = 89.5
- Sample SD: √89.5 ≈ $9.46
You’d tell a stakeholder: “Average order is $50, with a typical spread of about ±$9.” You wouldn’t say “variance is 89.5 dollars-squared.”
Coefficient of Variation: SD in Context
SD alone can mislead. A SD of $9 is huge for $50 orders but tiny for $50,000 orders. The coefficient of variation (CV) = SD ÷ mean normalizes it. CV of 0.19 (above) means the spread is 19% of the average.
FAQs
Why divide by n−1? Sample variance with N underestimates the true population variance. Subtracting one degree of freedom corrects the bias.
Is high SD bad? Not inherently — it just means more variability. Whether that’s good (diverse customer base) or bad (unstable manufacturing) depends on context.
Can SD be negative? No. It’s a square root of a sum of squares — always ≥ 0.