Correlation vs Regression: When to Use Which (With Real Examples)
Correlation and regression are foundational to data science courses, and they are also among the most commonly misused techniques.
Correlation: Direction and Strength
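Pearson’s r ranges from −1 to +1. It measures the direction and strength of the linear relationship between two variables: the sign shows whether they move together or in opposite directions, and the magnitude shows how tightly. It says nothing about which causes which, and a curved relationship can give an r near 0 even when the variables are strongly related. Use our Correlation Calculator.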
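To make the definition concrete, here is a minimal sketch of Pearson's r in plain Python. The function name `pearson_r` and the `hours`/`scores` data are made up for illustration; they are not from any dataset in this article.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]  # deviations from the mean of x
    dy = [y - mean_y for y in ys]  # deviations from the mean of y
    co_dev = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return co_dev / math.sqrt(ss_x * ss_y)

# Made-up illustrative data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = pearson_r(hours, scores)
r_swapped = pearson_r(scores, hours)  # same value: r has no direction
print(f"r = {r:.3f}, swapped = {r_swapped:.3f}")
```

Swapping the two arguments returns the same number, which is one way to see why r alone cannot say which variable drives the other.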
Interpreting r
- |r| < 0.3 — weak
- 0.3 ≤ |r| < 0.7 — moderate
- |r| ≥ 0.7 — strong
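The thresholds above can be sketched as a small helper. The function name `strength` is hypothetical, and the cut-offs are this article's rule of thumb rather than a universal standard.

```python
def strength(r):
    """Label a correlation coefficient using the article's rule-of-thumb bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the sign gives direction; strength depends on magnitude only
    if size < 0.3:
        return "weak"
    if size < 0.7:
        return "moderate"
    return "strong"

for value in (0.12, -0.45, 0.84):
    print(value, strength(value))
```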
Regression: Quantifying the Relationship
Linear regression fits the line y = a + bx through your data, usually by least squares, letting you predict y from x. The intercept a is the predicted y when x = 0, and the slope b is the predicted change in y for a 1-unit change in x. Run yours in our Regression Calculator.
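As a rough sketch of how a and b are obtained, the ordinary least-squares formulas for a single predictor can be written in a few lines. The function name `fit_line` and the sample points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("x must vary to fit a slope")
    b = s_xy / s_xx          # slope: change in y per 1-unit change in x
    a = mean_y - b * mean_x  # intercept: the line passes through the means
    return a, b

# Made-up points that lie exactly on y = 1 + 2x
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(f"y = {a:.2f} + {b:.2f}x")
```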
When to Use Which
- Correlation — exploratory analysis, screening many variables, no prediction needed.
- Regression — prediction, quantification, controlling for multiple variables.
The Correlation ≠ Causation Reminder
Ice cream sales correlate with drownings. Neither causes the other: both rise in summer, when hot weather drives ice cream purchases and swimming alike. Always ask: is there a confounding variable?
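A small simulation can illustrate the confounding effect. Everything here is synthetic: the temperature range, the coefficients, and the noise levels are assumptions chosen only for the demonstration. Temperature drives both series, and neither series affects the other. Removing the part of each series explained by temperature (a simple form of partial correlation) should make the apparent relationship largely disappear.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(ss_x * ss_y)

def residuals(xs, ys):
    """What is left of ys after subtracting a least-squares line fit on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return [y - (a + b * x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: temperature is the confounder behind both series
temperature = [rng.uniform(10, 35) for _ in range(1000)]
ice_cream = [20 * t + rng.gauss(0, 50) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1) for t in temperature]

raw_r = pearson_r(ice_cream, drownings)
adjusted_r = pearson_r(residuals(temperature, ice_cream),
                       residuals(temperature, drownings))

print(f"raw r = {raw_r:.2f}")
print(f"r after removing temperature = {adjusted_r:.2f}")
```

The raw correlation comes out strong, while the adjusted one sits close to zero, which is the pattern expected when a third variable explains the link.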
Worked Example
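Marketing spend (x) and revenue (y) across 12 months. r = 0.84 (strong positive). Regression: y = 50,000 + 4.2x. Each additional $1 of marketing spend is associated with about $4.20 more revenue, on top of a $50,000 baseline. That is useful for budgeting, but only within the range of historical spend, and it does not by itself show that the spend caused the revenue.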
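Here is one way the fitted line might be used for budgeting, with a guard against predicting outside the observed range. The intercept and slope come from the example above; the historical spend range is not given in the article, so the bounds below are hypothetical placeholders, as is the function name `predict_revenue`.

```python
INTERCEPT = 50_000.0  # a, from the worked example
SLOPE = 4.2           # b, from the worked example

def predict_revenue(spend, observed_min, observed_max):
    """Predict revenue from marketing spend, refusing to extrapolate."""
    if not observed_min <= spend <= observed_max:
        raise ValueError(
            f"spend {spend} is outside the observed range "
            f"[{observed_min}, {observed_max}]; the fit may not hold there"
        )
    return INTERCEPT + SLOPE * spend

# Hypothetical bounds: replace with the min and max of your own 12 months
HYPOTHETICAL_MIN_SPEND = 5_000
HYPOTHETICAL_MAX_SPEND = 20_000

predicted = predict_revenue(10_000, HYPOTHETICAL_MIN_SPEND, HYPOTHETICAL_MAX_SPEND)
print(f"predicted revenue: ${predicted:,.0f}")
```

Raising an error instead of silently returning a number is a deliberate choice here: it forces whoever calls the function to decide consciously whether an out-of-range prediction is worth trusting.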
FAQs
Can I extrapolate beyond my data? Risky — relationships rarely stay linear forever.
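What’s R²? The proportion of variance explained. R² = 0.71 means 71% of the variance in y is explained by x. In simple linear regression R² is the square of r, so the worked example’s r = 0.84 gives R² ≈ 0.71.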
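A short sketch of how R² can be computed from a fitted line, and a check that it matches r² when there is a single predictor. The data points and function names are made up for illustration.

```python
import math

def r_squared(xs, ys):
    """R^2 = 1 - SS_res / SS_tot for a least-squares line fit of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    ss_tot = sum((y - mean_y) ** 2 for y in ys)                   # total
    return 1 - ss_res / ss_tot

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(ss_x * ss_y)

# Made-up illustrative data
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

r2 = r_squared(x, y)
r = pearson_r(x, y)
print(f"R^2 = {r2:.4f}, r^2 = {r * r:.4f}")

# The worked example's r = 0.84 squares to roughly 0.71
print(round(0.84 ** 2, 2))
```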