Statistics Quick Reference
A practical reference guide for essential statistics formulas and concepts. Bookmark this page for quick access during homework, research, or data analysis.
Measures of Central Tendency
Mean (Average)
Sum all values and divide by the count. The mean is sensitive to outliers.
Example: Data: 4, 8, 6, 5, 7
Mean = (4 + 8 + 6 + 5 + 7) / 5 = 30 / 5 = 6
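The calculation above can be reproduced in a few lines of Python. This is a minimal sketch; the extra value 100 is a hypothetical outlier added only to illustrate how sensitive the mean is.

```python
data = [4, 8, 6, 5, 7]

# Mean: sum of the values divided by how many there are.
mean = sum(data) / len(data)
print(mean)  # 6.0

# Hypothetical outlier (not part of the example data) to show sensitivity.
with_outlier = data + [100]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(round(mean_with_outlier, 2))  # 21.67
```

A single extreme value moves the mean from 6 to about 21.67, even though five of the six values are below 10.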
Median
The middle value when data is sorted in order. For an even number of values, take the average of the two middle values. The median is resistant to outliers.
Odd count: 3, 5, 7, 9, 11 → Median = 7
Even count: 3, 5, 7, 9 → Median = (5 + 7) / 2 = 6
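The odd and even cases can be sketched as a small function. The function name `median` is chosen here for illustration; Python's standard library provides `statistics.median`, which is used below as a cross-check.

```python
import statistics


def median(values):
    """Return the middle value of the sorted data (mean of the two middle values if the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 5, 7, 9, 11]))  # 7
print(median([3, 5, 7, 9]))      # 6.0

# Cross-check against the standard library.
print(statistics.median([3, 5, 7, 9]))  # 6.0
```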
Mode
The most frequently occurring value. A dataset can have no mode, one mode (unimodal), or multiple modes (bimodal, multimodal).
Example: 2, 3, 3, 5, 7 → Mode = 3
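One way to find the mode in Python is `statistics.multimode`, which returns every value tied for the highest frequency. The second and third datasets below are hypothetical, added only to illustrate the bimodal and no-mode cases.

```python
import statistics

# Example from the text: one mode.
one_mode = statistics.multimode([2, 3, 3, 5, 7])
print(one_mode)  # [3]

# Hypothetical bimodal data: two values tie for most frequent.
two_modes = statistics.multimode([1, 1, 2, 2, 3])
print(two_modes)  # [1, 2]

# Hypothetical data where every value occurs once: all values are returned,
# which is usually described as having no mode.
all_tied = statistics.multimode([4, 5, 6])
print(all_tied)  # [4, 5, 6]
```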
Measures of Spread & Variability
Range
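Range = Maximum - Minimum
The simplest measure of spread: the difference between the largest and smallest values. Easy to calculate but sensitive to outliers, because it depends only on the two extreme values.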
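As a quick sketch, the range is the largest value minus the smallest. The data reuses the values from the mean example, and the value 100 is a hypothetical outlier added for illustration.

```python
data = [4, 8, 6, 5, 7]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)
print(data_range)  # 4

# A single hypothetical outlier changes the range dramatically.
with_outlier = data + [100]
range_with_outlier = max(with_outlier) - min(with_outlier)
print(range_with_outlier)  # 96
```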
Variance
Population: σ² = Σ(x - μ)² / N
Sample: s² = Σ(x - x̄)² / (n - 1)
Variance is the average of the squared differences from the mean. Sample variance divides by n - 1 instead of n (Bessel's correction) so that it gives an unbiased estimate of the population variance.
Standard Deviation
Sample: s = √(Σ(x - x̄)² / (n - 1))
The square root of the variance. Standard deviation is in the same units as your data, making it more interpretable than variance.
Example: Data: 4, 8, 6, 5, 7 (Mean = 6)
Differences: -2, 2, 0, -1, 1
Squared: 4, 4, 0, 1, 1 → Sum = 10
Sample variance = 10 / 4 = 2.5
Standard deviation = √2.5 ≈ 1.58
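The worked example above can be checked step by step in Python. This is a sketch of the sample (n - 1) formulas; the standard library functions `statistics.variance` and `statistics.stdev` use the same divisor and serve as a cross-check.

```python
import math
import statistics

data = [4, 8, 6, 5, 7]
n = len(data)
mean = sum(data) / n

# Squared differences from the mean: 4, 4, 0, 1, 1.
squared_diffs = [(x - mean) ** 2 for x in data]

# Sample variance divides by n - 1 (Bessel's correction).
sample_variance = sum(squared_diffs) / (n - 1)
sample_std = math.sqrt(sample_variance)

print(sample_variance)       # 2.5
print(round(sample_std, 2))  # 1.58

# Cross-check with the standard library (both use n - 1).
print(statistics.variance(data), round(statistics.stdev(data), 2))
```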
Probability Basics
| Rule | Formula | Use When |
|---|---|---|
| Addition (OR) | P(A or B) = P(A) + P(B) - P(A and B) | Either event can occur |
| Multiplication (AND) | P(A and B) = P(A) × P(B \| A) | Both events must occur |
| Complement | P(not A) = 1 - P(A) | Event does not occur |
| Independent events | P(A and B) = P(A) × P(B) | Events don't affect each other |
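The rules in the table can be verified by counting outcomes. The sketch below assumes a hypothetical experiment, one roll of a fair six-sided die, with illustrative events A (even), B (greater than 3), and C (1 or 2); none of these come from the table itself.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
outcomes = set(range(1, 7))
A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is greater than 3
C = {1, 2}      # roll is 1 or 2


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = prob(A) + prob(B) - prob(A & B)
print(p_a_or_b, prob(A | B))  # 2/3 2/3

# Multiplication rule: P(A and B) = P(A) × P(B|A)
p_b_given_a = Fraction(len(A & B), len(A))
print(prob(A) * p_b_given_a, prob(A & B))  # 1/3 1/3

# Complement rule: P(not A) = 1 - P(A)
p_not_a = 1 - prob(A)
print(p_not_a, prob(outcomes - A))  # 1/2 1/2

# Independence: A and C are independent, A and B are not.
a_c_independent = prob(A & C) == prob(A) * prob(C)
a_b_independent = prob(A & B) == prob(A) * prob(B)
print(a_c_independent, a_b_independent)  # True False
```

Note that the simplified rule P(A and B) = P(A) × P(B) gives the wrong answer for A and B here, because knowing the roll is even changes the chance that it is greater than 3.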
Permutations & Combinations
Permutations (order matters)
P(n,r) = n! / (n-r)!
The number of ways to arrange r items chosen from n distinct items.
Combinations (order irrelevant)
C(n,r) = n! / (r!(n-r)!)
The number of ways to choose r items from n distinct items when the order of selection does not matter.
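Both formulas can be evaluated directly with factorials. The values n = 5 and r = 3 are hypothetical, chosen only for illustration; `math.perm` and `math.comb` (Python 3.8+) are shown as a cross-check.

```python
import math

# Hypothetical example: pick 3 items out of 5.
n, r = 5, 3

# P(n,r) = n! / (n-r)!
permutations = math.factorial(n) // math.factorial(n - r)

# C(n,r) = n! / (r!(n-r)!)
combinations = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(permutations, math.perm(n, r))  # 60 60
print(combinations, math.comb(n, r))  # 10 10
```

Each combination of 3 items can be ordered in 3! = 6 ways, which is why there are 6 times as many permutations as combinations.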
Common Distributions
Normal Distribution (Bell Curve)
One of the most widely used distributions in statistics. Many natural phenomena are approximately normally distributed. A normal distribution is fully defined by its mean (μ) and standard deviation (σ).
The 68-95-99.7 Rule:
- About 68% of data falls within 1 standard deviation of the mean
- About 95% of data falls within 2 standard deviations of the mean
- About 99.7% of data falls within 3 standard deviations of the mean
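The three percentages in the rule are rounded values of the normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library (3.8+):

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

coverage = {}
for k in (1, 2, 3):
    # Probability of landing within k standard deviations of the mean.
    coverage[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} standard deviation(s): {coverage[k]:.4f}")

# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

Because the rule is stated in standard deviations, the same proportions apply to any normal distribution, whatever its μ and σ.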
Z-Score
z = (x - μ) / σ
Tells you how many standard deviations a value is from the mean. A z-score of 2 means the value is 2 standard deviations above the mean; a negative z-score means the value is below the mean.
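As a short illustration, the z-score formula from the quick reference table, (x - μ) / σ, can be written as a function. The function name `z_score` and the test-score numbers are hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


# Hypothetical test scores: mean 70, standard deviation 10.
above = z_score(90, mu=70, sigma=10)
below = z_score(55, mu=70, sigma=10)

print(above)  # 2.0  (2 standard deviations above the mean)
print(below)  # -1.5 (1.5 standard deviations below the mean)
```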
Linear Regression
y = mx + b
Variables:
- y = predicted value
- m = slope (rate of change)
- x = independent variable
- b = y-intercept
R-squared (R²):
Measures how well the line fits your data. For a least-squares line it ranges from 0 to 1, with values closer to 1 indicating a better fit. An R² of 0.85 means 85% of the variation in y is explained by x.
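The sketch below fits a line of the form y = mx + b and computes R², using the variable names listed above. It assumes the ordinary least-squares method, which this guide does not name explicitly, and the data points are hypothetical.

```python
# Hypothetical data points for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Least-squares slope and intercept (assumed fitting method).
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)
m = s_xy / s_xx
b = y_mean - m * x_mean

# R² = 1 - (unexplained variation) / (total variation in y).
predicted = [m * x + b for x in xs]
ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
ss_tot = sum((y - y_mean) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(f"y = {m:.2f}x + {b:.2f}, R² = {r_squared:.2f}")
# y = 0.60x + 2.20, R² = 0.60
```

Here an R² of 0.60 means that 60% of the variation in the hypothetical y values is explained by x.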
Quick Reference Table
| Measure | Formula | Best For |
|---|---|---|
| Mean | Σx / n | Symmetric data without outliers |
| Median | Middle value | Skewed data or data with outliers |
| Mode | Most frequent | Categorical data |
| Std Dev | √(Σ(x-x̄)²/(n-1)) | Measuring spread |
| Z-Score | (x - μ) / σ | Comparing across datasets |
| Correlation | -1 ≤ r ≤ +1 | Strength and direction of a linear relationship |
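The table gives only the range of the correlation coefficient. As a sketch, the Pearson correlation coefficient, which is assumed here to be the r the table refers to, can be computed as follows. The data points are hypothetical.

```python
import math

# Hypothetical paired data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Pearson r: co-variation of x and y, scaled by the spread of each variable.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)
s_yy = sum((y - y_mean) ** 2 for y in ys)
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"r = {r:.4f}")  # r = 0.7746
```

A value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one.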