Hypothesis Testing
4. Hypothesis Testing — Making Decisions with Data
4.1 The Framework
🧠 Think of it as a courtroom: H₀ = "Accused is innocent" (default). H₁ = "Accused is guilty" (claim). Evidence = Data. If the evidence is so strong that it would be very unlikely under innocence (p < 0.05) → Guilty. If the evidence is weak → Not Guilty, which means guilt was not proven, not that innocence was.
4.2 p-value Explained
p-value = "If H₀ is true, what is the probability of observing results this extreme or more?"
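- p = 0.03 → If H₀ were true, results this extreme would occur only 3% of the time → Strong evidence against H₀ → Reject H₀ (at α = 0.05)
- p = 0.45 → If H₀ were true, results this extreme would occur 45% of the time → Data are consistent with H₀ → Fail to Reject H₀
Note: the p-value is not the probability that H₀ is true, nor the probability that the result is "just random".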
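To make the definition concrete, here is a minimal sketch that computes a p-value directly as a tail probability under H₀. The coin-flip data (60 heads in 100 flips) and the function name `binomial_upper_tail` are hypothetical, chosen only for illustration.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): a one-sided p-value under H0."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical data: 60 heads in 100 flips; H0 says the coin is fair (p = 0.5).
p_value = binomial_upper_tail(60, 100, 0.5)
print(f"p-value = {p_value:.4f}")  # roughly 0.03, so H0 is rejected at alpha = 0.05
```

The sum covers every outcome at least as extreme as the observed one, which is exactly the "this extreme or more" part of the definition.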
Worked Problem — Business Hypothesis:
A marketing team claims their new email subject line has increased open rates from the old 22% to 27%.
H₀: Open rate = 22% (no change)
H₁: Open rate > 22% (improvement)
Data: 500 emails sent, 135 opened → Observed rate = 27%
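Test: One-proportion z-test (one-sided, because H₁ is "greater than")
Result: z ≈ 2.70, p-value ≈ 0.003 (using the standard error under H₀; a two-sided test with the sample-based standard error gives p ≈ 0.012)
Decision: p ≈ 0.003 < α = 0.05 → REJECT H₀
Conclusion: The data give statistically significant evidence that the open rate with the new subject line is above 22% ✅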
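The worked problem can be reproduced with a few lines of standard-library Python. This is a sketch, not a reference implementation: the function name `one_proportion_ztest` and its two flags are assumptions made for illustration. With the standard error taken from H₀ and a one-sided alternative, the p-value comes out near 0.003; a two-sided test with the sample-based standard error gives about 0.012. Both lead to rejecting H₀ at α = 0.05.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_ztest(successes, n, p0, se_from_null=True, two_sided=False):
    """z-test for a single proportion against the hypothesised value p0."""
    p_hat = successes / n
    # Standard error either under H0 (textbook choice) or from the sample.
    p_for_se = p0 if se_from_null else p_hat
    se = sqrt(p_for_se * (1 - p_for_se) / n)
    z = (p_hat - p0) / se
    upper_tail = 1 - NormalDist().cdf(z)
    p_value = 2 * min(upper_tail, 1 - upper_tail) if two_sided else upper_tail
    return z, p_value

# Data from the worked problem: 135 opens out of 500 emails, H0: rate = 22%.
z, p_one_sided = one_proportion_ztest(135, 500, 0.22)
z_alt, p_two_sided = one_proportion_ztest(135, 500, 0.22,
                                          se_from_null=False, two_sided=True)
print(f"one-sided, null SE:   z = {z:.2f}, p = {p_one_sided:.4f}")
print(f"two-sided, sample SE: z = {z_alt:.2f}, p = {p_two_sided:.4f}")
```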
4.3 Type I vs Type II Errors
| Error | What Happened | Analogy | Business Example |
|---|---|---|---|
| Type I (α) | Rejected H₀ when it was true | Convicting innocent | Concluding campaign worked when it didn't → wasted budget |
| Type II (β) | Failed to reject H₀ when H₁ was true | Letting guilty go free | Concluding campaign failed when it actually worked → missed opportunity |
🧠 How to remember: Type I = "I wrongly accused" (the innocent got caught). Type II = the criminal "eIIuded" justice (the guilty got away).
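Both error types can be seen in a small simulation. The sketch below assumes normally distributed data with a known standard deviation of 1 and a one-sided z-test of H₀: mean = 0. The sample size, the true effect of 0.3, the seed, and the function name `rejection_rate` are all hypothetical choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def rejection_rate(true_mean, n=50, alpha=0.05, trials=2000, seed=0):
    """Share of simulated experiments where a one-sided z-test rejects H0: mean = 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = fmean(sample) * sqrt(n)  # (sample mean - 0) / (1 / sqrt(n))
        if z > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type I error, so the rate should sit near alpha.
type_1_rate = rejection_rate(true_mean=0.0)
# H1 is true (mean = 0.3): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.3)
print(f"Type I rate  ~ {type_1_rate:.3f}")
print(f"Type II rate ~ {type_2_rate:.3f}")
```

The Type I rate is controlled by the chosen α, while the Type II rate depends on the sample size and on how large the true effect is.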
4.4 Statistical Power
Power = 1 - β = Probability of correctly rejecting H₀ when H₁ is true.
- Standard target: 80% power (β = 0.20)
- Higher power requires a larger sample size, all else being equal
Factors that increase power:
- Larger sample size
- Larger effect size
- Higher significance level (α), at the cost of more Type I errors
- Lower variability in data
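The effect of these factors can be checked numerically. The sketch below uses the normal approximation for a one-sided one-proportion z-test and reuses the 22% and 27% open rates from the worked problem; the function name `power_one_proportion` and the alternative scenarios (n = 1000, a true rate of 30%, α = 0.10) are assumptions for illustration. Variability is not varied separately here because, for a proportion, it is determined by the rate itself.

```python
from math import sqrt
from statistics import NormalDist

def power_one_proportion(p0, p1, n, alpha=0.05):
    """Approximate power of a one-sided (greater) one-proportion z-test.

    p0 is the rate under H0, p1 the assumed true rate under H1.
    """
    norm = NormalDist()
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error if H0 is true
    se1 = sqrt(p1 * (1 - p1) / n)  # standard error if H1 is true
    # Smallest observed rate that leads to rejecting H0.
    critical_rate = p0 + norm.inv_cdf(1 - alpha) * se0
    # Probability of exceeding that rate when the true rate is p1.
    return 1 - norm.cdf((critical_rate - p1) / se1)

base = power_one_proportion(0.22, 0.27, n=500)
bigger_sample = power_one_proportion(0.22, 0.27, n=1000)
bigger_effect = power_one_proportion(0.22, 0.30, n=500)
higher_alpha = power_one_proportion(0.22, 0.27, n=500, alpha=0.10)

for label, value in [("baseline", base), ("n = 1000", bigger_sample),
                     ("true rate 30%", bigger_effect), ("alpha = 0.10", higher_alpha)]:
    print(f"{label:>14}: power = {value:.2f}")
```

Under these assumptions the baseline scenario lands a little above the usual 80% target, and each of the three changes pushes power higher.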
4.5 Common Statistical Tests
| Test | When to Use | Example | Data Type |
|---|---|---|---|
| One-sample t-test | Compare sample mean to known value | "Is our avg delivery time different from 3 days?" | 1 continuous variable |
| Two-sample t-test | Compare means of 2 groups | "Are Delhi and Mumbai avg order values different?" | 1 continuous + 1 categorical (2 groups) |
| Paired t-test | Compare the same group at two time points | "Did training improve employee scores?" | 2 paired continuous measurements |
| Chi-Square | Test association between categorical variables | "Is gender linked to product preference?" | 2 categorical variables |
| ANOVA | Compare means across 3+ groups | "Are revenues different across cities?" | 1 continuous + 1 categorical (3+ groups) |
| Pearson Correlation | Linear relationship between 2 continuous vars | "Is there a link between ad spend and sales?" | 2 continuous variables |
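As one example from the table, a chi-square test of independence on a 2×2 table can be computed by hand. This is a sketch with made-up counts (gender by product preference) and a hypothetical function name `chi_square_2x2`. It uses the plain Pearson statistic without a continuity correction, so library routines that apply Yates' correction by default will report a slightly different value.

```python
from math import sqrt
from statistics import NormalDist

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, chi2 is the square of a standard normal,
    # so P(chi2 > x) = 2 * (1 - Phi(sqrt(x))).
    p_value = 2 * (1 - NormalDist().cdf(sqrt(chi2)))
    return chi2, p_value

# Hypothetical counts: rows = gender, columns = preferred product (A, B).
counts = [[30, 20],
          [20, 30]]
chi2, p_value = chi_square_2x2(counts)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

The normal-distribution shortcut for the p-value only holds for 2×2 tables; larger tables need the chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom.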