
Linear regression

In this chapter: Single-variable regression · Coefficient interpretation, R², SSE · Beta estimation in CAPM


Regression fits a line through data: how does Y change as X changes? Equity beta — how much a stock moves with the market — comes from a regression. Factor models, performance attribution, return forecasting — all regression. CFA L1 tests single-variable regression; L2 deepens to multiple regression. Master the single-variable case here and the rest follows.

Foundation

Linear regression: Y = α + β·X + ε
  • Y: dependent variable (what we predict)
  • X: independent variable (the predictor)
  • α: intercept, the value of Y when X = 0
  • β: slope, the change in Y per unit change in X
  • ε: error term (residual), the variation in Y that X does not explain

For finance:
  • Y = stock return
  • X = market return
  • β = sensitivity to market moves
  • α = return above what CAPM predicts (Jensen's alpha, often read as manager skill) when both returns are measured in excess of the risk-free rate

Least-squares estimation: choose α and β to minimise the sum of squared residuals (SSE).

Key formula: β = Cov(X, Y) / Var(X) = ρ_{XY} × σ_Y / σ_X

R²: the fraction of Y's variance explained by X, bounded between 0 and 1. The higher it is, the more of Y's variation X explains.
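The formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production routine: the function name `fit_simple_ols` and the return figures are made up for the example, and the returns are assumed to be in percent.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = alpha + beta * x + residual."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # beta = Cov(X, Y) / Var(X); the 1 / (n - 1) factors cancel.
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # SSE is the sum of squared residuals; R^2 = 1 - SSE / total variation.
    sse = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - sse / syy
    return alpha, beta, r_squared, sse


# Hypothetical monthly returns in percent (invented for illustration).
market = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0]
stock = [1.4, -2.1, 3.5, 0.2, -1.0, 2.6]
alpha, beta, r_squared, sse = fit_simple_ols(market, stock)
print(f"alpha={alpha:.3f}  beta={beta:.3f}  R2={r_squared:.3f}  SSE={sse:.3f}")
```

Computing β from the cross-product sums is the same as Cov(X, Y) / Var(X), and the intercept follows from the fact that the fitted line passes through the point of means.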

Deep Dive

CAPM regression, the workhorse of equity finance. Monthly returns of HDFC Bank regressed on NIFTY 50 monthly returns over 36 months:

Result: HDFC Bank β = 1.05, α = 0.3% per month, R² = 0.65

Interpretation:
  • β = 1.05: HDFC Bank moves 1.05× as much as NIFTY on average (slightly aggressive).
  • α = 0.3% per month: return beyond what beta predicts. Annualised (simple, 0.3% × 12): 3.6%. Could be skill, could be luck.
  • R² = 0.65: 65% of HDFC's return variation is explained by NIFTY. The other 35% is HDFC-specific (idiosyncratic risk).

The standard error of the β estimate determines whether β is statistically distinguishable from a hypothesised value (e.g., β = 1).

Is α significant? Compute the t-statistic t = α / SE(α). If |t| > 2 (roughly the 5% two-tailed critical value), reject H₀ that α = 0.

For most active managers, α is statistically indistinguishable from zero over typical sample sizes. Detecting real skill needs many years of data and high active risk.
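To make the significance tests concrete, here is a rough sketch of how the standard errors and t-statistics come out of a single-variable regression. It uses the textbook formulas SE(β) = √(s² / Σ(x − x̄)²) and SE(α) = √(s² (1/n + x̄² / Σ(x − x̄)²)), where s² = SSE / (n − 2). The function name `ols_t_stats` and the toy data are assumptions for illustration, not the HDFC Bank sample.

```python
import math


def ols_t_stats(x, y, beta_null=1.0):
    """Simple OLS with standard errors and t-statistics for alpha and beta."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    sse = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    # Residual variance with n - 2 degrees of freedom (alpha and beta estimated).
    s2 = sse / (n - 2)
    se_beta = math.sqrt(s2 / sxx)
    se_alpha = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return {
        "alpha": alpha,
        "beta": beta,
        "se_alpha": se_alpha,
        "se_beta": se_beta,
        "t_alpha": alpha / se_alpha,              # tests H0: alpha = 0
        "t_beta": (beta - beta_null) / se_beta,   # tests H0: beta = beta_null
    }


# Toy data, invented for illustration only.
result = ols_t_stats([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Note that the β test here is against a hypothesised value of 1 (the market's own beta), while the α test is against 0. Comparing |t| with 2 is the rule of thumb from the text; an exact test would use the t-distribution with n − 2 degrees of freedom.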

Advanced

Beyond simple OLS, practitioner concerns:
  1. Heteroscedasticity: the variance of the residuals changes with X. Common in cross-sectional regressions across firms of different sizes. Fix: White or Newey-West standard errors.
  2. Autocorrelation: residuals are correlated across observations (typical in time series). Fix: Newey-West standard errors, or include lagged variables.
  3. Outlier sensitivity: a few extreme observations can dominate β estimates. Robust regression methods downweight outliers.
  4. Non-stationarity: regressing one trending variable on another produces "spurious" results. Test for stationarity first.
  5. Multicollinearity (in multiple regression, an L2 topic): independent variables are correlated with each other. This inflates standard errors and makes β estimates unstable.
  6. Time-varying β: real β changes over time. Rolling-window regression captures this.

CFA L1 tests recognition of these issues; L2 tests the fixes; the practitioner deals with all of them daily.
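The rolling-window idea in item 6 can be sketched as follows: re-estimate β on each block of consecutive observations and watch how it drifts. This is an illustrative sketch only; `rolling_beta` is a hypothetical helper name, the return series are invented, and a real study would use a much longer window (for example 36 or 60 months).

```python
def rolling_beta(stock, market, window):
    """Return one OLS beta per full window of consecutive observations."""
    if len(stock) != len(market):
        raise ValueError("stock and market must have the same length")
    if window < 2 or window > len(stock):
        raise ValueError("window must be between 2 and the series length")
    betas = []
    for start in range(len(stock) - window + 1):
        xs = market[start:start + window]
        ys = stock[start:start + window]
        mean_x = sum(xs) / window
        mean_y = sum(ys) / window
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        betas.append(sxy / sxx)  # Cov(X, Y) / Var(X) within this window
    return betas


# Invented returns in percent; the stock's sensitivity changes halfway through.
market_returns = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
stock_returns = [2.0, 4.0, 6.0, 2.0, 2.5, 3.0]
for i, b in enumerate(rolling_beta(stock_returns, market_returns, window=3)):
    print(f"window starting at observation {i}: beta = {b:.2f}")
```

Windows that straddle the change in regime produce erratic estimates, which is one reason short windows give noisy betas.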

Regulatory references
  • CFA Institute Curriculum — Level 1, Quantitative Methods, Reading 7
  • CAPM — foundational asset-pricing model in finance
  • SEBI factsheet disclosure standards — beta typically reported
Common mistakes & pitfalls
  • Reporting beta without its standard error or a confidence interval.
  • Assuming β is constant when in practice it varies over time.
  • Confusing R² with prediction accuracy: R² is the share of variance explained, not a measure of absolute predictive accuracy.
  • Forgetting that a high R² does not imply causation, only correlation.
  • Using small samples (n < 30), which makes beta estimates very unstable.

Frequently asked

What's the difference between alpha and excess return?
Excess return = actual return − risk-free rate. Alpha = excess return − beta × (market excess return). Alpha is the residual after accounting for market exposure. A fund can have positive excess return and zero or negative alpha if it just took more market risk.
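A small numerical sketch of the distinction, using invented figures (returns in percent; the function names are illustrative):

```python
def excess_return(fund_return, risk_free):
    """Return earned above the risk-free rate."""
    return fund_return - risk_free


def jensens_alpha(fund_return, market_return, risk_free, beta):
    """Excess return minus beta times the market's excess return."""
    return (fund_return - risk_free) - beta * (market_return - risk_free)


# Hypothetical fund: 18% return, beta 1.5; market 15%; risk-free 7%.
fund_excess = excess_return(18.0, 7.0)
fund_alpha = jensens_alpha(18.0, 15.0, 7.0, beta=1.5)
print(f"excess return = {fund_excess:.1f}%, alpha = {fund_alpha:.1f}%")
```

Here the fund beats the risk-free rate by 11 percentage points, but its beta of 1.5 should have earned 12, so alpha is negative: the extra return came from extra market risk.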
How do I interpret a high R² with low alpha?
The fund tracks the benchmark closely (high R²) but does not outperform after adjusting for risk (low alpha). This is the profile of a "closet index fund", which provides little active value beyond market exposure.
How long a track record do I need to estimate beta reliably?
At least 3-5 years of monthly returns. With 36 months, the standard error of beta is typically 0.05-0.15. At a standard error of 0.10, the 95% confidence interval spans roughly ±0.2 around the estimate, so a 0.1 deviation from β = 1.0 cannot be detected reliably. Practitioners often use windows of 5 years or longer.
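The confidence-interval check can be sketched like this. It assumes a normal approximation with a critical value of 1.96; with 36 monthly observations the exact t critical value is slightly above 2, so the intervals here are a little too narrow. The function names and the estimate of 1.10 are hypothetical.

```python
def beta_confidence_interval(beta_hat, se_beta, critical=1.96):
    """Approximate two-sided confidence interval for beta."""
    half_width = critical * se_beta
    return beta_hat - half_width, beta_hat + half_width


def rejects_beta_null(beta_hat, se_beta, beta_null=1.0, critical=1.96):
    """True if beta_null lies outside the confidence interval."""
    low, high = beta_confidence_interval(beta_hat, se_beta, critical)
    return not (low <= beta_null <= high)


# Hypothetical estimate of 1.10 under two different standard errors.
for se in (0.10, 0.05):
    low, high = beta_confidence_interval(1.10, se)
    verdict = "reject" if rejects_beta_null(1.10, se) else "cannot reject"
    print(f"SE={se:.2f}: CI=({low:.3f}, {high:.3f}) -> {verdict} beta = 1")
```

The same point estimate is inconclusive at a standard error of 0.10 and only just significant at 0.05, which is why longer samples (and therefore smaller standard errors) matter.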

Practice questions


Q 1
For an asset with σ = 25%, market σ = 15%, correlation 0.6, beta is:
  (a) 0.36
  (b) 0.6
  (c) 1.0
  (d) 1.5
Correct: (c) 1.0
β = ρ × σ_asset / σ_market = 0.6 × 25% / 15% = 1.0.
Q 2
A regression's R² is 0.81. The correlation coefficient (assuming positive slope) is:
  (a) 0.40
  (b) 0.66
  (c) 0.81
  (d) 0.90
Correct: (d) 0.90
Correlation (ρ) = √R² for simple regression with one X variable. √0.81 = 0.9.
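Why the square root works: in a single-variable regression the explained variance of Y is β² Var(X). Substituting β = Cov(X, Y) / Var(X) from the Foundation section gives the identity below (a short derivation sketch using the same symbols):

```latex
R^2 = \frac{\beta^2 \,\mathrm{Var}(X)}{\mathrm{Var}(Y)}
    = \frac{\mathrm{Cov}(X,Y)^2}{\mathrm{Var}(X)\,\mathrm{Var}(Y)}
    = \rho_{XY}^2
\quad\Longrightarrow\quad
\rho_{XY} = \operatorname{sign}(\beta)\,\sqrt{R^2}
```

With R² = 0.81 and a positive slope, ρ = +0.9; a negative slope would give ρ = −0.9, which is why the question specifies the sign.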
Q 3
A stock has β = 1.2 in a market expected to return 15%. Risk-free rate is 7%. CAPM-expected return:
  (a) 7%
  (b) 12.6%
  (c) 16.6%
  (d) 22%
Correct: (c) 16.6%
CAPM: E[R] = R_f + β(E[R_m] − R_f) = 7% + 1.2 × (15% − 7%) = 7% + 9.6% = 16.6%.
Q 4
A fund's alpha t-statistic is 0.5. We conclude:
  (a) The alpha is significantly positive
  (b) The alpha is statistically indistinguishable from zero
  (c) The fund will outperform
  (d) The fund has no market exposure
Correct: (b) The alpha is statistically indistinguishable from zero
A t-statistic of 0.5 is far below the 1.96 critical value (5% significance, two-tailed), so we cannot reject H₀ that alpha = 0. The reported alpha could be luck.
Q 5
In regression Y = α + βX + ε, if X increases by 1 unit, Y changes by:
  (a) α units
  (b) β units
  (c) ε units
  (d) R² units
Correct: (b) β units
β is the slope: the change in Y per unit change in X. α is the intercept.
Q 6
A high R² (0.95) in a fund-vs-benchmark regression typically suggests:
  (a) The fund is highly active
  (b) The fund tracks the benchmark closely (closet indexer)
  (c) The fund has high alpha
  (d) The fund is risk-free
Correct: (b) The fund tracks the benchmark closely (closet indexer)
A high R² means most of the fund's return variation is explained by the benchmark. That suggests closet indexing rather than active management.
Educational purposes only. The numbers, returns, and examples used in this lesson are illustrative. Past performance does not guarantee future results. Mutual fund and securities investments are subject to market risks. This lesson is not investment advice; for advice tailored to your circumstances, consult a SEBI-registered Investment Adviser.