Asset pricing has evolved dramatically over the past several decades, shifting from simple models based on a single market factor to an expansive zoo of multifactor frameworks. But beneath the growing complexity lies a deeper unifying idea: factor models are, at their core, a way to approximate the Stochastic Discount Factor (SDF). This post is not about which factors “work” or how to pick them, it’s about why factor models work at all. Drawing on insights from Cochrane (2005), we’ll explore how the empirical world of factor investing is fundamentally tied to the SDF.
Table of Contents
- Factor Models Dominating Role in Modern Asset Pricing
- SDF: The True Concept Behind Asset Prices
- Factor Models as SDFs in Disguise
- Conclusion
Factor Models in Finance: A Historical Perspective
In the beginning, investors created the Market. The Market was one, with a clear form and simple measure. And the investors said: Let risk be priced. And risk was priced. And they called this the Capital Asset Pricing Model, born in 1964, the first of its kind.
But investors said: Let there be more than one risk, for markets are many and varied. And so in 1992 arose the Fama–French Three-Factor Model. And they called it the Threefold Model, for it spanned size, value, and the market itself. Thus were born factors, and with them the first risk portfolios.
And again the investors spoke: Let us make factors in our own image, many and diverse, to capture the world as we see it. And so began the great multiplication. From momentum in the 1990s, to profitability and investment in the 2010s, factors spread and multiplied, until there was a vast zoo of them, each claiming to price risk and explain return.
But despite the proliferation of models that price the market and explain the cross-section of returns, one central question remains: Where does all this actually come from? After all, divine metaphors aside, no one created the market from thin air with well structured properties and linear pricing properties. In this post, I want to explore one possible line of reasoning that helps explain why factor models work, not just how. My thinking here is largely influenced by the reading of Cochrane (2005).
SDF: The True Concept Behind Asset Prices
When first encountering factors, OLS estimators, shrinkage, and $t$-tests, it can feel rather abstract—especially if the end goal is making an investment decision. Yet one principle is more intuitive and foundational: the law of one price, which underlies most modern asset pricing theory.
The Law of One Price
The Law of One Price states that in an arbitrage-free market, two assets with identical cash flows must have the same price (G. Cassel (1918)). In practice, of course, there are frictions such as limited accessibility, bid-ask spreads, or information asymmetries. Nevertheless, it forms the backbone of asset pricing: two portfolios delivering the same payoffs should have the same price, and two bonds with identical cash flows and risk characteristics should offer the same yield.
From the Law of One Price to the SDF
This idea can be formalized mathematically through the stochastic discount factor (SDF), also called the pricing kernel Cochrane (2005). The SDF is a random variable $m_{t+1}$ that links future payoffs $X_{t+1}$ to current prices $P_t$:
\[P_t = \mathbb{E}_t \left[ m_{t+1} \, X_{t+1} \right]\]Equivalently, for excess returns $r_{t+1}$ (returns over the risk-free rate), the SDF satisfies:
\[\mathbb{E}_t \left[ m_{t+1} \, r_{t+1} \right] = 0\]Intuitively, this states that all assets are “fairly priced” given the SDF. The SDF is closely related to risk-neutral probabilities and can be thought of as a weighting factor that adjusts future payoffs to their present value while accounting for risk preferences. Go check this previous post if you want info on that.
The SDF in Practice
In theory, if we knew the distribution of $m_{t+1}$, pricing would be straightforward. This leads to the “everything is factored into the price,” which you can see on Benjamin Channel. But in reality, the SDF is unobservable, so we must approximate it. One approach could be Bayesian: treat the SDF as a latent random variable and update a prior distribution with market realizations over time.
In practice, however, finance has favored simpler approximations. Factor models, estimated via OLS regressions, can be interpreted as linear projections of the SDF onto observed returns. This is why factor models—CAPM, Fama–French, and beyond—capture so much pricing information: they approximate the true, unobservable stochastic discount factor. To known more about the Stochastic Discount Factor and what it’s implies in terms of risk neutral mesures and martingales, go check this previous post
Factor Models as SDFs in Disguise
At this point, you might be thinking: “What’s the link between factor models and the SDF?” If you already know where this is going, congratulations mate.
Estimating a factor model, whether it is CAPM, the Fama–French Five-Factor Model, or a custom model with hundreds of factors selected via LASSO, is, at its core, an SDF estimation problem. This was a revelation for me while reading John Cochrane’s Asset Pricing (Chapter 6).
From Factor Models to the SDF
Suppose we have a factor model:
\[r_{i,t+1} = \sum_{k=1}^K \beta_{ik} F_{k,t+1} + \epsilon_{i,t+1}, \quad i = 1, \dots, N\]with factors $F_k$ and factor loadings $\beta_{ik}$. We want an implied SDF $m_{t+1}^{\text{impl}}$ that satisfies the SDF condition for all assets:
\[\mathbb{E}_t \left[ m_{t+1}^{\text{impl}} \, r_{i,t+1} \right] = 0 \quad \forall i.\]Cochrane shows that we can construct it as a linear combination of the factors:
\[m_{t+1}^{\text{impl}} = 1 - \sum_{k=1}^K \lambda_k F_{k,t+1},\]where the weights $\lambda_k$ solve the system:
\[\mathbb{E}_t \Big[ (1 - \sum_{k=1}^K \lambda_k F_{k,t+1}) \, r_{i,t+1} \Big] = 0, \quad i = 1, \dots, N.\]In matrix notation, let $R$ be the $N \times 1$ vector of excess returns and $F$ the $N \times K$ factor matrix:
\[\mathbb{E}_t[R] = F \lambda \quad \implies \quad \lambda = (F^\top F)^{-1} F^\top \mathbb{E}_t[R].\]This provides an explicit SDF estimate from any factor model. Every factor captures a dimension of risk, and the combination of factors approximates the true, unobservable SDF $M_t = 1-\lambda^\top F_t$. This is why factor models are powerful : they encode the pricing kernel in a tractable, estimable form.
Why Use Factor Models Instead of Directly Approximating the Stochastic Discount Factor?
If the true power of a factor model lies in its deep connection to the stochastic discount factor (SDF), why not bypass the factors entirely and approximate the SDF directly to predict future prices?
The idea might be to approximate the SDF $ M $ and then use the fundamental pricing equation $ P_t = \mathbb{E}[M_{t+1} \times P_{t+1}] $ to compute fair prices and identify whether an asset is over or undervalued. However, this approach isn’t as straightforward or as informative as it might seem.
For example, suppose $ M = 1 $ almost surely. Then you’re left with $ P_t = \mathbb{E}[P_{t+1}] $, which offers no new insight. This illustrates that simply knowing the distribution of the SDF is not enough.
What really matters when working with $\mathbb{E}[M_{t+1} R_{t+1}] = 0$ is understanding the joint distribution of $ (M_t, R_t) $ the stochastic discount factor and the asset returns together. This joint distribution is difficult to characterize directly.
Factor models offer a practical solution by capturing the sensitivity of returns to underlying risk factors. This sensitivity gives us a tractable way to understand how returns co-move with the SDF, essentially providing a handle on their joint behavior.
In other words, the key mathematical value of factor models is that they let us approximate the joint dynamics between $ M_t $ and $ R_t $ through the factors $ F $. Let’s see that with some equations :
\[\mathbb{E}_t[R_{t+1} \times M_{t+1}] = 0\]Which implies
\[\mathbb{E}_t\big[R_{t+1} \times (1 - \lambda^\top F_{t+1})\big] = 0\]And thus
\[\mathbb{E}_t[R_{t+1}] = \mathbb{E}_t[R_{t+1} F_{t+1}^\top] \lambda\]When using the OLS estimator, the betas of the stock (an $ N \times K $ matrix) are defined as
\[B_{t+1} = R_{t+1} F_{t+1}^\top (F_{t+1} F_{t+1}^\top)^{-1}\]Since
\[R_{t+1} F_{t+1}^\top = R_{t+1} F_{t+1}^\top (F_{t+1} F_{t+1}^\top)^{-1} (F_{t+1} F_{t+1}^\top),\]We get :
\[R_{t+1} F_{t+1}^\top = B_{t+1} F_{t+1} F_{t+1}^\top\]Noting :
\[\Sigma_{FF}^{t+1} = \mathbb{E}_t[F_{t+1} F_{t+1}^\top]\]We end up with :
\[\mathbb{E}_t[R_{t+1}] = B_{t+1} \Sigma_{FF}^{t+1} \lambda\]By starting with basic properties of the stochastic discount factor, we ultimately arrive at a pricing formula that depends only on our factor model. This is powerful because it allows us to shift the focus from estimating relationships across all $ N $ assets to working with a much smaller set of $ K $ factors.
Plenty of work must be then done on estimating $B_{t+1}$, $\Sigma_{FF}^{t+1}$ and $\lambda_{t+1}$. When $ K \ll N $, and assuming the factors are not highly correlated, the factor covariance matrix $\Sigma_{FF}^{t+1}$ is easier to estimate, invert, and work with in general.
This dimensionality reduction offers several practical benefits, including more stable estimates, reduced computational complexity, and better interpretability of risk-return trade-offs.
Conclusion
At first glance, factor models may look like clever statistical tools thrown at return data to extract patterns. But a deeper look reveals something more fundamental: they are not just predictive regressions, they are structured approximations of the Stochastic Discount Factor, the very object at the heart of modern asset pricing. This tells us why factor models work.
So next time you’re running a factor regression, remember: you’re not just crunching numbers, you’re approximating the invisible hand that prices all possible states of the economy.
Got thoughts, questions, or counterpoints? Drop a comment below. I’d love to hear your take.
