Estimating the Distribution of Omitted Variable Bias in Causal Inference

CausalInference
How large would an unobserved confounder need to be to overturn your causal estimate? A survey of methods for bounding omitted variable bias.
Published: April 6, 2025

I went down a rabbit hole on omitted variable bias, specifically the question of not just whether a confounder biases your estimate, but how to put a range on how badly. These are my notes on the main families of methods people use, what each one buys you, and where each one breaks.

What OVB actually is

Omitted variable bias shows up when you leave out a variable that is correlated with both your treatment of interest and your outcome. The model has nowhere to put that variable’s influence, so it smears it onto the coefficients you did include, and your estimate of the causal effect drifts up or down.

The classic example: regressing salary on years of education while leaving out ability. If ability raises both how much schooling someone gets and how much they earn, the education coefficient absorbs part of ability’s effect, and you overstate the return to education.

Two conditions have to hold for the bias to exist:

  1. the omitted variable genuinely affects the outcome, and
  2. it is correlated with an included regressor.

If either fails, omission is harmless. The direction of the bias follows the signs: if the omitted variable’s effect on the outcome and its correlation with the included regressor have the same sign, the estimate is pushed up; if the signs differ, it is pushed down. Reasoning through those signs without knowing the magnitude is what people call “signing the bias.”

The algebra

Suppose the true model is

y = b\,x + c\,z + u,

but you omit z and fit

y = a\,x + v.

Then the estimated coefficient on x has expectation

\mathbb{E}[a] = b + c\,f,

where f is the coefficient from regressing the omitted z on the included x. The bias term is c\,f: it is zero exactly when z has no effect on y (c=0) or when x and z are uncorrelated (f=0). Clean, but not directly usable, because c and f both involve the thing you never observed. That is the whole problem, and every method below is a different way around it.
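To convince myself of the formula, here is a minimal numerical check. It assumes NumPy is available, and the coefficient values, the 0.8 loading of x on z, and the helper name `ols` are made up for illustration. In sample, the short-regression coefficient equals the long-regression coefficient on x plus the estimated c times the estimated f, so the two printed numbers should agree up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
b, c = 2.0, 1.5  # true coefficients on x and z (illustrative values)

z = rng.normal(size=n)                 # the variable we will omit
x = 0.8 * z + rng.normal(size=n)       # x is correlated with z
y = b * x + c * z + rng.normal(size=n)


def ols(target, *regressors):
    """Least-squares fit with an intercept; returns the slope coefficients only."""
    X = np.column_stack([np.ones(len(target)), *regressors])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef[1:]


(a_hat,) = ols(y, x)          # short regression: y on x only
b_hat, c_hat = ols(y, x, z)   # long regression: y on x and z
(f_hat,) = ols(z, x)          # auxiliary regression: z on x

print(f"short-regression a_hat  = {a_hat:.4f}")
print(f"b_hat + c_hat * f_hat   = {b_hat + c_hat * f_hat:.4f}")
print(f"implied bias c_hat*f_hat = {c_hat * f_hat:.4f}")
```

Of course this only works because the simulation lets us see z. With real data, `c_hat` and `f_hat` are exactly the two numbers you cannot compute.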

Five ways to put a range on the bias

1. Sensitivity analysis. Instead of assuming no confounding, ask how strong a confounder would need to be to overturn your conclusion. The robustness value captures this: the minimum association (measured as partial R^2) an unobserved confounder must have with both treatment and outcome to drive the effect to zero. If a weak, plausible confounder is enough to flip the result, the finding is fragile; if only an implausibly strong one would do it, the finding is robust. Cinelli and Hazlett’s contour plots are the standard way to read this off visually: they plot the adjusted estimate (or its t-value) across combinations of the confounder’s partial R^2 with the treatment and with the outcome, so you can see where the estimate stays significant.
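The robustness value can be computed from nothing more than the treatment coefficient's t-statistic and the residual degrees of freedom. The sketch below uses the closed form from Cinelli and Hazlett as I understand it, RV_q = (sqrt(f_q^4 + 4 f_q^2) - f_q^2) / 2 with f_q = q |t| / sqrt(df). The function names and the example numbers (t = 4, df = 100) are mine, not from any package.

```python
import math


def partial_r2(t_stat: float, dof: int) -> float:
    """Partial R^2 of the treatment with the outcome, from its t-statistic."""
    return t_stat**2 / (t_stat**2 + dof)


def robustness_value(t_stat: float, dof: int, q: float = 1.0) -> float:
    """Partial R^2 a confounder needs with BOTH treatment and outcome
    to reduce the point estimate by a fraction q (q=1 means down to zero)."""
    f_q = q * abs(t_stat) / math.sqrt(dof)  # scaled partial Cohen's f
    return 0.5 * (math.sqrt(f_q**4 + 4 * f_q**2) - f_q**2)


# Hypothetical regression output: t = 4.0 on the treatment, 100 residual df.
t_stat, dof = 4.0, 100
print(f"partial R^2 of treatment with outcome: {partial_r2(t_stat, dof):.3f}")
print(f"robustness value (q=1):   {robustness_value(t_stat, dof):.3f}")
print(f"robustness value (q=0.5): {robustness_value(t_stat, dof, q=0.5):.3f}")
```

The q = 0.5 line answers the softer question of how strong a confounder must be to cut the estimate in half rather than erase it.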

2. Bounding approaches. Rather than scan scenarios, fix an assumption about the most explanatory power any omitted variable could plausibly have, and derive hard upper and lower limits on the effect. The bounds are only as credible as that ceiling assumption. Set it too low and the true effect can fall outside your interval; set it too high and the bounds are so wide they say nothing. Covariate benchmarking anchors the assumption in data: argue that no unobserved confounder is stronger than, say, your strongest observed covariate, and use that as the empirical ceiling.
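One concrete way to turn a ceiling into a range is the partial R^2 bias formula from the Cinelli and Hazlett framework: |bias| = se * sqrt(df) * sqrt(R2_yz * R2_dz / (1 - R2_dz)), where R2_yz is the confounder's partial R^2 with the outcome and R2_dz its partial R^2 with the treatment. The sketch below assumes that formula. The function names and the example numbers are hypothetical, and the interval bounds the point estimate only; it does not add sampling uncertainty.

```python
import math


def bias_bound(se: float, dof: int, r2_yz: float, r2_dz: float) -> float:
    """Largest absolute bias from a confounder with the given partial R^2 values."""
    if not (0.0 <= r2_yz <= 1.0 and 0.0 <= r2_dz < 1.0):
        raise ValueError("need 0 <= r2_yz <= 1 and 0 <= r2_dz < 1")
    return se * math.sqrt(dof) * math.sqrt(r2_yz * r2_dz / (1.0 - r2_dz))


def effect_bounds(estimate, se, dof, r2_yz_max, r2_dz_max):
    """Range for the effect if no confounder exceeds the assumed ceilings.

    The bias formula is increasing in both partial R^2 values, so the
    ceilings give the worst case in either direction.
    """
    worst = bias_bound(se, dof, r2_yz_max, r2_dz_max)
    return estimate - worst, estimate + worst


# Hypothetical estimate: 0.5 with standard error 0.125 and 100 residual df.
for ceiling in (0.05, 0.10, 0.30):
    lo, hi = effect_bounds(0.5, 0.125, 100, ceiling, ceiling)
    print(f"ceiling R^2 = {ceiling:.2f}: effect in [{lo:.3f}, {hi:.3f}]")
```

For benchmarking, the ceilings would come from the partial R^2 of your strongest observed covariate with the treatment and with the outcome, instead of the round numbers used here.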

3. Simulation. Build synthetic datasets where you control the omitted variable’s effect, then estimate the treatment effect while deliberately leaving it out, repeatedly. The spread of estimates is the empirical distribution of the bias. Useful for two things: seeing which conditions make OVB worst (stronger confounder-regressor correlation, larger effect on the outcome), and validating that a sensitivity method’s claimed bounds actually cover the truth.
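A minimal Monte Carlo version of this, assuming NumPy and a linear data-generating process that I chose for convenience (standard normal x and z with correlation `rho`, unit-variance noise). The function name and defaults are illustrative. Because x and z both have unit variance, the expected bias is simply `c * rho`, which gives something to check the simulated mean against.

```python
import numpy as np


def simulate_bias(n_reps=2000, n=500, b=1.0, c=1.0, rho=0.5, seed=0):
    """Distribution of (a_hat - b) when z is omitted from the regression of y on x."""
    rng = np.random.default_rng(seed)
    biases = np.empty(n_reps)
    for r in range(n_reps):
        z = rng.normal(size=n)
        # x has unit variance and correlation rho with z
        x = rho * z + np.sqrt(1 - rho**2) * rng.normal(size=n)
        y = b * x + c * z + rng.normal(size=n)
        xc = x - x.mean()
        a_hat = xc @ (y - y.mean()) / (xc @ xc)  # slope of y on x alone
        biases[r] = a_hat - b
    return biases


for rho in (0.2, 0.5, 0.8):
    draws = simulate_bias(rho=rho)
    lo, hi = np.percentile(draws, [2.5, 97.5])
    print(f"rho={rho}: mean bias={draws.mean():.3f} "
          f"(theory {1.0 * rho:.3f}), 95% range=[{lo:.3f}, {hi:.3f}]")
```

The spread around the mean is sampling noise at n = 500; the mean itself is the omitted variable bias, and it grows with `rho` and `c` as the text says.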

4. Bayesian methods. Put a prior on the unobserved confounder’s parameters (or directly on the bias), and the posterior on the causal effect carries that uncertainty through. You get a full distribution over plausible effects instead of a point estimate. The catch is the obvious one: if the posterior moves a lot when you change the prior, the data isn’t doing the work; your assumptions are.
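A stripped-down version of the idea, closer to a Monte Carlo bias analysis than a full Bayesian model: since c and f are not identified by the data, their priors pass through unchanged, and the adjusted effect is the naive estimate minus draws of c times f. Everything numeric here (the estimate, its standard error, the normal priors) is an assumption for illustration, as is the function name.

```python
import numpy as np


def adjusted_effect_draws(a_hat, se, c_prior, f_prior, n_draws=20_000, seed=0):
    """Draws of b = a - c*f, using the algebra E[a] = b + c*f.

    c_prior and f_prior are (mean, sd) pairs for independent normal priors.
    Sampling uncertainty in a_hat is approximated by a normal with sd `se`.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(a_hat, se, n_draws)
    c = rng.normal(*c_prior, n_draws)
    f = rng.normal(*f_prior, n_draws)
    return a - c * f


# Hypothetical naive estimate 0.50 (se 0.10), under two different priors on c.
for label, c_prior in [("moderate confounder", (0.3, 0.15)),
                       ("strong confounder", (0.8, 0.15))]:
    draws = adjusted_effect_draws(0.50, 0.10, c_prior, f_prior=(0.4, 0.2))
    lo, hi = np.percentile(draws, [2.5, 97.5])
    print(f"{label}: mean={draws.mean():.3f}, 95% interval=[{lo:.3f}, {hi:.3f}], "
          f"P(effect > 0)={np.mean(draws > 0):.2f}")
```

Comparing the two rows is the prior-sensitivity check in miniature: the data are identical, and only the assumption about c changed.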

5. Machine learning. When relationships are non-linear or high-dimensional, flexible models estimate the nuisance functions better than a hand-specified linear model. DoubleML now ships OVB sensitivity analysis inside the double-machine-learning framework, and methods like BART give uncertainty estimates directly. The open problem is the usual one with flexible models: quantifying causal uncertainty rigorously, and explaining what the model is actually conditioning on.

How they compare

| Method | Strength | Where it breaks |
| --- | --- | --- |
| Analytical | Gives the fundamental picture of how OVB arises. | Relies on quantities you can’t observe. |
| Sensitivity analysis | Quantifies how much confounding it takes to flip the result. | Gives no single “corrected” estimate. |
| Bounding | Returns an actual range for the effect. | Only as good as the plausibility ceiling you assume. |
| Simulation | Controlled, lets you validate other methods. | Conclusions are hostage to the data-generating process you chose. |
| Bayesian | Carries uncertainty through to a full posterior. | Sensitive to prior specification. |
| Machine learning | Handles non-linear, high-dimensional confounding. | Uncertainty quantification for causal effects is still maturing. |

What I took away

The honest core of all of it: you cannot measure what you did not observe, so every method here trades the impossible question (“what is the bias?”) for a tractable one (“how strong would a confounder have to be, and is that plausible here?”). Sensitivity analysis and covariate benchmarking are the two I find most useful in practice, because they hand the judgment back to domain knowledge instead of hiding it inside a prior or a simulated data-generating process.
