Calculate Compliers with an Instrumental Variable
Estimate the complier share, first stage, reduced form, and Local Average Treatment Effect using the Wald instrumental variables formula under random assignment and monotonicity.
How to calculate compliers with an instrumental variable
When researchers say they want to calculate compliers in an instrumental variables design, they are usually trying to answer two linked questions. First, how many people actually change their treatment status because of the instrument? Second, what is the causal effect of treatment for that specific group? In modern causal inference, this group is called the complier population, and the corresponding treatment effect is the Local Average Treatment Effect, often abbreviated as LATE.
The logic comes from the classic instrumental variables framework. You observe an instrument Z, a treatment D, and an outcome Y; in the simplest case, the one this page works with, both Z and D are binary. The instrument changes the probability of treatment, but is assumed to affect the outcome only through treatment. In the cleanest setup, the instrument is randomly assigned or as-good-as random after adjustment. Lottery offers, policy thresholds, and eligibility rules are common examples.
The key formulas are simple. The first stage is E[D|Z=1] – E[D|Z=0]. Under the monotonicity assumption of no defiers, that difference is the estimated share of compliers. The reduced form is E[Y|Z=1] – E[Y|Z=0]. The Wald estimator then divides the reduced form by the first stage to get the treatment effect for compliers: LATE = RF / FS.
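The three formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name `wald_estimate`, the `WaldResult` container, and the input numbers are all hypothetical, and the four inputs are assumed to be group means with both treatment rates expressed as proportions.

```python
from typing import NamedTuple


class WaldResult(NamedTuple):
    first_stage: float   # E[D|Z=1] - E[D|Z=0]
    reduced_form: float  # E[Y|Z=1] - E[Y|Z=0]
    late: float          # reduced_form / first_stage


def wald_estimate(d_z1: float, d_z0: float, y_z1: float, y_z0: float) -> WaldResult:
    """Wald IV estimate from the four group means (hypothetical helper)."""
    first_stage = d_z1 - d_z0
    if first_stage == 0:
        raise ValueError("First stage is zero; the Wald ratio is undefined.")
    reduced_form = y_z1 - y_z0
    return WaldResult(first_stage, reduced_form, reduced_form / first_stage)


# Illustrative numbers only: take-up of 50% vs 20%, mean outcomes of 10 vs 7.
result = wald_estimate(d_z1=0.50, d_z0=0.20, y_z1=10.0, y_z0=7.0)
print(f"FS={result.first_stage:.2f}  RF={result.reduced_form:.2f}  LATE={result.late:.2f}")
```

With these made-up inputs the first stage is 0.30, the reduced form is 3, and the implied effect for compliers is about 10 outcome units.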
Why compliers matter
An IV estimate is not automatically the average treatment effect for everyone. It is local to the people whose treatment status responds to the instrument. That is why calculating compliers is so useful. It tells you how large the group identified by the design is. If the first stage is 0.30, then under monotonicity roughly 30% of the study population are compliers. Those are the people who take treatment when encouraged but not when unencouraged.
The population can be partitioned into four principal strata according to how treatment responds to the instrument:
- Always-takers: receive treatment whether or not they are encouraged.
- Never-takers: do not receive treatment whether or not they are encouraged.
- Compliers: receive treatment only when encouraged.
- Defiers: do the opposite of the instrument. Standard LATE analysis assumes this group is absent.
If monotonicity holds, then the principal strata shares can be backed out from treatment rates. Specifically:
- Always-takers = E[D|Z=0]
- Compliers = E[D|Z=1] – E[D|Z=0]
- Never-takers = 1 – E[D|Z=1]
- Defiers = 0 by assumption
That is exactly what the calculator above computes. It also converts the shares into counts if you provide a sample size.
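As a rough sketch of that computation (the calculator's own code is not shown on this page, so the function name `principal_strata` and its return layout are assumptions), the shares and optional counts could be derived like this, with treatment rates given as proportions:

```python
from typing import Dict, Optional


def principal_strata(d_z1: float, d_z0: float, n: Optional[int] = None) -> Dict[str, Dict[str, float]]:
    """Shares (and optional counts) of principal strata under monotonicity.

    d_z1 and d_z0 are E[D|Z=1] and E[D|Z=0] as proportions in [0, 1].
    """
    if not (0.0 <= d_z0 <= d_z1 <= 1.0):
        raise ValueError("Expected 0 <= E[D|Z=0] <= E[D|Z=1] <= 1 under monotonicity.")
    shares = {
        "always_takers": d_z0,
        "compliers": d_z1 - d_z0,
        "never_takers": 1.0 - d_z1,
        "defiers": 0.0,  # ruled out by assumption, not estimated from data
    }
    result: Dict[str, Dict[str, float]] = {"shares": shares}
    if n is not None:
        # Rounded counts are approximate and may not sum exactly to n.
        result["counts"] = {name: round(share * n) for name, share in shares.items()}
    return result


strata = principal_strata(d_z1=0.62, d_z0=0.31, n=1000)
for name, count in strata["counts"].items():
    print(f"{name}: {count}")
```

The defier share is hard-coded to zero because it is an assumption of the framework, not something these two treatment rates can reveal.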
Step-by-step interpretation of the calculator
Suppose treatment take-up is 62% when the instrument equals 1 and 31% when the instrument equals 0. The first stage is 31 percentage points, or 0.31 in proportion terms. If the outcome is a binary rate with means of 58% and 46% in the two instrument groups, then the reduced form is 12 percentage points, or 0.12. Dividing 0.12 by 0.31 yields about 0.387. In percentage terms, treatment raises the outcome by about 38.7 percentage points for compliers.
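Written out with every quantity in proportion terms (treating the two outcome means as the rates 0.58 and 0.46), the arithmetic is:

```latex
\begin{aligned}
\text{FS} &= E[D \mid Z=1] - E[D \mid Z=0] = 0.62 - 0.31 = 0.31 \\
\text{RF} &= E[Y \mid Z=1] - E[Y \mid Z=0] = 0.58 - 0.46 = 0.12 \\
\text{LATE} &= \frac{\text{RF}}{\text{FS}} = \frac{0.12}{0.31} \approx 0.387
\end{aligned}
```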
This result is not the average effect for always-takers or never-takers. It applies to the subgroup induced into treatment by the instrument. That local interpretation is the defining feature of LATE. In policy work, this can be a strength because it tells you the effect for the margin of people who actually change behavior in response to the policy lever.
| Component | Formula | What it means | How to read it |
|---|---|---|---|
| First stage | E[D|Z=1] – E[D|Z=0] | Change in treatment caused by the instrument | Also the complier share under monotonicity |
| Reduced form | E[Y|Z=1] – E[Y|Z=0] | Total effect of instrument assignment on the outcome | Combines treatment response and instrument-induced uptake |
| Wald IV estimate | (E[Y|Z=1] – E[Y|Z=0]) / (E[D|Z=1] – E[D|Z=0]) | Causal effect of treatment for compliers | The estimated LATE |
| Always-takers | E[D|Z=0] | Treated regardless of the instrument | Baseline treatment uptake |
| Never-takers | 1 – E[D|Z=1] | Untreated regardless of the instrument | Share untouched by encouragement |
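If you start from unit-level records instead of group means, the same components can be computed directly. The sketch below uses only the Python standard library; the function name `wald_from_records` and the ten-row toy dataset are made up for illustration, and both Z and D are assumed to be coded 0/1.

```python
from statistics import mean


def wald_from_records(z, d, y):
    """Return (first_stage, reduced_form, late) from unit-level records.

    z and d are assumed to be coded 0/1; y can be any numeric outcome.
    """
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    first_stage = mean(d1) - mean(d0)
    reduced_form = mean(y1) - mean(y0)
    if first_stage == 0:
        raise ValueError("First stage is zero; the Wald ratio is undefined.")
    return first_stage, reduced_form, reduced_form / first_stage


# Toy data (hypothetical): five encouraged units, five unencouraged units.
z = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
y = [12, 10, 11, 7, 8, 10, 7, 8, 6, 9]

fs, rf, late = wald_from_records(z, d, y)
print(f"FS={fs:.2f}  RF={rf:.2f}  LATE={late:.2f}")
```

In this toy sample, take-up is 60% versus 20% and mean outcomes are 9.6 versus 8.0, so the first stage is 0.40, the reduced form is 1.6, and the Wald ratio is about 4.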
The assumptions behind calculating compliers
A calculator can do the arithmetic instantly, but the econometric validity depends on assumptions. Researchers usually focus on four core conditions:
- Relevance: the instrument must move treatment uptake. If the first stage is tiny, your complier share is tiny and the estimate may be noisy or weakly identified.
- Independence: the instrument must be independent of potential outcomes and potential treatment states, often justified by random assignment or a credible quasi-experiment.
- Exclusion restriction: the instrument affects the outcome only through treatment, not through another channel.
- Monotonicity: there are no defiers. The instrument does not make some people less likely to take treatment while making others more likely.
If these assumptions hold, then the first stage is not just a descriptive difference. It becomes a structural estimate of the population share of compliers, and the Wald ratio becomes the causal treatment effect for that group.
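One way to see this is to build a population where the strata are known and confirm that the Wald ratio returns the complier effect and not the population average. The sketch below is purely illustrative: the shares, baseline outcomes, and effects in `STRATA` are invented, and the group means are computed analytically under independence, exclusion, and monotonicity, not estimated from data.

```python
# Hypothetical population: share, mean untreated outcome, and treatment effect per stratum.
STRATA = {
    "always_takers": {"share": 0.20, "y0": 50.0, "effect": 20.0},
    "compliers": {"share": 0.30, "y0": 45.0, "effect": 10.0},
    "never_takers": {"share": 0.50, "y0": 40.0, "effect": 2.0},
}


def group_means(z: int):
    """Return (E[D|Z=z], E[Y|Z=z]) implied by the strata.

    Independence means each instrument arm has the same strata mix;
    exclusion means z changes outcomes only by switching compliers' treatment.
    """
    d_mean = 0.0
    y_mean = 0.0
    for name, s in STRATA.items():
        treated = 1.0 if name == "always_takers" or (name == "compliers" and z == 1) else 0.0
        d_mean += s["share"] * treated
        y_mean += s["share"] * (s["y0"] + s["effect"] * treated)
    return d_mean, y_mean


d1, y1 = group_means(1)
d0, y0 = group_means(0)
late = (y1 - y0) / (d1 - d0)
ate = sum(s["share"] * s["effect"] for s in STRATA.values())
print(f"complier share={d1 - d0:.2f}  Wald LATE={late:.2f}  population ATE={ate:.2f}")
```

Here the first stage equals the complier share (0.30) and the Wald ratio equals the complier effect (10), while the population average effect is 8. The two differ because always-takers and never-takers have different effects that the instrument never reveals.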
Real empirical benchmarks from well-known IV settings
Different empirical designs generate very different complier shares. In randomized encouragement designs, first stages can be relatively large. In natural experiments, they are often much smaller. The table below summarizes a few widely cited empirical benchmarks that help calibrate expectations.
| Study context | Instrument | Reported first-stage magnitude | Why it matters for compliers |
|---|---|---|---|
| Oregon Health Insurance Experiment | Medicaid lottery selection | About 0.25 increase in Medicaid coverage among those selected | Implies roughly one quarter of the study population were compliers with respect to lottery selection and coverage |
| Randomized encouragement studies in education and health | Offers, reminders, default enrollment prompts | Commonly 0.10 to 0.40 depending on design and take-up frictions | Larger first stages mean a larger complier population and more precise LATE estimates |
| Quarter-of-birth schooling IV literature | Compulsory schooling exposure tied to birth quarter | Often small shifts in schooling, on the order of tenths of a year | Small first stages still identify effects, but estimates are more sensitive to weak-instrument concerns |
| Policy eligibility threshold designs | Age, distance, income, score cutoffs | Can range from under 0.05 to over 0.50 depending on compliance with the rule | The identified compliers are the people whose behavior changes at the threshold margin |
The lesson is practical. A first stage of 0.30 is economically substantial. A first stage of 0.03 may still be valid, but you should immediately think about instrument strength, sampling variability, and whether the local population is very narrow.
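A quick way to build intuition for that contrast is to hold the true LATE fixed and see how far a small error in the reduced form moves the ratio. This is a back-of-the-envelope sketch, not a standard-error calculation: the function `late_range`, the assumed LATE of 0.40, and the reduced-form error of 0.01 are all illustrative choices, and the first stage is treated as known and positive.

```python
def late_range(reduced_form: float, first_stage: float, rf_error: float):
    """LATE implied when the reduced form is off by +/- rf_error.

    Assumes a positive first stage that is measured without error.
    """
    low = (reduced_form - rf_error) / first_stage
    high = (reduced_form + rf_error) / first_stage
    return low, high


for fs in (0.30, 0.03):
    # Keep the true LATE at 0.40 in both cases so only the first stage differs.
    rf = 0.40 * fs
    low, high = late_range(rf, fs, rf_error=0.01)
    print(f"first stage={fs:.2f}: LATE between {low:.2f} and {high:.2f}")
```

With a first stage of 0.30 the implied LATE stays within roughly 0.37 to 0.43; with a first stage of 0.03 the same reduced-form error spreads it over roughly 0.07 to 0.73.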
How to judge instrument strength in practice
When you calculate compliers, the first stage is doing double duty. It identifies both the margin of behavioral response and the denominator of the Wald ratio. If that denominator gets too close to zero, the IV estimate becomes unstable. Applied researchers often check the first-stage F statistic in a regression framework, with the classic rule of thumb that values below 10 can indicate weak instrument problems. That threshold is not a law of nature, but it is still a useful screening tool.
| Diagnostic | Rule of thumb | Interpretation | Practical implication |
|---|---|---|---|
| First-stage difference | Closer to 0 is weaker | Fewer compliers and less leverage from the instrument | Expect wider confidence intervals |
| First-stage F statistic | Below 10 often considered concerning | Potential weak-instrument bias and poor finite-sample behavior | Use robust weak-IV diagnostics and report sensitivity checks |
| Sign of first stage | Should align with the design | An unexpected negative value may indicate coding issues or an instrument that discourages treatment; it also calls the assumed direction of monotonicity into question | Audit definitions of Z and D before interpreting LATE |
| Scale of reduced form | Must be interpreted in the same outcome units | A small reduced form can still imply a large LATE if the complier share is tiny | Always inspect numerator and denominator separately |
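For a single binary instrument with no covariates, the first-stage F statistic is the squared t statistic on Z in a regression of D on Z, so it can be computed from group means and variances. The sketch below uses the classical homoskedastic formula; heteroskedasticity-robust versions are common in applied work and will give somewhat different numbers. The function name `first_stage_f` and the sample of 400 units are hypothetical.

```python
from statistics import mean


def first_stage_f(z, d):
    """Homoskedastic F statistic for the slope in an OLS regression of d on binary z."""
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]
    n1, n0 = len(d1), len(d0)
    m1, m0 = mean(d1), mean(d0)
    # Residuals are deviations from each group's mean (the OLS fitted values).
    ssr = sum((di - m1) ** 2 for di in d1) + sum((di - m0) ** 2 for di in d0)
    sigma2 = ssr / (n1 + n0 - 2)
    se = (sigma2 * (1 / n1 + 1 / n0)) ** 0.5
    if se == 0:
        raise ValueError("No residual variation in d; the F statistic is undefined.")
    # With one regressor, F equals the squared t statistic of the slope.
    return ((m1 - m0) / se) ** 2


# Hypothetical sample: 200 units per arm, take-up of 62% vs 31%.
z = [1] * 200 + [0] * 200
d = [1] * 124 + [0] * 76 + [1] * 62 + [0] * 138

f_stat = first_stage_f(z, d)
print(f"first-stage F = {f_stat:.1f}")
```

In this made-up sample the statistic comes out at about 42.5, comfortably above the rule-of-thumb threshold of 10.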
Common mistakes when calculating compliers with an instrumental variable
- Using raw percentages without converting consistently. If treatment rates are entered in percentages and outcomes are entered in proportions, the ratio will be mis-scaled.
- Ignoring the sign of the first stage. If encouragement lowers treatment in your data, you may have reversed the coding of the instrument or treatment.
- Calling LATE an ATE. The IV estimate is local unless stronger homogeneity assumptions are justified.
- Forgetting the exclusion restriction. A policy offer might directly change expectations, information, or stress, not just treatment take-up.
- Assuming monotonicity without a story. In some settings, defiers are implausible; in others, they are not.
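The first two mistakes are mechanical and can be caught before any ratio is computed. The helper below is a hypothetical pre-check, not part of any library; the name `audit_inputs` and the wording of its messages are illustrative, and it assumes treatment rates are meant to be proportions. The remaining mistakes concern assumptions and interpretation, which no code check can settle.

```python
def audit_inputs(d_z1: float, d_z0: float) -> list:
    """Return warnings for two common input problems; an empty list means none were found."""
    warnings = []
    if max(d_z1, d_z0) > 1.0:
        # Proportions cannot exceed 1, so these were probably typed as percentages.
        warnings.append("Treatment rates exceed 1: they look like percentages; divide by 100.")
    if d_z1 < d_z0:
        warnings.append("Negative first stage: check the coding of Z and D.")
    return warnings


for message in audit_inputs(d_z1=62, d_z0=31):
    print(message)
```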
When the outcome is binary versus continuous
The same Wald logic works whether the outcome is a binary rate or a continuous measure. If the outcome is binary, the reduced form is often interpreted in percentage points. Dividing by the first stage gives the treatment effect for compliers on the probability scale. For continuous outcomes like earnings, blood pressure, or test scores, the LATE is in the original units of the outcome. That is why the calculator lets you choose between percentage and continuous input modes.
For example, if a reminder increases treatment take-up by 20 percentage points and raises vaccination by 6 percentage points, the complier effect is 30 percentage points. If a subsidy raises college attendance by 15 percentage points and increases annual earnings by $1,200, then the LATE is $8,000 per induced attendee. The interpretation always follows the same structure: effect on the outcome divided by effect on treatment.
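Both examples are the same ratio applied to different outcome units, with the first stage expressed as a proportion:

```latex
\begin{aligned}
\text{LATE}_{\text{vaccination}} &= \frac{0.06}{0.20} = 0.30 \quad \text{(30 percentage points)} \\
\text{LATE}_{\text{earnings}} &= \frac{\$1{,}200}{0.15} = \$8{,}000 \quad \text{(per induced attendee)}
\end{aligned}
```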
How to explain IV compliers to non-technical audiences
A useful translation is: “The instrument does not tell us the effect for everybody. It tells us the effect for the people whose behavior actually changed because of the instrument.” In policy analysis, that is often exactly the relevant margin. If a scholarship offer affects enrollment only for students on the fence, then the IV estimate speaks to that on-the-fence group. That is the population a policymaker can realistically move using the scholarship lever.
This also explains why different instruments for the same treatment can produce different valid estimates. A distance instrument might identify people sensitive to travel costs. A pricing instrument might identify people sensitive to fees. A lottery instrument might identify people responsive to random access. Each one has its own complier population.
Final takeaway
To calculate compliers in an instrumental variables design, you start with the first-stage treatment difference between instrument groups. Under monotonicity, that difference is the complier share. Then you compute the reduced-form outcome difference and divide by the first stage to obtain the LATE. In symbols, Compliers = E[D|Z=1] – E[D|Z=0] and LATE = (E[Y|Z=1] – E[Y|Z=0]) / (E[D|Z=1] – E[D|Z=0]). The calculator on this page automates those steps, presents the implied principal strata, and visualizes the estimated shares so you can interpret your design more clearly and more quickly.
Use the arithmetic as a starting point, not the end of the analysis. The real scientific work is defending the instrument, checking the first stage, and explaining who the compliers are in your empirical context. When those pieces are solid, the IV framework becomes one of the most powerful tools in applied causal inference.