feat: Add non-parametric Weighted Average Quantile (WAQ) analysis method for when CUPED is not available #263

@luizhsuperti

Motivation

Online experimentation metrics (revenue, payments) are often thick-tailed,
which inflates the variance of the standard difference-in-means estimator and
leads to wide confidence intervals.

Athey, Bickel, Chen, Imbens & Pollmann (2023), "Semiparametric Estimation of
Treatment Effects in Randomized Experiments", propose two semiparametrically
efficient estimators for this setting. This issue proposes implementing the
simpler of the two: the Weighted Average Quantile (WAQ) estimator.

WAQ is particularly valuable when CUPED is not available, for example, in
e-commerce contexts where purchase frequency is low (think Zalando, Zara) and
pre-experiment data is scarce.

What is WAQ?

Under a constant additive treatment effect assumption, the WAQ estimator
computes a weighted average of the differences between matching quantiles
(sorted outcomes) of treatment and control. The weights are proportional to
minus the second derivative of the log density of the control outcome
distribution, estimated nonparametrically via adaptive kernel density
estimation.
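
As a rough sketch of the description above, and assuming equal group sizes n so that sorted outcomes can be paired directly, the estimator can be written as follows. The notation is introduced here for illustration only: Y^T_(i) and Y^C_(i) are the i-th smallest treatment and control outcomes, and \hat{f}_C is the estimated control density. The exact definition in the paper may differ in detail.

```latex
\hat{\tau}_{\mathrm{WAQ}}
  = \sum_{i=1}^{n} w_i \left( Y^{T}_{(i)} - Y^{C}_{(i)} \right),
\qquad
w_i = \frac{-\,\dfrac{d^2}{dy^2} \ln \hat{f}_C(y) \Big|_{y = Y^{C}_{(i)}}}
           {\sum_{j=1}^{n} -\,\dfrac{d^2}{dy^2} \ln \hat{f}_C(y) \Big|_{y = Y^{C}_{(j)}}}
```

The weights sum to one by construction but need not all be positive, since the log density can be locally convex (for example in the far tails of a Student-t distribution).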

Properties:

  • For Normal outcomes: reduces to difference-in-means (no efficiency loss)
  • For thick-tailed outcomes: substantially lower variance than difference-in-means
  • Retains a causal interpretation even under mild misspecification (estimates a
    weighted average of quantile treatment effects)
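
The first two properties can be illustrated with a small simulation. This is a minimal sketch, not the proposed implementation: it swaps the adaptive KDE for oracle weights computed from a known density, assumes equal group sizes, and assumes the control distribution is centred at zero. The names `waq_estimate`, `normal_weights` and `student_t_weights` are hypothetical and are not part of any existing API.

```python
import numpy as np


def waq_estimate(treatment, control, neg_dd_log_density):
    """Weighted average of sorted outcome differences (equal group sizes only)."""
    q_t = np.sort(np.asarray(treatment, dtype=float))
    q_c = np.sort(np.asarray(control, dtype=float))
    if q_t.size != q_c.size:
        raise ValueError("this sketch assumes equal group sizes")
    # Weights: minus the second derivative of the control log density,
    # evaluated at the control order statistics and normalised to sum to 1.
    w = neg_dd_log_density(q_c)
    w = w / w.sum()
    return float(np.sum(w * (q_t - q_c)))


def normal_weights(y):
    # For a Normal density, -(log f)'' is the constant 1 / sigma^2,
    # so after normalisation every quantile gets the same weight.
    return np.ones_like(y)


def student_t_weights(y, df=3.0):
    # -(log f)'' for a standard Student-t density centred at zero
    # (oracle weights: an assumption standing in for the adaptive KDE).
    return (df + 1.0) * (df - y**2) / (df + y**2) ** 2


rng = np.random.default_rng(0)
n, tau, reps = 1000, 0.5, 300

# Property 1: constant weights reproduce difference-in-means.
c = rng.normal(size=n)
t = rng.normal(size=n) + tau
normal_gap = abs(waq_estimate(t, c, normal_weights) - (t.mean() - c.mean()))

# Property 2: thick tails (Student-t with 3 degrees of freedom).
dim_estimates, waq_estimates = [], []
for _ in range(reps):
    c = rng.standard_t(3, size=n)
    t = rng.standard_t(3, size=n) + tau
    dim_estimates.append(t.mean() - c.mean())
    waq_estimates.append(waq_estimate(t, c, student_t_weights))

print("normal case |WAQ - DiM|:", normal_gap)
print("std of difference-in-means:", np.std(dim_estimates))
print("std of WAQ (oracle weights):", np.std(waq_estimates))
```

With estimated rather than oracle weights the variance reduction would typically be smaller, which is part of why the density estimation step matters in practice.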

Proposed usage

plan = AnalysisPlan.from_metrics_dict({
    "metrics": [{"name": "revenue", "metric_type": "simple"}],
    "variants": [
        {"name": "control", "is_control": True},
        {"name": "treatment", "is_control": False}
    ],
    "variant_col": "variant",
    "analysis_type": "waq"
})

Caveats

  • CUPED outperforms WAQ when pre-experiment data is available; WAQ is the
    better choice when it isn't
  • Computationally heavier than OLS due to adaptive KDE; parallelization, and any
    dependencies it would add, may need to be explored for large datasets

Out of scope for this issue (available in the R package)

  • EIF (influence function) estimator
  • Sample splitting / cross-fitting variant
  • Proportional treatment effect model
