    UTMs Don’t Prove Lift. Control Groups Do.

    While interviewing candidates to backfill a lifecycle role on our LatAm team, I posed a simple scenario:

    You’ve optimized a campaign and you’re seeing a nice lift in conversions. How do you know whether those conversions came from your changes, from seasonality, or from cannibalizing another channel?

    Almost everyone gave the same answer: check the UTM codes.

    A few went a step further and suggested comparing campaign performance to overall conversion trends to rule out seasonality. But fewer than half mentioned the thing that actually answers the question of causality: a control group.

    Why UTMs (and last-click attribution) can’t prove causality

    UTMs are great at showing correlation. They help you understand where traffic or conversions were attributed—but not why they happened.

    For example:
    You send an email to 100 people. Thirty of them click and make a purchase. On paper, that looks like a win.

    But what if those same people also saw a YouTube ad earlier that day? Or a paid social ad? Or searched your brand directly and then went hunting for a promo email before buying?

    Which interaction actually caused the conversion?

    Last-click attribution (and UTMs by extension) can’t answer that. They only tell you which touchpoint happened to be last.

    Now, let’s add a control.

    Say you withheld the email from 100 otherwise eligible users. Thirty people from that group also made a purchase. The “lift” from your email doesn’t look so convincing anymore.

    If none of the control group converted, that’s a very different story. Now you’re much closer to proving impact.

    (For the sake of simplicity, we’ll ignore statistical significance here. That’s a separate post.)
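
    To make the arithmetic concrete, here's a minimal sketch in Python using the hypothetical numbers from the example above. The group sizes and conversion counts are illustrative, not real data; the point is simply that lift is the treated rate minus the control rate.

    ```python
    # Minimal lift calculation with the hypothetical numbers above.
    # Treated group: received the email. Control group: eligible, but held out.

    treated_size, treated_conversions = 100, 30
    control_size, control_conversions = 100, 30  # set to 0 to see the other scenario

    treated_rate = treated_conversions / treated_size   # what happened with the email
    control_rate = control_conversions / control_size   # what would have happened anyway

    absolute_lift = treated_rate - control_rate          # incremental conversion rate
    incremental_conversions = absolute_lift * treated_size

    print(f"Treated conversion rate: {treated_rate:.1%}")
    print(f"Control conversion rate: {control_rate:.1%}")
    print(f"Absolute lift: {absolute_lift:.1%} "
          f"(~{incremental_conversions:.0f} incremental conversions per 100 sends)")
    ```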

    The control group concept (eligible users, no message)

    I first learned about control groups in science class. I first saw them used properly while working agency-side.

    The concept is simple: you intentionally withhold messaging from a portion of the eligible audience and compare outcomes.

    Agencies, in particular, use controls aggressively. We ran:

    • Global controls that received no messaging from an entire campaign
    • Message-level controls to measure the impact of individual emails or pushes
    • Time-based controls where a group was excluded for a week or a month to account for seasonality

    Controls let you answer the question stakeholders actually care about: Did this campaign change behavior, or would it have happened anyway? That’s why they’re an agency’s secret weapon.
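
    As a rough illustration of how a global holdout might be carved out, here's a short Python sketch. The user IDs, the 10% holdout rate, and the seeded random split are assumptions for the example, not a prescription; in practice the eligible list comes from your segmentation query and the split lives in your ESP or CDP.

    ```python
    import random

    # Hypothetical list of eligible user IDs; in practice this comes from your
    # eligibility/segmentation query, not a range().
    eligible_users = [f"user_{i}" for i in range(1, 1001)]

    HOLDOUT_RATE = 0.10   # assumed 10% global control for this example
    random.seed(42)       # fixed seed so the split is reproducible for the campaign

    shuffled = eligible_users[:]
    random.shuffle(shuffled)

    cutoff = int(len(shuffled) * HOLDOUT_RATE)
    control_group = set(shuffled[:cutoff])     # receives no messaging for this campaign
    treatment_group = set(shuffled[cutoff:])   # enters the normal campaign flow

    print(f"{len(treatment_group)} users will be messaged, "
          f"{len(control_group)} are held out as the control.")
    ```

    The detail that matters is that the assignment happens before any message goes out and is persisted, so the same users stay held out for the life of the test.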

    What to watch out for

    Controls are powerful, but only if you set them up correctly.

    Sample size
    You don’t need a 50/50 split. For most small-to-mid-sized sends, a 10–20% holdout is plenty. If your audience is large (20k+ users), even 5% can be enough.
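
    If you want to encode those rules of thumb, a sketch might look like this. The thresholds simply mirror the numbers above; they're heuristics, not a statistical power calculation.

    ```python
    def suggested_holdout_rate(audience_size: int) -> float:
        """Heuristic holdout sizing based on the rules of thumb above.
        Not a power calculation; adjust for your own volumes."""
        if audience_size >= 20_000:
            return 0.05   # large audiences: ~5% is usually enough
        return 0.15       # small-to-mid sends: somewhere in the 10-20% range

    print(suggested_holdout_rate(5_000))    # 0.15
    print(suggested_holdout_rate(50_000))   # 0.05
    ```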

    Contamination
    If your control group can still see the message through another channel (like a sitewide banner, in-app message, or paid ad), you’ve contaminated the test. And once that happens, your results are no longer clean.

    Overlapping journeys
    If users can enter another lifecycle or promotional flow while they’re supposed to be held out, you’re no longer testing what you think you’re testing.

    Controls require discipline. But when done well, they turn “this performed well” into “this caused lift.” And that’s the difference between reporting and strategy.