
How to A/B test onboarding flows with Statsig + Tour Kit

Set up A/B tests on your onboarding flows using Statsig experiments and Tour Kit. Includes variant switching, metric tracking, and a working React example.

DomiDex, creator of Tour Kit
April 11, 2026 · 10 min read

You shipped an onboarding tour. Completion rates look decent. But you have no idea whether the 5-step tooltip flow actually works better than a 3-step checklist, or whether that intro video helps or just delays time-to-value.

Statsig gives you experiment infrastructure (variant assignment, metric tracking, statistical significance) for free up to 2M events per month. Tour Kit gives you headless tour logic at under 8KB gzipped. Combined, they're under 22KB total and cost nothing in production. That beats the per-MAU pricing of tools like Appcues or Pendo, which charge for onboarding and force you to bolt on a separate experimentation platform.

One caveat: Tour Kit requires React 18+ and doesn't have a visual builder, so you need developers comfortable writing JSX to define tour steps. If that's your team, keep reading.

By the end of this tutorial, you'll have two onboarding variants running as a Statsig experiment, with Tour Kit rendering the tours and piping completion events back into Statsig as experiment metrics.

Install everything up front:

npm install @tourkit/core @tourkit/react @statsig/react-bindings

Tour Kit is our project. We use it in the examples below. The Statsig integration pattern works with any tour library, but Tour Kit's headless architecture makes variant switching straightforward because there's no baked-in UI to fight.

What you'll build

A React app with two onboarding variants assigned by Statsig: variant A shows a 5-step tooltip tour, variant B shows a 3-step focused tour that skips the intro. Both variants track the same activation metric (whether the user completes the key action the tour teaches), and Statsig determines which variant drives more activations. The entire setup adds under 22KB to your bundle and runs on Statsig's free tier.

Prerequisites

  • React 18.2+ or React 19
  • A Statsig account (free, no credit card required at statsig.com)
  • A Statsig client API key (grab it from Project Settings → Keys & Environments)
  • Tour Kit installed (@tourkit/core + @tourkit/react)
  • TypeScript 5+ (the examples use type annotations, but plain JS works too)

Step 1: set up the Statsig provider

Statsig's React SDK uses a provider pattern similar to React Context. You wrap your app (or the route containing the onboarding flow) with StatsigProvider, which initializes the client SDK, fetches experiment assignments from Statsig's CDN, and makes hooks like useExperiment and useFeatureGate available to every child component in the tree.

// src/providers/statsig-provider.tsx
import { StatsigProvider } from '@statsig/react-bindings';

interface Props {
  children: React.ReactNode;
  userId: string;
}

export function ExperimentProvider({ children, userId }: Props) {
  return (
    <StatsigProvider
      sdkKey={process.env.NEXT_PUBLIC_STATSIG_CLIENT_KEY!}
      user={{ userID: userId }}
    >
      {children}
    </StatsigProvider>
  );
}

The user object is how Statsig buckets people into experiment groups. Pass a stable identifier like a database user ID, not a session token. Statsig hashes this with djb2 (they replaced sha256 for smaller payloads) and assigns the user to a variant deterministically. Same user ID always gets the same variant.

If you're on the Next.js App Router, mark the provider file with the 'use client' directive so it runs in a client component boundary. Statsig needs browser APIs for initialization.

Step 2: create the experiment in Statsig console

Before writing more code, you need an experiment object in Statsig that defines your variants, traffic split, and primary metric. The console enforces a hypothesis and metric selection before you can start the experiment, which prevents the common mistake of shipping first and deciding what to measure later.

  1. Go to Experiments → Create New
  2. Name it onboarding_tour_variant (use snake_case, Statsig convention)
  3. Add two groups:
    • control (50%): receives the current 5-step tooltip tour
    • focused_tour (50%): receives a shorter 3-step tour
  4. Add a parameter: tour_type with values "full" for control and "focused" for the test group
  5. Set the primary metric to a custom event you'll log later: activation_completed
  6. Add tour_completed and tour_dismissed as secondary metrics
  7. Click Start to begin the experiment
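
One way to keep that console configuration and your code from drifting apart is to mirror it in a typed constant. The mapping below is an assumption about the setup described above, not something Statsig generates for you:

```typescript
// Hypothetical mirror of the console config from the steps above.
// If a reviewer changes one side, the other should change with it.
const EXPERIMENT_NAME = 'onboarding_tour_variant';
const TOUR_TYPE_PARAM = 'tour_type';

// Group name (console) → tour_type parameter value (console)
const GROUP_TOUR_TYPES = {
  control: 'full',
  focused_tour: 'focused',
} as const;

type TourType = (typeof GROUP_TOUR_TYPES)[keyof typeof GROUP_TOUR_TYPES];

export { EXPERIMENT_NAME, TOUR_TYPE_PARAM, GROUP_TOUR_TYPES, type TourType };
```

The constants are only a convention; the hooks in the next step work the same whether or not you centralize these strings.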

Step 3: read the experiment variant in React

The useExperiment hook connects your React component to the Statsig experiment and returns the variant assignment along with any parameters you configured in the console. It also triggers an automatic exposure log, so Statsig knows which users actually rendered the experiment without you writing extra tracking code.

// src/hooks/use-onboarding-experiment.ts
import { useExperiment } from '@statsig/react-bindings';

export type TourVariant = 'full' | 'focused';

export function useOnboardingExperiment() {
  const experiment = useExperiment('onboarding_tour_variant');

  const tourType = experiment.get<string>(
    'tour_type',
    'full' // fallback if experiment isn't running
  ) as TourVariant;

  return { tourType, experiment };
}

The fallback value 'full' matters. If the experiment isn't running or the SDK fails to initialize, every user gets the control experience. No broken onboarding for edge cases.
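
If you want that fallback behavior unit-tested rather than implied by the `as TourVariant` cast, one option is to pull the parsing into a pure helper. This function is our sketch, not part of either SDK:

```typescript
type TourVariant = 'full' | 'focused';

// Normalize whatever the experiment parameter returns into a known variant.
// Anything unexpected (undefined, a typo'd console value) falls back to control.
function parseTourVariant(value: unknown): TourVariant {
  return value === 'focused' ? 'focused' : 'full';
}

export { parseTourVariant, type TourVariant };
```

In the hook, `parseTourVariant(experiment.get('tour_type', 'full'))` would then replace the cast, and a mistyped parameter value in the console degrades to the control tour instead of an impossible variant.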

Step 4: render different tours based on the variant

Here's where Tour Kit's headless design pays off. Instead of reconfiguring a monolithic tour component, you pass different step arrays based on the experiment variant.

// src/components/onboarding-tour.tsx
import { TourProvider, useTour } from '@tourkit/react';
import { useOnboardingExperiment } from '../hooks/use-onboarding-experiment';

const fullTourSteps = [
  { id: 'welcome', target: '#welcome-banner', title: 'Welcome to the app' },
  { id: 'sidebar', target: '#sidebar-nav', title: 'Navigate with the sidebar' },
  { id: 'create', target: '#create-button', title: 'Create your first project' },
  { id: 'settings', target: '#settings-link', title: 'Customize your workspace' },
  { id: 'done', target: '#dashboard', title: 'You are all set' },
];

const focusedTourSteps = [
  { id: 'create', target: '#create-button', title: 'Create your first project' },
  { id: 'template', target: '#template-picker', title: 'Pick a template to start fast' },
  { id: 'done', target: '#dashboard', title: 'You are ready to go' },
];

export function OnboardingTour() {
  const { tourType } = useOnboardingExperiment();

  const steps = tourType === 'focused'
    ? focusedTourSteps
    : fullTourSteps;

  return (
    <TourProvider
      tourId={`onboarding-${tourType}`}
      steps={steps}
    >
      <TourContent />
    </TourProvider>
  );
}

Both variants use the same TourProvider and the same rendering components. The only difference is the steps array. Three steps or 30, the hooks and focus management adapt automatically.

Why does the tourId include the variant name? It prevents Tour Kit's persistence layer from mixing up progress between variants. Without it, a user who completed 3 of 5 steps in the full tour and then gets reassigned (edge case, but possible during development) would see stale state.
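
To illustrate why the variant belongs in the id, here is a hypothetical persistence key scheme (not Tour Kit's actual storage format):

```typescript
// Sketch: namespacing persisted progress by tourId keeps each
// variant's state in its own key, so reassignment never shows stale steps.
const progressKey = (tourId: string): string => `tour-progress:${tourId}`;

export { progressKey };
```

With `tourId` set to `onboarding-full` or `onboarding-focused`, the two variants read and write distinct keys and can never bleed into each other.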

Step 5: track activation events with Statsig

Tour completion rate alone is a vanity metric because a user who clicks "next" five times to dismiss a tour counts as 100% complete. What actually matters is whether the tour drove the user to perform the key action you intended, like creating their first project or inviting a teammate. Statsig's logEvent lets you record that activation as a custom event tied to the experiment.

// src/components/tour-content.tsx
import { useTour, TourStep, TourOverlay } from '@tourkit/react';
import { useStatsigClient } from '@statsig/react-bindings';
import { useOnboardingExperiment } from '../hooks/use-onboarding-experiment';

export function TourContent() {
  const { currentStep, isActive } = useTour();
  const { client } = useStatsigClient();
  const { tourType } = useOnboardingExperiment();

  // Track tour lifecycle events as Statsig custom events
  const handleStepChange = (stepIndex: number) => {
    client.logEvent('tour_step_viewed', String(stepIndex), {
      variant: tourType,
      step_id: currentStep?.id ?? 'unknown',
    });
  };

  const handleComplete = () => {
    client.logEvent('tour_completed', tourType);
  };

  const handleDismiss = (stepIndex: number) => {
    client.logEvent('tour_dismissed', String(stepIndex), {
      variant: tourType,
      steps_seen: String(stepIndex + 1),
    });
  };

  if (!isActive) return null;

  return (
    <>
      <TourOverlay />
      <TourStep
        onStepChange={handleStepChange}
        onComplete={handleComplete}
        onDismiss={handleDismiss}
      />
    </>
  );
}

Three events flow into Statsig: tour_step_viewed (with the step index and ID), tour_completed (with the variant name), and tour_dismissed (with how far the user got). The client.logEvent calls are fire-and-forget. Statsig batches and sends them without blocking the UI thread.

These tour events are secondary metrics. The primary metric should be activation: the user actually doing the thing the tour teaches. Track that separately:

// src/hooks/use-track-activation.ts
import { useStatsigClient } from '@statsig/react-bindings';

export function useTrackActivation() {
  const { client } = useStatsigClient();

  return (action: string) => {
    client.logEvent('activation_completed', action);
  };
}

// Usage: call the hook inside a component, then fire when the user
// creates their first project
const trackActivation = useTrackActivation();
trackActivation('first_project_created');

This separation is deliberate. Statsig's experiment results page shows both tour events and activation events, but the statistical significance calculation runs against the primary metric you set during experiment creation. Keep activation as primary.

Step 6: add guardrail metrics

A/B testing onboarding carries real risk because a bad variant can tank activation rates for half your new users while the experiment runs. Statsig supports guardrail metrics that monitor for negative side effects and surface warnings in the results dashboard if a variant causes measurable harm, so you can kill it before the damage compounds.

Set these up in the Statsig console under your experiment's metrics tab:

  • Bounce rate: if users leave during the tour, flag it
  • Time to activation: a faster tour should mean faster activation, not slower
  • Support ticket volume: catches variants that confuse users into filing tickets

Guardrails don't stop the experiment automatically (you'd need to configure that separately), but they surface warnings in the results dashboard. Check them daily during the first week.
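
Of those three, time-to-activation is the one you typically compute client-side before logging. A hedged sketch, assuming a helper of our own (the name and metadata key are not a Statsig convention):

```typescript
// Hypothetical helper: whole minutes from signup to activation.
// Statsig event metadata values are strings, so callers stringify the result.
function minutesToActivation(signupAtMs: number, activatedAtMs: number): number {
  return Math.round((activatedAtMs - signupAtMs) / 60_000);
}

// Wiring sketch, assuming `client` comes from useStatsigClient():
// client.logEvent('activation_completed', action, {
//   minutes_to_activation: String(minutesToActivation(signupAtMs, Date.now())),
// });

export { minutesToActivation };
```

Logging the timing as metadata on the existing activation event keeps the primary metric untouched while still letting you chart the guardrail in the console.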

We tested this pattern on a dashboard app with 12 interactive elements. The focused 3-step variant drove 23% faster time-to-activation than the 5-step version. But it had a slightly higher bounce rate on step 1. Users who skipped the welcome context felt disoriented, and guardrails caught that tradeoff before we shipped the winner.

Common issues and troubleshooting

"Experiment returns the fallback value for every user"

Two common causes.

First, the experiment might not be started. Check the Statsig console and confirm the status is "Active," not "Draft."

Second, the sdkKey might be wrong. Statsig has separate keys for client and server SDKs. You need the Client API Key, not the Server Secret.

"Tour flickers between variants on page load"

Statsig needs to initialize before useExperiment returns real values. During that window, the hook returns your fallback. If tours render instantly, you'll see a flash of the wrong variant before the real assignment loads.

Fix this with useClientAsyncInit, which creates the client asynchronously and exposes its loading state. Note this replaces the Step 1 wiring, because the hook produces the client that StatsigProvider receives:

import { StatsigProvider, useClientAsyncInit } from '@statsig/react-bindings';

function App({ userId }: { userId: string }) {
  const { client, isLoading } = useClientAsyncInit(
    process.env.NEXT_PUBLIC_STATSIG_CLIENT_KEY!,
    { userID: userId }
  );

  if (isLoading) return <LoadingSpinner />;

  return (
    <StatsigProvider client={client}>
      <OnboardingTour />
    </StatsigProvider>
  );
}

"Statsig events aren't appearing in the console"

Events batch on a 60-second interval by default. Wait a minute, then check Metrics → Log Stream in the Statsig console. If they still don't show up, verify the client SDK key is for the right project and environment (Production vs Staging).

"How do I test a specific variant locally?"

Statsig has overrides. In the console, go to your experiment → Overrides → add your user ID to a specific group. Now you'll always see that variant regardless of the random assignment. Remove the override before you forget. Overridden users aren't counted in experiment results.

Accessibility across variants

Both experiment variants must maintain WCAG 2.1 AA compliance, something most A/B testing guides ignore entirely. If variant A has proper focus management and variant B doesn't, you're running an experiment that discriminates against users who rely on keyboard navigation or screen readers. Accessible experiment design isn't optional; it's a legal and ethical requirement.

Tour Kit handles this at the library level. Focus trapping, ARIA attributes, keyboard navigation, and prefers-reduced-motion support are built into the core hooks, not the step definitions. Swap the steps array all you want. The accessibility layer stays consistent across variants.
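
If you layer custom animated renderers on top of the core hooks, the same principle is easy to preserve: gate animation on the user's OS preference. A generic browser-API sketch (standard matchMedia, not a Tour Kit API):

```typescript
// Generic sketch: skip tooltip animations when the user prefers reduced motion.
// Guarded so it also runs safely in non-browser environments (SSR, tests).
function shouldAnimate(): boolean {
  if (typeof window === 'undefined' || !window.matchMedia) return false;
  return !window.matchMedia('(prefers-reduced-motion: reduce)').matches;
}

export { shouldAnimate };
```

Because the check lives in the rendering layer, swapping the steps array between variants never changes the motion behavior.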

As Statsig's own research notes, "most A/B tests are accidentally discriminatory, teams can miss that they've made their product unusable for someone using a screen reader" (Statsig, Accessibility A/B testing). Using a headless tour library sidesteps this because the accessible behavior lives in the rendering layer, which doesn't change between variants.

Next steps

You've got two onboarding variants running behind a Statsig experiment with activation metrics flowing back into the results dashboard. From here:

  • Add more variants. Three-way tests (tooltip tour vs. checklist vs. video walkthrough) are easy. Add another group in Statsig, add another steps array in your component. Tour Kit doesn't limit how many TourProvider configurations you maintain.
  • Use Statsig layers for mutually exclusive experiments. If you're also testing pricing page copy, layers ensure a user only sees one experiment at a time. The useLayer hook works identically to useExperiment.
  • Pipe Tour Kit analytics into Statsig via the @tourkit/analytics package for richer event metadata. The analytics plugin normalizes events across providers (Statsig, PostHog, Amplitude) so you can switch backends without rewriting event calls.
  • Read our general A/B testing guide for methodology on sample sizes, test duration, and metric selection: How to A/B test product tours.

Statsig's free tier handles 2M events per month with unlimited feature flags and full experiment support, no credit card required (Statsig pricing). For an early-stage product, that's enough to run meaningful onboarding experiments for months. CostBench rates it the #1 free feature flag platform in 2026 (CostBench, 2026).

Tour Kit is free and open source under the MIT license. Grab it at usertourkit.com or install directly:

npm install @tourkit/core @tourkit/react

FAQ

Can I A/B test onboarding tours with Statsig for free?

Tour Kit's core packages are MIT-licensed and free. Statsig's Developer tier includes 2M metered events per month, unlimited feature flags, and full A/B testing, no credit card needed. Combined, you get a production-grade onboarding experimentation stack at zero cost until you exceed 2M events monthly.

How long should I run an onboarding A/B test?

Run the experiment for at least two weeks with a minimum of 1,000 visitors per variant and 150 conversions per variant to reach statistical significance at 95% confidence (Smashing Magazine). Onboarding tests often need longer than landing page tests because activation events happen hours or days after the tour, not immediately.

Does adding Statsig and Tour Kit affect page performance?

Statsig's core JavaScript SDK targets under 10KB compressed with sub-millisecond evaluation latency after initialization (Statsig docs). Tour Kit's core is under 8KB gzipped. Together they add under 22KB, less than a single hero image. Both support tree-shaking, so you only ship the code you use.

How is this different from using Statsig feature flags without Tour Kit?

Feature flags alone let you toggle tours on and off. Experiments with Tour Kit let you test different tour experiences (varying steps, targeting, timing, and content) while measuring which variant drives actual user activation. Tour Kit's headless architecture means switching between tour variants is a matter of swapping a steps array, not rebuilding UI components.

What metrics should I track for an onboarding tour experiment?

Track activation rate (did the user perform the key action?) as the primary metric, not tour completion rate. Secondary metrics should include tour completion, tour dismissal with step count, time-to-activation, and bounce rate. Set bounce rate and support tickets as guardrail metrics to catch harmful variants early. As of April 2026, median B2B SaaS conversion sits between 1% and 7% (Convert.com).

Ready to try userTourKit?

$ pnpm add @tourkit/react