Onboarding tool evaluation checklist for engineering teams

Score product tour tools across 8 engineering criteria before committing. Covers bundle size, accessibility, TypeScript support, and vendor lock-in risk.

DomiDex, creator of Tour Kit
April 9, 2026 · 13 min read

Most onboarding tool evaluations are written for product managers comparing SaaS dashboards. They rank tools by "ease of use" and "no-code builder" and call it a day. But if your engineering team is the one integrating, maintaining, and debugging the thing at 2am, you need different criteria.

We built Tour Kit and spent months evaluating the same tools we compete against. That makes us biased, but it also means we know exactly which questions surface real differences between options. Every criterion below is something we tested firsthand. Use this checklist even if you pick a competitor.

What is an onboarding tool evaluation checklist?

An onboarding tool evaluation checklist is a structured scorecard that engineering teams use to compare product tour libraries, digital adoption platforms, and in-house solutions across technical criteria before committing to integration. Unlike vendor-marketing feature matrices, an engineering checklist scores dimensions that affect your daily workflow: bundle size impact on Core Web Vitals, TypeScript type inference quality, accessibility compliance depth, and migration difficulty if you need to switch later. As of April 2026, no widely-cited evaluation framework targets engineering teams specifically — every existing checklist we found is oriented toward product managers or enterprise procurement (ZTABS, Whatfix).

That gap is the reason this article exists.

Why it matters: generic evaluation criteria fail engineering teams

Generic SaaS evaluation frameworks score communication quality and cultural fit alongside technical capability. The ZTABS vendor scorecard allocates only 25% weight to technical capability while giving 20% to "communication and process" (ZTABS). For an engineering team choosing a product tour library that ships in your bundle and runs in your users' browsers, technical capability isn't 25% of the decision. It's 80%.

PM-oriented checklists care about time-to-first-tour with a no-code builder. Engineering teams care about time-to-first-tour in production, which includes TypeScript types compiling, tests passing, Lighthouse scores holding, and the thing not breaking when React releases a minor version.

Whatfix reports that 90% of use cases benefit from third-party SaaS over building in-house, and that most SaaS tools deploy in under one hour (Whatfix). Those numbers are real. But "deployed" and "production-ready" aren't the same thing. The hour-one deployment becomes a month-three migration when you discover the tool injects 200KB of JavaScript, doesn't support your design system, or requires !important overrides on every tooltip.

The 8-criteria engineering evaluation checklist

We scored each criterion on a 1-5 scale when evaluating tools for our own projects. You should do the same. A spreadsheet with weighted scores beats gut feeling every time. Recency bias skews decisions toward the last vendor you demoed (ZTABS).

Here's the checklist.

1. Bundle size and runtime performance

Measure the gzipped transfer size and parse time of each tool. Not the number on their marketing page, but the actual number from bundlephobia or your own webpack/Vite build analysis.

| Tool | Gzipped size | Dependencies | Tree-shakeable |
| --- | --- | --- | --- |
| Tour Kit (core + react) | <20KB | 0 runtime | Yes (10 packages) |
| React Joyride | ~37KB | 5+ (includes react-floater) | No |
| Shepherd.js | ~28KB | 2+ (Floating UI) | Partial |
| Driver.js | ~5KB | 0 | No (single bundle) |
| Appcues (SaaS) | ~180KB | External script | N/A |

Why this matters: Google's research shows pages loading 45KB+ of JavaScript see measurably higher bounce rates on mobile (web.dev). SaaS onboarding tools that inject external scripts can add 100-200KB to your initial load. Run a Lighthouse audit before and after installation — if your Performance score drops more than 5 points, that should weigh heavily in your scoring.

Score 5: Under 15KB gzipped, zero runtime dependencies, fully tree-shakeable. Score 1: Over 100KB, external script injection, no tree-shaking.
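To make the scoring mechanical rather than a judgment call, you can encode the rubric as a small function. The 5 and 1 endpoints below come from the scale above; the intermediate bands (scores 2-4) are illustrative assumptions you should tune to your own priorities.

```typescript
// Map a measured bundle footprint to a 1-5 score.
// The score-5 and score-1 endpoints follow the rubric above;
// the 2-4 bands are illustrative assumptions, not part of the original scale.
interface BundleMeasurement {
  gzippedKB: number;   // from bundlephobia or your own build analysis
  runtimeDeps: number; // count of runtime dependencies
  treeShakeable: boolean;
}

function scoreBundleSize(m: BundleMeasurement): number {
  if (m.gzippedKB < 15 && m.runtimeDeps === 0 && m.treeShakeable) return 5;
  if (m.gzippedKB >= 100) return 1;
  if (m.gzippedKB < 25 && m.treeShakeable) return 4;
  if (m.gzippedKB < 50) return 3;
  return 2;
}

// A ~37KB, non-tree-shakeable bundle lands mid-scale:
scoreBundleSize({ gzippedKB: 37, runtimeDeps: 5, treeShakeable: false }); // → 3
```

Feeding each candidate tool's measured numbers through the same function keeps the comparison honest across evaluators.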

2. TypeScript support quality

"Has TypeScript types" is not the same as "good TypeScript support." Check these specifics:

  • Are types shipped with the package or from @types/*? (Shipped types stay in sync. DefinitelyTyped types lag.)
  • Does the API use generics for step data, so your custom metadata is type-safe?
  • Does strict: true compile without errors? Run it. Don't trust the docs.
  • Are hook return types inferred correctly, or do you need manual type annotations everywhere?
// Tour Kit: step metadata is generic and type-safe
import { useTour } from '@tourkit/react';

interface StepMeta {
  analyticsId: string;
  requiredRole: 'admin' | 'user';
}

const { currentStep } = useTour<StepMeta>();
// currentStep.meta.analyticsId is typed as string
// currentStep.meta.requiredRole is 'admin' | 'user'

Sandro Roth's evaluation of React tour libraries found that several major libraries still ship incomplete types or fail under strict: true (Sandro Roth). We run Tour Kit with strict: true in CI. Every commit is type-checked before merge.

Score 5: First-party types, generics for custom data, strict mode clean. Score 1: No types, or @types/* package maintained by a third party and months behind.

3. Accessibility compliance

Most evaluation checklists stop at "supports keyboard navigation." That isn't enough. A new academic framework called POUR+ (Perceivability, Operability, Understandability, Personalisation) was designed specifically for sequential onboarding flows. It's the first formal accessibility evaluation framework for this exact use case (Research Square).

Test each tool against these POUR+ criteria:

  • Perceivability: Screen reader announcement sequence. Does each step announce its content, position ("step 2 of 5"), and available actions?
  • Operability: Focus trap within the tooltip. Keyboard navigation between steps (Tab, Escape, arrow keys). Touch target sizes on mobile (44x44px minimum per WCAG 2.2).
  • Understandability: Logical focus order that follows the visual sequence. Clear labeling of controls.
  • Personalisation: Can users pause, skip, or restart? Does the tool persist their preference? Can step pacing be adjusted?

A real-world accessibility evaluation of an onboarding prototype scored only 2.9/5 overall on the POUR+ framework, with Personalisation scoring the lowest at 2.0/5 (Research Square). Most tools fail on personalisation entirely.
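If you record per-dimension POUR+ scores in your evaluation spreadsheet, the overall figure is just the mean across the four dimensions. The sketch below uses made-up numbers to show the shape of the calculation; the dimension names come from the framework, but the helper and the example scores are our own illustration, not the study's raw data.

```typescript
// Hypothetical POUR+ scorecard for one tool. Dimension names are from the
// framework; the scoring helper and example values are illustrative only.
type PourPlusDimension =
  | 'perceivability'
  | 'operability'
  | 'understandability'
  | 'personalisation';

type PourPlusScores = Record<PourPlusDimension, number>; // each dimension 1-5

function overallPourPlus(scores: PourPlusScores): number {
  const values = Object.values(scores);
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return Math.round(mean * 10) / 10; // one decimal place, e.g. 2.9/5
}

// Example values (made up): a tool that is weakest on personalisation.
overallPourPlus({
  perceivability: 3.5,
  operability: 3.0,
  understandability: 3.0,
  personalisation: 2.0,
}); // → 2.9
```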

Run axe-core on a page with each tool's tour active. Count the violations. We tested this with Tour Kit and hit zero critical violations, but no automated tool catches everything. Manual testing with VoiceOver and NVDA is still required.

Tour Kit limitation: we don't ship a visual builder, so non-technical team members can't create accessible tours without developer involvement.

Score 5: Zero axe-core violations, full POUR+ compliance, ARIA live regions for step announcements. Score 1: No ARIA attributes, no keyboard navigation, no focus management.

4. Architecture and design system compatibility

This is the criterion that no existing evaluation checklist includes, and it's the one that determines whether you spend 2 hours or 2 months integrating.

Ask three questions:

  1. Headless or opinionated? Headless tools give you logic (step sequencing, scroll handling, highlight positioning) without prescribing UI. You render steps with your own components. Opinionated tools ship pre-built tooltips that may clash with your design system.

  2. Styling approach? Inline styles are impossible to override with Tailwind or CSS modules without !important hacks. CSS class-based approaches compose with your existing setup. CSS-in-JS adds another runtime dependency.

  3. Composition model? Can you use individual pieces (just the highlight, just the step logic, just the scroll handler) or is it all-or-nothing?

// Headless approach: Tour Kit with your shadcn/ui components
import { useTourStep } from '@tourkit/react';
import { Card, CardContent, CardFooter } from '@/components/ui/card';
import { Button } from '@/components/ui/button';

function TourTooltip() {
  const { step, next, prev, isFirst, isLast } = useTourStep();
  return (
    <Card>
      <CardContent>{step.content}</CardContent>
      <CardFooter>
        {!isFirst && <Button variant="ghost" onClick={prev}>Back</Button>}
        <Button onClick={next}>{isLast ? 'Done' : 'Next'}</Button>
      </CardFooter>
    </Card>
  );
}

Sandro Roth found that React Joyride's inline-styles-only approach creates friction for teams using Tailwind or design tokens, and that Shepherd's spotlight rendering breaks in dark mode (Sandro Roth). These are the exact integration problems that surface at month two, not hour one.

Score 5: Headless architecture, CSS class-based styling, composable packages. Score 1: Opinionated UI, inline styles only, monolithic bundle.

5. Framework version compatibility

Check whether the tool works with your current framework versions and has a track record of updating when frameworks ship breaking changes.

The specifics to verify:

  • React 19 support (concurrent mode, use() hook, compiler compatibility)
  • Next.js App Router support (Server Components boundary handling, 'use client' directives)
  • Vite/Webpack compatibility
  • React strict mode double-render behavior

Sandro Roth's evaluation found that React Joyride's class-based architecture and Shepherd's React wrapper both lack React 19 compatibility as of early 2026 (Sandro Roth). If you're already on React 19 (or planning to upgrade in the next 6 months), this is a blocking criterion.

Check the library's GitHub issues for "React 19" and "strict mode." The issue count and response time tell you more than the README.

Score 5: React 18 + 19 native support, strict mode tested in CI, App Router documentation. Score 1: Last framework compatibility update was 12+ months ago, open issues for current framework version.

6. Testability and CI/CD integration

Can you write automated tests for your tours? This criterion is absent from every evaluation framework we found, which tells you those frameworks weren't written by engineers who maintain production code.

Evaluate:

  • Can you programmatically start, advance, and complete tours in test environments?
  • Does the tool work with Playwright, Cypress, or Testing Library?
  • Can you mock or stub the tour in unit tests without importing the entire library?
  • Does the tour system degrade gracefully when target elements don't exist?
// Tour Kit: testable by design
import { renderHook, act } from '@testing-library/react';
import { useTour } from '@tourkit/react';

test('tour advances to next step', () => {
  const { result } = renderHook(() => useTour(), {
    wrapper: TourProvider,
  });

  act(() => result.current.start());
  expect(result.current.currentStep?.id).toBe('step-1');

  act(() => result.current.next());
  expect(result.current.currentStep?.id).toBe('step-2');
});

If you can't test it, you can't refactor it safely. And if you can't refactor it, your tour code becomes the part of the codebase nobody touches.

Score 5: Hooks-based API testable with Testing Library, graceful degradation, no DOM side effects in tests. Score 1: No programmatic API, requires real browser to test, crashes when elements are missing.

7. Vendor lock-in and data portability

No evaluation checklist we found includes vendor lock-in risk, which is baffling given that it's the number one regret teams report after committing to a SaaS onboarding tool.

Evaluate the exit cost:

  • Can you export your tour definitions as JSON or code?
  • How is tour content stored? In your repository or on the vendor's servers?
  • What happens to your data if you cancel your subscription?
  • How many engineering hours would migration take? (We've written migration guides for React Joyride, Shepherd.js, and Appcues. The answer is usually 2-5 days.)

Open-source libraries score highest here by default. Your tour definitions live in your repository. There's no vendor server to depend on, no subscription to cancel, and no data export to negotiate.
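Concretely, "code-owned" means tour definitions are plain typed objects checked into version control. The shape below is a generic, library-agnostic illustration (the field names are not any particular tool's schema), but it shows why the exit cost is low: export is just serialization.

```typescript
// A generic, library-agnostic tour definition that lives in your repo.
// Field names here are illustrative, not a specific tool's schema.
interface TourStep {
  id: string;
  target: string;          // CSS selector for the highlighted element
  content: string;
  showIf?: () => boolean;  // optional conditional display
}

interface TourDefinition {
  id: string;
  steps: TourStep[];
}

const billingTour: TourDefinition = {
  id: 'billing-onboarding',
  steps: [
    { id: 'step-1', target: '#invoice-list', content: 'Your invoices live here.' },
    { id: 'step-2', target: '#export-btn', content: 'Export to CSV any time.' },
  ],
};

// Because it is plain data, "export" is just serialization:
const portable = JSON.stringify(billingTour, null, 2);
```

Definitions like this survive a library switch: a migration rewrites the rendering layer, not the content.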

The hidden cost: SaaS tools often store tour targeting rules, analytics, and A/B test configurations alongside tour content. Even if you can export the tours, you lose the operational data.

Score 5: Code-owned definitions, no external dependencies, documented migration path. Score 1: Vendor-hosted content, no export, no migration documentation.

8. Licensing and total cost of ownership

"Open source" isn't one bucket. MIT, Apache 2.0, and AGPL have very different implications when your legal team reviews them. Shepherd.js requires AGPL licensing as of April 2026, which means any modifications must be open-sourced. That's a non-starter for many commercial products.

Calculate 3-year TCO, not first-year cost:

  • Open-source (MIT/Apache): $0 license + developer integration time ($150/hr x estimated hours)
  • Open-source (AGPL): $0 license + legal review cost + compliance overhead
  • One-time license: Upfront cost + developer time (Tour Kit Pro: $99 one-time)
  • SaaS subscription: Monthly cost x 36 months + MAU scaling + developer integration time
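The arithmetic behind those bullets is simple enough to sketch. The $150/hr rate comes from the list above; the hour counts and subscription price in the example call are placeholder assumptions, and the SaaS line deliberately omits MAU scaling and legal-review costs, which only make the gap wider.

```typescript
// Sketch of the 3-year TCO arithmetic from the list above.
// DEV_RATE is from the example above; hours and prices below are assumptions.
const DEV_RATE = 150; // $/hr

function tcoOpenSource(integrationHours: number): number {
  return integrationHours * DEV_RATE;
}

function tcoOneTime(licenseCost: number, integrationHours: number): number {
  return licenseCost + integrationHours * DEV_RATE;
}

function tcoSaaS(monthlyCost: number, integrationHours: number): number {
  // 36 months; MAU scaling and overage tiers not modeled here.
  return monthlyCost * 36 + integrationHours * DEV_RATE;
}

// E.g. a $99 one-time license + 16 hours vs a $249/mo subscription + 8 hours:
tcoOneTime(99, 16); // → 2499
tcoSaaS(249, 8);    // → 10164
```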

Whatfix estimates that teams with fewer than 12 full-time engineers should not build in-house (Whatfix). That's true for a from-scratch build. Using a headless library changes the math: you get pre-built logic but keep control over the UI and deployment.

For a deeper cost breakdown with specific numbers, see our build vs buy calculator.

Score 5: MIT license, predictable pricing, no per-MAU scaling. Score 1: AGPL or proprietary with MAU-based pricing that triples at 50K users.

How to run the evaluation

Don't evaluate from documentation alone. Build the same 5-step tour in each tool you're considering, then score against the 8 criteria above.

Here's the process that worked for us:

  1. Define a reference tour. Pick a real feature in your product. 5 steps, at least one scroll-to target, one conditionally-shown step.

  2. Time the integration. From npm install to working tour, including TypeScript setup, test writing, and design system styling.

  3. Run the benchmarks. Lighthouse before and after. Bundle size delta. axe-core violations.

  4. Score each criterion 1-5. Use a spreadsheet. Weight criteria by what matters to your team.

  5. Calculate weighted totals. The tool with the highest score wins. If two finish within 5 points of each other, pick the one with better TypeScript support: you'll interact with the types every day.

Budget about 4 hours per tool. The cost of choosing wrong and migrating later is measured in weeks.
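Step 5 above is a one-liner once the scores are in. Each criterion contributes weight (1-3x) times score (1-5), so eight criteria all weighted 3x and scored 5 give the 120-point maximum used in the scorecard template below. This helper is our own sketch of that calculation:

```typescript
// Weighted total for the 8-criteria scorecard: weight (1-3x) times
// score (1-5), summed. Max with all 3x weights: 8 * 3 * 5 = 120.
interface CriterionScore {
  weight: 1 | 2 | 3;
  score: 1 | 2 | 3 | 4 | 5;
}

function weightedTotal(scores: CriterionScore[]): number {
  return scores.reduce((sum, c) => sum + c.weight * c.score, 0);
}

// All 8 criteria weighted 3x and scored 5 hit the 120-point ceiling:
const perfect = Array.from(
  { length: 8 },
  (): CriterionScore => ({ weight: 3, score: 5 }),
);
weightedTotal(perfect); // → 120
```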

# Quick start for evaluating Tour Kit
npm install @tourkit/core @tourkit/react

Explore the full Tour Kit documentation to run your own evaluation against these criteria.

Common evaluation mistakes

Demoing instead of integrating. A 15-minute vendor demo shows you the happy path. It doesn't show you the TypeScript errors, the styling conflicts, or the Lighthouse score impact. Always build a proof-of-concept in your actual codebase.

Ignoring the maintenance tail. Integration is a one-time cost. Maintenance is ongoing. Ask: when the vendor ships a breaking change, how long does the upgrade take? When React ships a major version, how quickly does the tool support it?

Evaluating features you won't use. Chameleon's benchmark data from 15 million interactions shows that simple tours with progress indicators outperform complex multi-modal flows: completion improves 12% with progress bars and dismissals drop 20% (Chameleon). You probably don't need the enterprise feature matrix. You need a tool that does the basics well.

Skipping the legal review. AGPL, GPL, and BSL licenses have specific obligations. Your legal team needs to review the license before engineering commits time to integration. We've seen teams build on AGPL libraries and discover the compliance requirements months later.

The scorecard template

Here's a blank scorecard. Copy it, fill in your scores, and share it with your team.

| Criterion | Weight | Tool A | Tool B | Tool C |
| --- | --- | --- | --- | --- |
| 1. Bundle size / performance | 1-3x | /5 | /5 | /5 |
| 2. TypeScript support | 1-3x | /5 | /5 | /5 |
| 3. Accessibility | 1-3x | /5 | /5 | /5 |
| 4. Architecture / design system | 1-3x | /5 | /5 | /5 |
| 5. Framework compatibility | 1-3x | /5 | /5 | /5 |
| 6. Testability / CI/CD | 1-3x | /5 | /5 | /5 |
| 7. Vendor lock-in risk | 1-3x | /5 | /5 | /5 |
| 8. Licensing / TCO | 1-3x | /5 | /5 | /5 |
| Weighted total | | /120 | /120 | /120 |

Set each weight to 1x, 2x, or 3x based on your priorities. Max possible score with all 3x weights is 120.

FAQ

What criteria should engineering teams prioritize when evaluating onboarding tools?

Engineering teams should weight bundle size impact, TypeScript type safety, and framework version compatibility highest in their onboarding tool evaluation checklist. These three criteria affect daily developer experience and long-term maintainability. Accessibility compliance and vendor lock-in risk follow closely, since both create expensive rework if ignored. Use a weighted scorecard with 1-3x multipliers per criterion.

How long should a proper onboarding tool evaluation take?

A thorough onboarding tool evaluation checklist process takes roughly 4 hours per tool for a senior developer. That includes building a reference 5-step tour, running Lighthouse and axe-core audits, and scoring against all 8 criteria. Budget one full day to compare three tools. Teams that skip structured evaluation report 2-5 days of migration effort when they discover a tool doesn't meet requirements.

Should engineering teams build onboarding in-house or use a library?

Whatfix research shows that building a usable in-house onboarding solution requires at least 12 full-time engineers, and 90% of use cases benefit from third-party tools (Whatfix). But the third-party option isn't binary either. Headless libraries like Tour Kit sit between building from scratch and buying SaaS. You get pre-built logic for step sequencing and scroll handling, but render everything with your own components. Use the onboarding tool evaluation checklist above to score all three paths.

How do SaaS onboarding tools affect Core Web Vitals?

SaaS onboarding tools that inject external scripts typically add 100-200KB to your initial JavaScript payload, directly impacting Largest Contentful Paint and Total Blocking Time. Run Lighthouse before and after installing any tool. Open-source libraries installed via npm are bundled with your code and tree-shaken, keeping the impact small. Tour Kit's core ships under 8KB gzipped with zero runtime dependencies.

What accessibility standards apply to product tour tools?

Beyond WCAG 2.1 AA (the legal baseline in most jurisdictions), the POUR+ framework is the first evaluation model designed specifically for sequential onboarding flows. It adds a Personalisation dimension covering pause, skip, restart, and pacing controls. A real-world POUR+ evaluation scored only 2.9/5 overall (Research Square). Run axe-core with each tool's tour active and manually test with VoiceOver and NVDA before committing.


Get started with Tour Kit. Run npm install @tourkit/core @tourkit/react, build your reference tour, and score it against this checklist. The full documentation includes TypeScript examples, accessibility guides, and migration paths from every major competitor.

Ready to try Tour Kit?

$ pnpm add @tourkit/react