Skip to main content

How We Test

The methodology behind userTourKit benchmarks, bundle-size claims, accessibility scores, and comparison data — how measurements are produced, which versions are tested, and how to reproduce them.

Last updated

Testing philosophy

Every technical claim on this site is either (a) generated by running the current release in CI or (b) cited from a primary source that the reader can verify. When a measurement would require us to re-run a benchmark, we publish the methodology — not just the number — so it can be reproduced. If a claim cannot be verified, it is not published.

Automated quality gates (CI)

Every pull request to github.com/domidex01/tour-kit passes the following before merge:

  • TypeScript strict mode across all packages via pnpm typecheck. No implicit any, no unchecked indexed access.
  • Unit and integration tests via Vitest and @testing-library/react. Target coverage is >80% meaningful coverage on shipped packages.
  • Production build via tsup for ESM + CJS bundles with TypeScript declarations. If the build fails, the PR cannot merge.
  • Bundle-size budgets are defined per package: @tour-kit/core < 8 KB gzipped, @tour-kit/react < 12 KB, @tour-kit/hints < 5 KB. When a PR pushes a bundle over budget, we either reduce scope or justify the increase in the changeset.
  • Linting via Biome for consistency and basic correctness (unused vars, obvious anti-patterns).

Manual verification

  • Browser smoke-test. New components are exercised in the docs app against the current latest Chromium, Firefox, and Safari before release. Keyboard navigation and screen-reader labels are verified for every interactive primitive.
  • Accessibility. We target WCAG 2.1 AA. Components are spot-checked with axe-core and a screen reader (VoiceOver / NVDA). Lighthouse Accessibility score of 100 on the docs site is a release gate.
  • Documentation parity. Code samples in MDX are pasted into a blank Next.js 15 project and run at least once per quarter to catch drift.

Bundle-size methodology

Reported bundle sizes are gzipped measurements of the production tsup build, inspected from bundlephobia.com or measured locally with size-limit against the package's exported entry points. Tree-shaken sizes (subset imports) are labeled separately. When we compare competitor bundles we use the same tooling on the same day, against each package's published npm version — never a stale figure from a blog post. The live bundle-size benchmark publishes the raw numbers with the measurement date and source links.

Benchmark methodology

Performance comparisons (render time, time-to-first-step, memory usage) use a deterministic harness run on a reference machine with fixed CPU throttling in a headless Chromium session. Each scenario runs ≥10 iterations; we report the median and p95, not the single fastest run. The harness source and raw results are published alongside every benchmark so they can be rerun.

Until published harness URLs land, benchmark figures are labeled preliminary and the underlying data file is linked directly from the article.

Comparison article methodology

For articles that compare userTourKit against other libraries or SaaS tools:

  • We read the current official docs of the compared tool — version and access date are footnoted in the article.
  • Feature matrices map each row to a linkable evidence URL on the competitor's site, repo, or pricing page.
  • Subjective claims ("easier to style", "better DX") are marked as opinion and justified with a concrete example.
  • Our own product is bracketed inside a labeled "From the authors" note so readers can weigh the commercial bias.

Reproducibility

Everything we publish should be reproducible by a reader:

  • The full source of the library and the docs site lives in the public repository.
  • Build commands (pnpm build, pnpm typecheck, pnpm test) are documented in the root CLAUDE.md and package READMEs.
  • Benchmark data files (when published) are committed as JSON alongside the article MDX.

Related

See the editorial policy for how content is sourced, reviewed, corrected, and disclosed.

Frequently asked questions

Bundle-size methodology, benchmark reproducibility, accessibility testing, comparison sourcing, and the CI gates that run on every PR.

How does userTourKit measure React product tour library bundle sizes?
Reported sizes are gzipped measurements of the production tsup build, inspected from bundlephobia.com or measured locally with size-limit against the package public entry points. Tree-shaken sizes (subset imports) are labeled separately. Competitor bundles are compared with the same tooling on the same day against each library published npm version — never a stale figure from a blog post.
How are React tour library benchmarks made reproducible?
Performance comparisons (render time, time-to-first-step, memory usage) run on a deterministic harness with fixed CPU throttling in a headless Chromium session. Each scenario runs at least 10 iterations and reports the median and p95, not the single fastest run. The harness source and raw results are committed alongside every benchmark so a reader can rerun them.
What does userTourKit do to test accessibility?
Every release targets WCAG 2.1 AA. Components are spot-checked with axe-core and a screen reader (VoiceOver or NVDA). Keyboard navigation, focus traps, ARIA live regions, and screen-reader labels are verified on every interactive primitive. Lighthouse Accessibility 100 on the docs site is a release gate.
How do you compare userTourKit to React Joyride, Shepherd.js, and SaaS platforms?
For each comparison page we read the competitor current official docs (version and access date footnoted), map every feature-matrix row to a linkable evidence URL, mark subjective claims as opinion with a concrete example, and bracket commercial bias in a labeled "From the authors" note so readers can weigh it.
What automated quality gates run on every userTourKit pull request?
TypeScript strict typecheck across all packages, Vitest unit + integration tests targeting >80% meaningful coverage, full tsup ESM+CJS production build, per-package bundle-size budgets (core <8 KB, react <12 KB, hints <5 KB gzipped), and Biome linting. If any gate fails, the PR cannot merge.