Admin: skills smoke test

/admin/skills-smoke is an internal page that runs every built-in skill against a stock prompt and reports pass/fail for each one. It is useful when verifying a deployment, debugging a regression, or sanity-checking a model swap.

#Who can run it

The page is gated behind admin access. Workspace owners on Max and Enterprise plans can see it. If you don't see it in the sidebar, you don't have access.
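The access rule above can be written as a small predicate. This is only a sketch: the `Member` shape, the `"owner"` role string, and the function name are hypothetical, and the real check may live elsewhere or use different identifiers.

```typescript
// Hypothetical member shape; the field names and role string are assumptions.
interface Member {
  role: string; // e.g. "owner"
  plan: string; // e.g. "Max"
}

// Plans named in the text as having access to the page.
const SMOKE_PLANS: ReadonlySet<string> = new Set(["Max", "Enterprise"]);

// True only for workspace owners on one of the listed plans.
function canSeeSmokePage(member: Member): boolean {
  return member.role === "owner" && SMOKE_PLANS.has(member.plan);
}
```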

#Running the test

1

Open the page

Navigate to /admin/skills-smoke. The page lists every built-in skill grouped by pack, with a status pill per row (idle / running / pass / fail).

2

Pick a scope

Three buttons at the top:

  • Run all — every skill in batches of 4.
  • Run failed — re-run only previously failed rows.
  • Run pack — pick a single pack (Roast, Marketing, Career, Motion, Augmenters).

3

Watch the run

Skills run in parallel batches. Each row updates live with a spinner during the call, then turns green (pass) or red (fail). Click any row to expand the full request/response payload — useful for debugging.
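The scope buttons from step 2 and the batched run described here can be sketched roughly as follows. The `Row` and `Scope` types, `selectRows`, `runInBatches`, and the `runSkill` callback are illustrative names, not the page's actual code. The batch size of 4 comes from the "Run all" description, and applying it to the other scopes is an assumption.

```typescript
// Hypothetical row shape: only the alias, pack grouping, and status pill come from the page.
type Status = "idle" | "running" | "pass" | "fail";

interface Row {
  alias: string;
  pack: string;
  status: Status;
}

type Scope =
  | { kind: "all" }
  | { kind: "failed" }
  | { kind: "pack"; pack: string };

// Pick the rows a scope button would run.
function selectRows(rows: Row[], scope: Scope): Row[] {
  switch (scope.kind) {
    case "all":
      return rows;
    case "failed":
      return rows.filter((r) => r.status === "fail");
    case "pack":
      return rows.filter((r) => r.pack === scope.pack);
  }
}

// Run rows in fixed-size batches; each batch settles before the next starts.
// `runSkill` stands in for the real skill call and resolves to true on pass.
async function runInBatches(
  rows: Row[],
  runSkill: (row: Row) => Promise<boolean>,
  batchSize = 4,
): Promise<void> {
  for (let i = 0; i < rows.length; i += batchSize) {
    const batch = rows.slice(i, i + batchSize);
    for (const row of batch) row.status = "running";
    // allSettled keeps one rejected call from aborting the rest of the batch.
    const results = await Promise.allSettled(batch.map((row) => runSkill(row)));
    results.forEach((result, j) => {
      const row = batch[j];
      if (!row) return;
      const passed = result.status === "fulfilled" && result.value;
      row.status = passed ? "pass" : "fail";
    });
  }
}
```

Under this sketch, "Run failed" would amount to `runInBatches(selectRows(rows, { kind: "failed" }), runSkill)`.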

4

Investigate failures

A failure means the skill returned an error event, returned output that could not be parsed, or timed out. Expand the row for the raw response. The most common causes:

  • LLM provider is rate-limiting (transient).
  • System prompt drifted and now produces invalid JSON (augmenters).
  • Stock prompt no longer triggers the expected skill behavior.
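To make the three failure modes concrete (error, bad parse, timeout), here is a minimal sketch of how a single call might be classified. The `Outcome` type, `classify`, and both callbacks are hypothetical; the smoke page's real timeout value and parsing rules are not documented here.

```typescript
// Hypothetical outcome type mirroring the three failure cases in the text.
type Outcome =
  | { status: "pass"; output: string }
  | { status: "fail"; reason: "error" | "parse" | "timeout"; detail: string };

// Sentinel the timer resolves with, so a timeout is distinguishable from output.
const TIMED_OUT = Symbol("timed out");

// `call` stands in for the skill request; `parses` for the per-kind output check.
async function classify(
  call: () => Promise<string>,
  parses: (output: string) => boolean,
  timeoutMs: number,
): Promise<Outcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<typeof TIMED_OUT>((resolve) => {
    timer = setTimeout(() => resolve(TIMED_OUT), timeoutMs);
  });
  try {
    // Whichever settles first wins: the skill call or the timer.
    const result = await Promise.race([call(), timeout]);
    if (result === TIMED_OUT) {
      return { status: "fail", reason: "timeout", detail: `no response after ${timeoutMs} ms` };
    }
    if (!parses(result)) {
      return { status: "fail", reason: "parse", detail: "output did not parse" };
    }
    return { status: "pass", output: result };
  } catch (err) {
    // A rejected call maps to the "error" case, e.g. provider rate limiting.
    const detail = err instanceof Error ? err.message : String(err);
    return { status: "fail", reason: "error", detail };
  } finally {
    clearTimeout(timer);
  }
}
```

A transient rate limit would surface as `reason: "error"`, so re-running the failed rows is a reasonable first response before digging into prompt drift.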

#What's in a row

Field     Type                          Description
alias     string                        The skill's slash command (e.g. /roast).
kind      critique | augmenter          What the skill is supposed to do.
status    idle | running | pass | fail  Current state of the smoke run.
duration  ms                            How long the call took.
output    expandable                    The full streamed response (markdown for critique, JSON for augmenter).
error     expandable                    Error message + stack trace if failed.
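For readers who prefer types, the row fields could be modeled along these lines. The names `SmokeRow` and `durationMs`, and which fields are optional, are assumptions made for illustration; the page's actual data model is not published here.

```typescript
type SkillKind = "critique" | "augmenter";
type SmokeStatus = "idle" | "running" | "pass" | "fail";

// Hypothetical model of one row on the smoke test page.
interface SmokeRow {
  alias: string;       // slash command, e.g. "/roast"
  kind: SkillKind;
  status: SmokeStatus;
  durationMs?: number; // assumed absent until the call finishes
  output?: string;     // markdown for critique, JSON text for augmenter
  error?: string;      // message + stack trace, assumed set only on failure
}

// One-line summary of a row, handy for logs.
function summarize(row: SmokeRow): string {
  const time = row.durationMs === undefined ? "" : ` (${row.durationMs} ms)`;
  return `${row.alias} [${row.kind}] ${row.status}${time}`;
}
```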

#When to run

  • After a deployment — verify nothing broke in the LLM pipeline.
  • After a model swap — confirm the new model handles all skill prompts correctly.
  • After editing a built-in's system prompt — sanity check before merging.
  • On a regression report — narrow the scope to the affected pack first.

#It's not a test suite

The smoke test is not a substitute for unit or integration tests. Skills are non-deterministic by design, so a "pass" means "produced output of the right shape", not "produced the right answer". For deeper guarantees, write actual tests against the skill engine.
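As an illustration of what a shape-only check looks like, consider the sketch below. It assumes a critique passes with any non-empty text and an augmenter passes when its output parses as a JSON object or array. These rules and the `hasRightShape` name are assumptions, not the page's documented criteria.

```typescript
type OutputKind = "critique" | "augmenter";

// Shape check only: says nothing about whether the content is any good.
function hasRightShape(kind: OutputKind, output: string): boolean {
  if (kind === "critique") {
    // Assumed rule: any non-empty markdown counts.
    return output.trim().length > 0;
  }
  try {
    // Assumed rule: augmenter output must parse to a JSON object or array.
    const parsed: unknown = JSON.parse(output);
    return typeof parsed === "object" && parsed !== null;
  } catch {
    return false;
  }
}
```

A critique that is confidently wrong still passes this check, which is why the smoke run cannot stand in for tests that assert on content.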

PromptFloe docs · last updated Jun 2026