Production Onboarding

A short checklist for taking an agent plugin from first publish to production telemetry and evals.

Use this checklist when a plugin is ready for a design partner, customer pilot, or internal production team. The goal is to publish the plugin product, verify metadata-only telemetry, and create the first eval loop without adding Telvine as a runtime dependency.

1. Prepare the plugin repo

The plugin should have a single installable layout, with a manifest for each harness that can install it:

my-plugin/
  .codex-plugin/plugin.json        # when Codex can install it
  .claude-plugin/plugin.json       # when Claude can install it
  skills/my-skill/SKILL.md
  evals/my-skill/cases.jsonl

Keep the plugin product harness-neutral. Codex, Claude Cowork, Claude Code, Copilot Cowork, and other hosts are install paths for the same plugin.
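
The layout above can be checked locally before involving the CLI. The following is a minimal sketch, not part of Telvine: the function name check_layout is hypothetical, and it assumes every Skill directory has a matching eval suite under evals/, as in the example tree.

```python
from pathlib import Path

# Manifest locations taken from the example layout; at least one should exist.
MANIFESTS = (".codex-plugin/plugin.json", ".claude-plugin/plugin.json")


def check_layout(root):
    """Return a list of human-readable problems found under the plugin root."""
    root = Path(root)
    problems = []

    if not any((root / manifest).is_file() for manifest in MANIFESTS):
        problems.append(f"no harness manifest found (expected one of: {', '.join(MANIFESTS)})")

    skills_dir = root / "skills"
    skill_dirs = sorted(p for p in skills_dir.iterdir() if p.is_dir()) if skills_dir.is_dir() else []
    if not skill_dirs:
        problems.append("no skill directories under skills/")

    for skill in skill_dirs:
        if not (skill / "SKILL.md").is_file():
            problems.append(f"skills/{skill.name}/SKILL.md is missing")
        # Assumption: each Skill has an eval suite with the same directory name.
        if not (root / "evals" / skill.name / "cases.jsonl").is_file():
            problems.append(f"evals/{skill.name}/cases.jsonl is missing")
    return problems
```

An empty list means the tree matches the expected shape; anything else is worth fixing before the dry run.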

2. Install the CLI and sign in

npm i -g telvine
telvine login

telvine login opens the browser sign-in flow and stores a local CLI session.

3. Dry-run the publish

telvine publish ./my-plugin --dry-run

The dry run should show the plugin manifest, component inventory, discovered Skills, and eval suites. Fix manifest paths, missing SKILL.md files, or eval case formatting before publishing.
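
Eval case formatting problems are often plain JSONL mistakes that can be caught before the dry run. The sketch below checks only generic well-formedness (one JSON object per line). The case schema is not described here, so no field names are validated, and treating blank lines as errors is an assumption.

```python
import json


def jsonl_errors(text):
    """Return (line_number, message) pairs for lines that are not JSON objects."""
    errors = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # Assumption: blank lines are not allowed in cases.jsonl.
            errors.append((number, "blank line"))
            continue
        try:
            value = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(value, dict):
            errors.append((number, "expected a JSON object"))
    return errors
```

Running it over the contents of evals/my-skill/cases.jsonl and fixing every reported line should remove this class of dry-run failure.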

4. Publish the plugin

telvine publish ./my-plugin

The publish command registers the plugin, creates a version, imports eval suites, and prints a plugin-scoped write key. The key is shown only once, so store it and the plugin ID in the runtime environment right away:

export TELVINE_PLUGIN_ID=plg_...
export TELVINE_WRITE_KEY=tel_wk_...

Do not commit write keys to the plugin repo.
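
Both rules can be enforced with a small amount of glue code. This is a sketch under stated assumptions: load_telvine_env and find_committed_keys are hypothetical helpers, and the key pattern is guessed from the tel_wk_ prefix in the export example, so the real key format may differ.

```python
import os
import re
from pathlib import Path

# Assumption: write keys start with "tel_wk_"; the allowed characters and
# minimum length after the prefix are guesses, not a documented format.
WRITE_KEY_PATTERN = re.compile(r"tel_wk_[A-Za-z0-9_-]{8,}")

REQUIRED_VARS = ("TELVINE_PLUGIN_ID", "TELVINE_WRITE_KEY")


def load_telvine_env(environ=None):
    """Return (plugin_id, write_key), or raise if either variable is unset or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return environ["TELVINE_PLUGIN_ID"], environ["TELVINE_WRITE_KEY"]


def find_committed_keys(root):
    """Return relative paths of files under root that contain a key-shaped string."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the .git directory.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue
        if WRITE_KEY_PATTERN.search(text):
            hits.append(path.relative_to(root).as_posix())
    return hits
```

Calling load_telvine_env() at startup fails fast when the runtime is misconfigured, and find_committed_keys("./my-plugin") can run as a pre-commit or CI check.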

5. Verify first production events

Run the plugin once in the target harness. The first observed run should emit plugin.install exactly once per installation ID, followed by Skill or component events:

skill.invocation.start, skill.invocation.end, skill.invocation.error
plugin.component.invoked, plugin.component.error
feedback.submitted when a user or reviewer provides feedback
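
The expectations for a first run can be written down as a check over observed events. The sketch below assumes each event is available as a dict with "name" and "installation_id" keys. That record shape is hypothetical, and only the event names come from the list above.

```python
from collections import Counter

SKILL_OR_COMPONENT_EVENTS = {
    "skill.invocation.start",
    "skill.invocation.end",
    "skill.invocation.error",
    "plugin.component.invoked",
    "plugin.component.error",
}


def check_first_run(events):
    """Return a list of problems with the observed event stream."""
    problems = []

    # plugin.install should appear exactly once per installation id.
    installs = Counter(e["installation_id"] for e in events if e["name"] == "plugin.install")
    if not installs:
        problems.append("no plugin.install event observed")
    for installation_id, count in sorted(installs.items()):
        if count > 1:
            problems.append(f"plugin.install emitted {count} times for {installation_id}")

    # At least one Skill or component event should follow.
    if not any(e["name"] in SKILL_OR_COMPONENT_EVENTS for e in events):
        problems.append("no Skill or component event observed")
    return problems
```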

Then inspect the plugin from the CLI:

telvine plugins metrics <plugin-id> --since 1h
telvine plugins events <plugin-id> --limit 20

You should see metadata-only events. Do not send prompts, file contents, connector payloads, tool arguments, browser captures, retrieved records, or model outputs.
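
One way to guard the metadata-only rule is to scan each event payload before it is sent. The sketch below is a heuristic deny-list, not a Telvine feature: the key names in FORBIDDEN_KEYS are hypothetical stand-ins for the categories listed above and should be adapted to the field names your integration actually uses.

```python
# Assumption: these key names are illustrative; real payloads may name
# the same kinds of content differently.
FORBIDDEN_KEYS = {
    "prompt",
    "file_contents",
    "connector_payload",
    "tool_arguments",
    "browser_capture",
    "retrieved_records",
    "model_output",
}


def forbidden_paths(payload, prefix=""):
    """Return dotted paths of every forbidden key found anywhere in the payload."""
    found = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if str(key).lower() in FORBIDDEN_KEYS:
                found.append(path)
            found.extend(forbidden_paths(value, path))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            found.extend(forbidden_paths(item, f"{prefix}[{index}]"))
    return found
```

A key-name check cannot detect sensitive content stored under an innocuous name, so it complements a manual review of the emitted events rather than replacing it.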

6. Add the first eval loop

Add human-review eval cases for the most important workflow checkpoints:

evals/my-skill/cases.jsonl
evals/my-skill/rubric.md

Keep fixtures synthetic or explicitly approved for testing. Use evals to compare the current plugin version against the next version before rollout.
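
Comparing versions can be as simple as listing the cases where the next version scores worse. This sketch assumes human reviewers produce one numeric score per case id for each version. The dict-based input and the regressions helper are hypothetical and do not reflect any Telvine export format.

```python
def regressions(current_scores, next_scores, tolerance=0.0):
    """Return case ids that score lower in the next version, or are missing from it."""
    regressed = []
    for case_id, current in sorted(current_scores.items()):
        candidate = next_scores.get(case_id)
        # A case with no score in the next version is treated as a regression.
        if candidate is None or candidate < current - tolerance:
            regressed.append(case_id)
    return regressed
```

An empty result is a reasonable gate for rollout; a non-empty one points reviewers at the specific cases to re-examine.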

Leave agent experience feedback enabled unless the suite is not suitable for self-review. The harness should ask the AI agent using the plugin to rate usability and submit only the 1-5 score plus a short metadata-only note. When starting from the Telvine Create Plugin workflow, keep the generated agent-experience-reviewer Skill or equivalent harness prompt unless the plugin owner explicitly opts out.
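
The constraint on agent feedback (a 1-5 score plus a short note) is easy to validate before submission. The sketch below is illustrative only: validate_agent_feedback is a hypothetical helper, and the 280-character limit is an assumed reading of "short", not a documented limit.

```python
MAX_NOTE_CHARS = 280  # Assumption: the source only says the note should be short.


def validate_agent_feedback(score, note):
    """Return a list of problems with an agent-experience feedback submission."""
    problems = []
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
        problems.append("score must be an integer from 1 to 5")
    if len(note) > MAX_NOTE_CHARS:
        problems.append(f"note must be at most {MAX_NOTE_CHARS} characters")
    return problems
```

Length and range checks do not prove the note is metadata-only, so the note text still needs the same review as any other telemetry field.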

If a plugin owner enables a public listing, Telvine serves a minimal page that contains distribution-safe metadata and harness install instructions. Public listings are opt-in per plugin and return 404 when disabled.

Use the listing as a distribution aid, not as an authority for private telemetry, source contents, eval results, or customer usage.

Exit criteria

The plugin is production-ready when:

The plugin publishes cleanly without dry-run warnings.
The runtime has TELVINE_PLUGIN_ID and TELVINE_WRITE_KEY.
The first run produces plugin.install and at least one Skill/component event.
The dashboard and CLI show recent metrics for the plugin.
At least one eval suite exists for the workflow that matters most.
Telemetry has been reviewed for metadata-only compliance.