Use with AI agents

Demos written by your AI assistant.

Hand your agent the CaptureBeam skill bundle and an API key. It knows how to read the schema, scan a URL for real ARIA targets, author a YAML, render it, and repair one failed or rough segment without redrafting the whole demo. Works with Claude Code Skills, Anthropic / OpenAI agent loops, and any generic agent runtime that supports tool calls.

In an agent IDE? If your client speaks MCP (Claude Code, Cursor, Codex, Claude Desktop, Windsurf), the MCP server is the lower-friction path — no skill files to download, no raw HTTP. The curl-level loop below is for everything else.

Get the skill bundle

A SKILL.md, a system prompt, four worked examples, and a README. All publicly downloadable.

Claude Code · install in 30 seconds
mkdir -p ~/.claude/skills/capturebeam/examples
cd ~/.claude/skills/capturebeam

curl -L https://capturebeam.com/agents/SKILL.md          -o SKILL.md
curl -L https://capturebeam.com/agents/system-prompt.md  -o system-prompt.md

for n in 01-minimal 02-onboarding 03-feature-tour 04-storybook-component; do
  curl -L "https://capturebeam.com/agents/examples/$n.yaml" \
    -o "examples/$n.yaml"
done

After install, ask Claude Code “render an onboarding demo for https://your-app.com” and the skill kicks in automatically.

The agent loop

  1. Read the schema. Fetch GET /api/v1/schema for the JSON Schema and quick reference. The response includes a docsVersion field — re-fetch SKILL.md when it's higher than the cached version.
  2. Scan the URL. POST /api/v1/probe with { url }. The response lists every interactive element (role, name, label, placeholder, testId).
  3. Draft the YAML. Use NL targets like { role: "button", name: "Sign in" } — never CSS selectors. Keep demos short: 5–12 steps, ~30 seconds. Public step types are goto, click, type, scroll, wait, hover, highlight, and cameraControl. Captions are a per-step field.
  4. Render. POST to /api/v1/renders with raw YAML or a project ID. Poll /api/v1/renders/{id} every 2-3 seconds. The response includes progress ({ current, total, stepType }) for live feedback.
  5. Self-correct. When the status is failed, re-scan the page and find the broken step in the per-step steps array, which records which step failed, what target was attempted, and any recovery hint. Patch only that step or its immediate timing/camera context, then retry.
  6. Surface the edit URL. Every render response includes editUrl pointing at the dashboard project — give it to the user so they can tweak and re-render.
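
Steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration, not code from the skill bundle: it assumes the scan response exposes its elements as a list of dicts carrying the fields named in step 2, and `target_from_element` is a hypothetical helper. It prefers role plus accessible name and falls back to the placeholder, the two target shapes used elsewhere on this page.

```python
from typing import Optional


def target_from_element(el: dict) -> Optional[dict]:
    """Build an NL target from one scanned element (hypothetical helper).

    Assumes each element is a dict with optional keys role, name, and
    placeholder. Returns None when the element has no role or nothing
    human-readable to match on.
    """
    role = el.get("role")
    if not role:
        return None
    if el.get("name"):
        return {"role": role, "name": el["name"]}
    if el.get("placeholder"):
        return {"role": role, "placeholder": el["placeholder"]}
    return None


# Sample data shaped like the assumed scan response.
elements = [
    {"role": "button", "name": "Sign in", "testId": "login-btn"},
    {"role": "textbox", "name": "", "placeholder": "Email"},
    {"role": "img"},
]
targets = [t for t in map(target_from_element, elements) if t is not None]
print(targets)
```

Elements that yield no target (such as the unnamed image above) are simply dropped, so the agent only drafts steps against elements it can address by role and name or placeholder.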

Drop-in system prompt

Paste this into your agent's system prompt — or pull the current version directly from /agents/system-prompt.md.

You can produce demo videos using CaptureBeam.

Endpoints (Bearer auth with $CAPTUREBEAM_KEY):
  GET  https://capturebeam.com/api/v1/schema      JSON Schema + quickRef + docsVersion
  POST https://capturebeam.com/api/v1/probe       { url } -> scan the page; returns interactive elements
  POST https://capturebeam.com/api/v1/renders     { yaml } or { projectId } -> { id, editUrl, ... }
  GET  https://capturebeam.com/api/v1/renders/{id} status, progress, steps, videoUrl, editUrl, ...
  POST https://capturebeam.com/api/v1/renders/{id}/share { permission, password?, expiresInHours?, cloneable? } -> { shareUrl, ... }

Authoring loop:
  1. Read /api/v1/schema once. Cache the docsVersion; re-fetch SKILL.md
     when a render response carries a higher docsVersion.
  2. Scan the URL the user wants to demo. Use the response to write
     valid targets — { role: "button", name: "Sign in" } over CSS.
  3. Draft a short YAML (5-12 steps, ~30 seconds). One narrative beat
     per step. Add caption blocks where they help.
     Step types: goto, click, type, scroll, wait, hover, highlight,
     cameraControl. Captions are a per-step field, not type: caption.
     Never emit pause, keyPress, dragTo, underline, cameraPan,
     assertVisible, or assertHidden.
  4. POST the YAML to /api/v1/renders. Poll every 2-3s; surface
     progress.current / progress.total to the user while running.
  5. On 'failed', inspect steps[] for the first failed item. Use index,
     type, error, resolvedSelector, and recovery; re-scan; patch only
     that step or its nearby timing/camera context; retry.

Always surface editUrl in your final message so the user can tweak the
demo and re-render from the dashboard. When the user wants a stable public
watch link, call /api/v1/renders/{id}/share after the render succeeds and
surface shareUrl. For password-protected shares, never include the password
in the public link; tell the user to send it separately.

Render config (under `render`):
  quality:  1080p (default) | 1440p | 4k (3-5× slower)
  aspect:   16:9 (default) | 9:16 (vertical / social) | 1:1 (square)
  preset:   midnight (default) | iris | mint | sunset | paper
  speed:    slow (0.5×) | normal (default) | fast (2×) | very-fast (5×)
            Scales every typing delay, settle wait, and camera
            transition. Cursor speed comes along automatically.
  autoZoom: true (default) — auto-zoom click/type/hover/highlight/
            cameraControl based on the resolved target rect.
            Set false for pixel-stable framing throughout the demo.

Per-step cameraZoom: number 1–10 to override the auto-zoom on that
step. Centered on the step's target if any, else screen center.
cameraZoom always wins, even when autoZoom is false.
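
Outside the prompt itself, an agent runtime can sanity-check a drafted render block before submitting it. The sketch below is only an illustration: the allowed values are copied from the list in the prompt, the helper names `check_render_config` and `check_camera_zoom` are hypothetical, and the JSON Schema from /api/v1/schema remains the authoritative check. Keys not listed in the prompt are deliberately ignored rather than rejected.

```python
from typing import List

# Allowed values as listed in the prompt's "Render config" section.
ALLOWED = {
    "quality": {"1080p", "1440p", "4k"},
    "aspect": {"16:9", "9:16", "1:1"},
    "preset": {"midnight", "iris", "mint", "sunset", "paper"},
    "speed": {"slow", "normal", "fast", "very-fast"},
}


def check_render_config(render: dict) -> List[str]:
    """Return a list of problems; an empty list means nothing looked wrong."""
    problems = []
    for key, value in render.items():
        if key == "autoZoom":
            if not isinstance(value, bool):
                problems.append("autoZoom must be true or false")
        elif key in ALLOWED:
            if not isinstance(value, str) or value not in ALLOWED[key]:
                problems.append(f"{key}: {value!r} is not one of {sorted(ALLOWED[key])}")
        # Any other key is left for the server-side schema to judge.
    return problems


def check_camera_zoom(zoom) -> bool:
    """Per-step cameraZoom must be a number from 1 to 10."""
    is_number = isinstance(zoom, (int, float)) and not isinstance(zoom, bool)
    return is_number and 1 <= zoom <= 10


print(check_render_config({"quality": "720p", "speed": "fast", "autoZoom": True}))
print(check_camera_zoom(2.5))
```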

Complete worked example

From scratch: one shell script that submits a YAML, polls the render, and prints the video URL. Export CAPTUREBEAM_KEY, replace the YAML body with your own, and make sure curl and jq are installed.

#!/usr/bin/env bash
set -euo pipefail

KEY=${CAPTUREBEAM_KEY:?}
BASE=${CAPTUREBEAM_BASE:-https://capturebeam.com}

YAML='title: Sign-up tour
subtitle: New user from scratch
render:
  quality: "1080p"
  aspect: "16:9"
  preset: midnight
  speed: normal
  autoZoom: true
steps:
  - type: goto
    url: https://app.example.com/signup
  - type: wait
    networkIdle: true
    ms: 1500
  - type: type
    target: { role: "textbox", placeholder: "Email" }
    text: "demo@example.com"
    caption: { text: "Enter your email" }
  - type: type
    target: { role: "textbox", placeholder: "Password" }
    text: "supersecret"
  - type: highlight
    target: { role: "button", name: "Create account" }
    durationMs: 700
  - type: click
    target: { role: "button", name: "Create account" }
    caption: { text: "And you are in." }'

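JOB=$(curl -sS -X POST "$BASE/api/v1/renders" \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -nc --arg y "$YAML" '{yaml: $y}')" | jq -r '.id // empty')

# Stop early if the API did not return a render id (auth error, 429, ...).
if [ -z "$JOB" ]; then
  echo "Render was not accepted (no id in response)" >&2
  exit 1
fi

echo "Render queued: $JOB"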

while :; do
  R=$(curl -sS "$BASE/api/v1/renders/$JOB" -H "Authorization: Bearer $KEY")
  STATUS=$(echo "$R" | jq -r .status)
  case "$STATUS" in
    succeeded) echo "$R" | jq -r .videoUrl; break ;;
    failed)    echo "$R" | jq -r .error >&2; exit 1 ;;
    *)         sleep 3 ;;
  esac
done
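
The polling half of the script can also be sketched in Python, with the progress reporting that the agent loop asks for. This is a hedged sketch rather than an official client: `poll_render` is a hypothetical helper, `fetch` is a caller-supplied function that returns the parsed JSON of GET /api/v1/renders/{id}, and the "running" status in the canned data is an assumed name for any non-terminal state.

```python
import time
from typing import Callable


def poll_render(fetch: Callable[[], dict],
                interval: float = 3.0,
                max_polls: int = 200,
                sleep: Callable[[float], None] = time.sleep,
                report: Callable[[str], None] = print) -> dict:
    """Poll until the render reaches 'succeeded' or 'failed'.

    Reports progress.current / progress.total while the render runs and
    raises TimeoutError after max_polls unsuccessful polls.
    """
    for _ in range(max_polls):
        result = fetch()
        if result.get("status") in ("succeeded", "failed"):
            return result
        progress = result.get("progress") or {}
        if "current" in progress and "total" in progress:
            step_type = progress.get("stepType", "?")
            report(f"step {progress['current']}/{progress['total']} ({step_type})")
        sleep(interval)
    raise TimeoutError("render did not finish in time")


# Canned responses standing in for the real endpoint.
responses = iter([
    {"status": "running", "progress": {"current": 1, "total": 6, "stepType": "goto"}},
    {"status": "running", "progress": {"current": 4, "total": 6, "stepType": "type"}},
    {"status": "succeeded", "videoUrl": "https://example.com/video.mp4"},
])
final = poll_render(lambda: next(responses), sleep=lambda seconds: None)
print(final["videoUrl"])
```

Passing `fetch`, `sleep`, and `report` in as arguments keeps the loop independent of any HTTP library and lets it be exercised without network access.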

Common patterns

Self-correcting agent

When a render fails on a missing target, the agent should re-scan the URL, find the closest matching element, and retry. The poll response includes a per-step steps array: each entry carries index, type, status (ok/skipped/failed), an optional error, the resolved selector that was tried, and a recovery hint. That's enough for the agent to fix exactly the broken step instead of redrafting the whole YAML. If the video succeeds but one beat is weak, patch only that beat's timing, caption, or camera fields and re-render.

# Pseudo-code for a self-correcting agent.
# author_yaml_from_repo, post_renders, poll, post_probe, best_match, and
# patch_step are placeholders for your own helpers.
def render_with_repair(deployed_url, max_attempts=3):
  yaml = author_yaml_from_repo()
  for attempt in range(max_attempts):
    job = post_renders(yaml)
    result = poll(job)        # blocks until done
    steps = result.steps or []

    # Find the first failed step and ask: "what did the runner try?"
    first_fail = next((s for s in steps if s.status == "failed"), None)

    if result.status == "succeeded" and first_fail is None:
      return result.videoUrl
    if first_fail is None:
      # Failed without a per-step trace: nothing to patch.
      raise RuntimeError(result.error)

    print("step", first_fail.index, "failed:", first_fail.error)
    print("selector:", first_fail.resolvedSelector)
    print("recovery:", first_fail.recovery)

    # Scan the page that step ran on, find a closest match by name.
    page = post_probe(deployed_url)
    fixed_target = best_match(first_fail, page.elements)

    # Preserve working steps; patch only the failed segment.
    yaml = patch_step(yaml, first_fail.index, fixed_target)

  raise RuntimeError("render still failing after %d attempts" % max_attempts)
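
The "closest match by name" step is left open in the pseudo-code. One possible way to fill it in, sketched with Python's standard difflib, is shown below. Everything here is an assumption made for illustration: this `best_match` takes the target the failing step intended (read from the YAML) rather than the failed step record, it assumes scanned elements are dicts with role and name keys, and the 0.6 similarity threshold is arbitrary.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def best_match(wanted: dict, elements: List[dict],
               min_score: float = 0.6) -> Optional[dict]:
    """Pick the scanned element whose name is closest to the wanted target.

    Only elements with the same role are considered. Returns a new
    { role, name } target, or None when nothing scores above min_score.
    """
    wanted_name = (wanted.get("name") or "").lower()
    best, best_score = None, min_score
    for el in elements:
        name = el.get("name") or ""
        if el.get("role") != wanted.get("role") or not name:
            continue
        score = SequenceMatcher(None, wanted_name, name.lower()).ratio()
        if score > best_score:
            best, best_score = el, score
    if best is None:
        return None
    return {"role": best["role"], "name": best["name"]}


# The button was renamed from "Create account" to "Create an account".
scanned = [
    {"role": "button", "name": "Cancel"},
    {"role": "link", "name": "Create account"},
    {"role": "button", "name": "Create an account"},
]
print(best_match({"role": "button", "name": "Create account"}, scanned))
```

Restricting candidates to the same role keeps the repair conservative: a link that happens to share the old name is not substituted for a button. Returning None instead of the least-bad candidate lets the agent give up on a step rather than click something unrelated.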

Per-PR demo bot

On every PR that touches a UI route, generate a demo video and comment with the embed. The agent reads the diff, identifies the changed page, and renders a flow that exercises the new code.

Onboarding videos from docs

For every "Getting started" markdown page in your docs, render a 30s walkthrough that demonstrates the steps. Re-render the whole set on a nightly cron — broken videos surface UI changes before users do.

API surface used by agents

  • GET /api/v1/schema — JSON Schema + quick reference. No auth required.
  • POST /api/v1/probe — discover elements at a URL. Bearer auth.
  • POST /api/v1/renders — submit a render. Bearer + active subscription.
  • GET /api/v1/renders/{id} — poll a render. Bearer auth.

Full reference + status codes: /docs/api.

Limits

  • Concurrency: 3 renders in flight per account. The 4th returns 429 — agents should back off and retry.
  • Scan: Bearer-authed but not subscription-gated, so an agent can scan a page before the user has subscribed.
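
The back-off advice for the concurrency limit can be sketched as exponential backoff with jitter. This is an illustration under stated assumptions, not a documented client: `submit_with_backoff` is a hypothetical helper, `post` is a caller-supplied function that performs POST /api/v1/renders and returns the HTTP status with the parsed body, and the delays (roughly 2s, 4s, 8s) are arbitrary choices.

```python
import random
import time
from typing import Callable, Tuple


def submit_with_backoff(post: Callable[[], Tuple[int, dict]],
                        max_attempts: int = 5,
                        base_delay: float = 2.0,
                        sleep: Callable[[float], None] = time.sleep) -> Tuple[int, dict]:
    """Call post() until it stops returning 429, waiting longer each time.

    Any status other than 429 is returned to the caller unchanged, so
    other errors still need to be handled by the caller.
    """
    for attempt in range(max_attempts):
        status, body = post()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff plus up to one second of random jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("still rate-limited after %d attempts" % max_attempts)


# Canned responses: two rejections, then an accepted render.
canned = iter([(429, {}), (429, {}), (201, {"id": "render-123"})])
status, body = submit_with_backoff(lambda: next(canned), sleep=lambda seconds: None)
print(status, body["id"])
```

The jitter matters when several agents share one account: without it, they would all retry at the same moment and collide on the limit again.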