Turn prompts into metrics.
Measure how well your agent actually performs.
Wrap your LLM calls with Agent Plasticity. We break down prompts into measurable metrics and score every response against them.
```typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const { text } = await generateText({
  model: openai("gpt-5.2"),
  system: "Be concise. Use markdown. Cite sources.",
  prompt: userQuery,
});

// send to Agent Plasticity
plasticity.track({
  prompt: userQuery,
  output: text,
  model: "gpt-5.2",
});
```

How it works
Collect
Add one line to your agent. We collect every LLM call with zero latency impact.
Extract
We read your prompts and break them down into measurable metrics.
Score
Every output is scored on every metric. See trends, catch regressions, improve.
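The extract and score steps above can be sketched in miniature. This is a hypothetical illustration with toy heuristics, not Agent Plasticity's actual internals; `extractMetrics`, `score`, and the per-sentence splitting are all illustrative assumptions:

```typescript
// Hypothetical sketch of prompt-to-metric extraction and scoring.
// Names and heuristics are illustrative, not the real product internals.

type Metric = { name: string; instruction: string };

// Treat each instruction sentence in the system prompt as one metric.
function extractMetrics(systemPrompt: string): Metric[] {
  return systemPrompt
    .split(".")
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((instruction, i) => ({ name: `metric_${i}`, instruction }));
}

// Score one output against one metric with simple toy checks.
function score(metric: Metric, output: string): number {
  if (/concise/i.test(metric.instruction)) {
    return output.length < 500 ? 1 : 0; // short outputs pass "be concise"
  }
  if (/markdown/i.test(metric.instruction)) {
    return /[#*`]/.test(output) ? 1 : 0; // any markdown syntax passes
  }
  if (/cite/i.test(metric.instruction)) {
    return /\[\d+\]|https?:\/\//.test(output) ? 1 : 0; // a link or [n] passes
  }
  return 1; // unrecognized instruction: pass by default
}

const metrics = extractMetrics("Be concise. Use markdown. Cite sources.");
const output = "## Summary\n\nShort answer with a source: https://example.com";
const scores = metrics.map((m) => ({ [m.instruction]: score(m, output) }));
```

A real extractor would need an LLM or classifier to handle arbitrary instructions; the point is only that each metric originates from a concrete line of the prompt.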
Every score maps back to your prompt
We score each metric against the instruction it came from. Not generic quality, your specific expectations.
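One way to picture that mapping, with hypothetical field names and made-up example scores (this is not the actual Agent Plasticity API):

```typescript
// Hypothetical report shape: every score carries the prompt
// instruction it was derived from. Field names are illustrative.
type MetricScore = {
  metric: string;            // e.g. "conciseness"
  sourceInstruction: string; // the prompt line this metric came from
  score: number;             // 0..1
};

// Example scores, invented for illustration only.
const report: MetricScore[] = [
  { metric: "conciseness", sourceInstruction: "Be concise.", score: 0.9 },
  { metric: "formatting", sourceInstruction: "Use markdown.", score: 1.0 },
  { metric: "citations", sourceInstruction: "Cite sources.", score: 0.5 },
];

// Flag every metric below a threshold, traced back to its instruction.
const failing = report
  .filter((m) => m.score < 0.8)
  .map((m) => `${m.metric} ← "${m.sourceInstruction}"`);
```

Because each score keeps a pointer to its source instruction, a regression report can say which line of your prompt the agent stopped honoring, rather than reporting a generic quality dip.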
Second paragraph restates the introduction. Could be 40% shorter.
Proper heading hierarchy, code blocks, and bullet lists throughout.
Two of four claims cite sources. BMI threshold is uncited.
Your prompts already define what good looks like.
We turn that into something you can measure.
Get started free
Free to start. No credit card required.