How to Write Integration Metadata that AI Agents Can Use Correctly

You and your team have spent years perfecting your product's UX, so your customers enjoy using it. Now, AI agents are using your product (or at least its integrations), and they need a similar level of attention.

The difference is what "experience" means for an agent. A person using your integration marketplace sees UI labels, tooltips, and onboarding steps. An AI agent using your integration's flows as tools sees three things: your flows’ names, their descriptions, and their JSON invocation schemas (and maybe their result schemas). That's all the context the agent has when deciding which flow (tool) to call, what data to input, and what to do with the results.

AX (agent experience): The measure of how well AI agents can understand and successfully interact with your product.

If that metadata is vague, the agent guesses, and that guess will be wrong more than 0% of the time. In B2B integrations, where a poor guess can result in an incorrect invoice being sent to a customer or a record being overwritten with the wrong data, we need to eliminate guesswork. Otherwise, the AX won't be as wonderful as the UX.

Agents infer rather than understand

When a person sees two buttons labeled "Draft Invoice" and "Issue Invoice," they bring years of context to bear. Draft means editable, internal, and not yet final. Issue means finalized and legally in motion. That distinction is obvious.

However, an agent viewing two tools with similar names tries to match them to a user's prompt: "Create an invoice but don't send it yet." If your metadata doesn't make the distinction explicit, the agent picks the most plausible-sounding option (with a bit of RNG guessing!). And if it guesses wrong, you've sent a real invoice to a real customer that you shouldn't have.

From a dev perspective, when you create integration flow metadata, you are defining the logic that governs the AI's decisions. The names, descriptions, and schemas you write aren't only documentation. They are also the critical instructions that govern agent behavior. You need to give them the same careful attention you'd give a public API contract.

What the agent reads

We’ll build our examples around the Prismatic MCP flow server, because that’s what we know. Different integration setups will require adjustments.

When your customer's AI agent connects to Prismatic's MCP flow server, it queries that customer's integration instance and receives structured tool definitions. Each tool definition includes:

The flow's name (from the title in your invocation schema)
A description of what the flow does (from the $comment field)
The invocation schema (the shape of the payload the tool expects)
An optional result schema describing what the flow returns

That's it. No other context. Just those fields, read by a model that will select a tool within moments.
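As a rough illustration of that mapping, the sketch below assembles the view an agent might receive from an invocation schema. The `to_tool_definition` helper is hypothetical, and the `name`, `description`, `inputSchema`, and `outputSchema` keys are assumptions modeled on the general shape of MCP tool listings, not Prismatic's actual implementation.

```python
from typing import Any, Optional


def to_tool_definition(
    invocation_schema: dict[str, Any],
    result_schema: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build the metadata an agent would see for one flow (hypothetical helper)."""
    tool = {
        # The flow's name comes from the schema title.
        "name": invocation_schema.get("title", ""),
        # The flow's description comes from the $comment field.
        "description": invocation_schema.get("$comment", ""),
        # The payload shape is the invocation schema itself.
        "inputSchema": invocation_schema,
    }
    if result_schema is not None:
        # The result schema is optional, so only attach it when provided.
        tool["outputSchema"] = result_schema
    return tool


schema = {
    "title": "search-contacts-in-acme",
    "$comment": "Searches for contact records by name in Acme CRM.",
    "type": "object",
    "properties": {"first": {"type": "string"}},
}
tool = to_tool_definition(schema)
print(tool["name"], "-", tool["description"])
```

Anything that is not in one of these fields is invisible to the agent, which is why an empty title or $comment leaves it with nothing to reason from.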

Here's a properly formed invocation schema for a contact search flow:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "search-contacts-in-acme",
  "$comment": "Searches for contact records by name in Acme CRM. Returns an empty array if no contacts match.",
  "type": "object",
  "properties": {
    "first": {
      "description": "The contact's first name. Partial matches are supported.",
      "type": "string"
    },
    "last": {
      "description": "The contact's last name. Partial matches are supported.",
      "type": "string"
    }
  },
  "required": []
}

And here's the same schema the way most devs would actually write it:

{
  "title": "contact-search",
  "$comment": "Searches contacts",
  "type": "object",
  "properties": {
    "first": {"type": "string"},
    "last": {"type": "string"}
  }
}

The agent reading the second version learns almost nothing. What system? Read-only, or does it create a record if no match is found? Are both fields required? So, it guesses. And that's not usually a good thing. But the first version makes guessing unnecessary.

Think of your invocation schema as the contract between your integration flow and the LLM. Explicit contracts lead to useful results. Vague ones do the opposite.
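One way to catch thin metadata before an agent does is a small lint pass over each schema. The following is a minimal sketch; `audit_schema` and its eight-word threshold are illustrative assumptions, not part of any Prismatic tooling.

```python
from typing import Any


def audit_schema(schema: dict[str, Any], min_comment_words: int = 8) -> list[str]:
    """Return human-readable warnings about thin metadata (heuristic only)."""
    warnings = []
    comment = schema.get("$comment", "")
    if len(comment.split()) < min_comment_words:
        warnings.append("$comment is missing or too short to explain the flow")
    if "required" not in schema:
        warnings.append("'required' is not declared, so optionality is ambiguous")
    for name, prop in schema.get("properties", {}).items():
        if not prop.get("description", "").strip():
            warnings.append(f"property '{name}' has no description")
    return warnings


weak = {
    "title": "contact-search",
    "$comment": "Searches contacts",
    "type": "object",
    "properties": {"first": {"type": "string"}, "last": {"type": "string"}},
}
for warning in audit_schema(weak):
    print(warning)
```

A check like this only measures whether metadata is present, not whether it is accurate, so it complements review rather than replacing it.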

Let's get explicit.

Optimize flow names for meaning

Flow names optimized for human dashboards tend to be short and contextually obvious because humans have UI and other tools to fill in the gaps. An AI agent has only the name. As a result, the conventions that work well for UX are less than helpful for AX.

For AX, here's a naming pattern that works: [verb]-[object]-[system]

Human-friendly	Agent-friendly
Invoice Actions	(too vague to use)
CRM Sync	sync-acme-contact-to-salesforce
Send Notification	send-invoice-alert-via-slack
Update Record	update-opportunity-stage-in-acme
Handle Invoice	create-draft-invoice-in-acme

The pattern gives the agent three critical pieces of information at a glance: the action being taken, the type of data it operates on, and the system involved. That last part matters especially when you have flows that touch multiple CRMs, ERPs, or billing platforms. The agent needs to know whether it's talking to QuickBooks or NetSuite before it decides to call anything.

Here are some naming rules (not guidelines):

Use imperative verbs, not nouns – create-draft-invoice is better than invoice-creation. It reads as a command, which is how agents think about tools.
Be specific about state and consequence – create-draft-invoice-in-quickbooks and finalize-and-send-invoice-in-quickbooks should be different flows with different names, not the same flow with a mode parameter. (More on this later.)
Avoid abbreviations – upd-opp-sfdc saves a handful of characters and costs you reliability. In the grand scheme of things, a few extra tokens won’t run up your LLM bill.
Treat your flow name like a public method name – You wouldn't name a function do_stuff(). Don't name a flow process-data.
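The naming rules above can be partly enforced in code. This is a sketch under stated assumptions: `check_flow_name` and the `KNOWN_VERBS` list are hypothetical, and the check only covers the kebab-case shape, the leading verb, and a minimum of three parts.

```python
import re

# Hypothetical verb list; extend it to match the operations your flows perform.
KNOWN_VERBS = {"create", "update", "delete", "search", "get", "send", "sync", "finalize"}
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def check_flow_name(name: str) -> list[str]:
    """Return a list of problems with a flow name; an empty list means it passes."""
    if not KEBAB_CASE.match(name):
        return ["name is not lowercase kebab-case"]
    problems = []
    parts = name.split("-")
    if parts[0] not in KNOWN_VERBS:
        problems.append(f"'{parts[0]}' is not a known imperative verb")
    if len(parts) < 3:
        problems.append("name needs at least a verb, an object, and a system")
    return problems


for candidate in ["create-draft-invoice-in-acme", "invoice-creation", "process-data"]:
    print(candidate, check_flow_name(candidate))
```

A verb allowlist also catches abbreviations such as upd as a side effect, because they never match a known verb.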

Write detailed descriptions

The $comment field in your invocation schema is your primary channel for communicating intent to the agent. Many devs treat it like a code comment: a quick note for whoever reads this next. Unfortunately, the agent doesn't find that helpful.

Treat the $comment field like a docstring that also serves as a safety constraint. A strong description answers five questions:

What does this flow do? Specific operation, not a category.
Which system does it interact with? Name the target explicitly.
What state should the data be in before calling this flow? Prerequisites.
What are the side effects? Writes, sends, triggers, charges, etc. In short, anything that's irreversible.
What does this flow NOT do? When adjacent flows cover similar territory.

That last question is the most commonly skipped (but no less important).

Weak descriptions

Creates a draft invoice
Issues an invoice

An agent trying to "create a new invoice for review" sees two options that both involve invoices. It picks the most plausible one.

Strong descriptions

Create draft invoice: "Creates a new invoice in Acme and saves it as draft. The invoice is NOT sent to the customer. Use this when an invoice needs to be staged for internal review. To finalize and send the invoice, call finalize-and-send-invoice-in-acme instead."
Finalize and send invoice: "Finalizes a draft invoice in Acme and transmits it to the customer immediately. This action is not reversible. The invoice will be sent upon execution. Only call this flow when the invoice has been reviewed and approved. To create a new draft invoice, call create-draft-invoice-in-acme instead."

Both descriptions cross-reference each other by name. That redundancy is deliberate. The agent shouldn't have to infer the difference because you've directly stated it.
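Cross-references are easy to forget when a new flow is added next to an old one. The sketch below checks that flows you consider adjacent mention each other by name; `missing_cross_references` and the idea of maintaining a list of adjacent pairs are assumptions for illustration.

```python
def missing_cross_references(
    descriptions: dict[str, str],
    adjacent_pairs: list[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Return (flow, unmentioned_neighbor) for each missing cross-reference."""
    missing = []
    for first, second in adjacent_pairs:
        if second not in descriptions[first]:
            missing.append((first, second))
        if first not in descriptions[second]:
            missing.append((second, first))
    return missing


descriptions = {
    "create-draft-invoice-in-acme": (
        "Creates a new invoice in Acme and saves it as draft. "
        "To finalize and send the invoice, call finalize-and-send-invoice-in-acme instead."
    ),
    # This description never points back to the draft flow.
    "finalize-and-send-invoice-in-acme": (
        "Finalizes a draft invoice in Acme and transmits it to the customer immediately."
    ),
}
pairs = [("create-draft-invoice-in-acme", "finalize-and-send-invoice-in-acme")]
print(missing_cross_references(descriptions, pairs))
```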

If you've ever received a support ticket because a human misunderstood what a flow did, that misunderstanding should be cleared up in the description. If it was confusing to a person with access to your docs, it will absolutely be confusing to an agent that only has access to the flow's metadata.

As a general guideline, try to keep descriptions under 200 characters where possible, because token efficiency does affect LLM performance. At the same time, don't sacrifice clarity for brevity. A 50-word description that prevents an agent from invoking the wrong flow is worth far more than an efficient 15-word description that doesn't.

Invocation schemas aren't just for validation

Engineers typically use JSON Schema for input validation. For agentic flows, schemas serve an equally critical second purpose: they are the agent's briefing document for every field it needs to populate.

Every description on every property is text that the LLM reads when deciding what value to pass. Skinny or absent descriptions force inference.

Write full sentences, not the bare minimum

The most common mistake:

"invoice_id": {
  "type": "string",
  "description": "invoice_id"
}

The description restates the field name, so the agent learns nothing new. Here's what it should say:

"invoice_id": {
  "type": "string",
  "description": "The unique identifier of the invoice in Acme. This is a UUID string found in the invoice URL, example: 'a3f7c291-88d4-4e12-b5a9-1c3f4b9de027'."
}

Now the agent knows the format, where to find the value if the user didn't provide it, and which system it belongs to.

Use contextual, specific field names. id tells the agent nothing. acme_company_id tells it exactly what kind of ID it needs, and from which system.

If the agent has already retrieved an Acme company ID in a previous step, it can map that value with confidence.
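Descriptions that merely echo the field name can be detected mechanically. A minimal sketch, where `restates_field_name` is a hypothetical helper that compares the words in the description with the words in the field name:

```python
import re


def restates_field_name(field_name: str, description: str) -> bool:
    """True when the description contains no words beyond the field name's own."""
    name_words = set(re.findall(r"[a-z0-9]+", field_name.lower()))
    description_words = set(re.findall(r"[a-z0-9]+", description.lower()))
    # An empty description is also treated as uninformative.
    return description_words <= name_words


print(restates_field_name("invoice_id", "invoice_id"))
print(restates_field_name("invoice_id", "The unique identifier of the invoice in Acme."))
```

The same comparison flags cosmetic variants such as "Invoice ID", which add capitalization but no information.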

Specify format constraints explicitly

JSON Schema's type: string indicates that a field is a string. It doesn't say whether the flow expects a date/time string, a number of seconds since the UNIX epoch, or something else. Add that context:

"start_date": {
  "type": "string",
  "description": "The start of the billing period. ISO 8601 date string, example: '2026-06-01'."
}
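The description tells the agent what to send; the flow can still verify what arrives. Below is a sketch of that check using Python's standard library. `parse_start_date` is a hypothetical helper, and the error wording is only a suggestion for a message an agent could act on.

```python
from datetime import date


def parse_start_date(value: str) -> date:
    """Parse an ISO 8601 calendar date such as '2026-06-01'."""
    try:
        return date.fromisoformat(value)
    except ValueError as error:
        # Repeat the expected format so the caller can correct the input.
        raise ValueError(
            f"start_date must be an ISO 8601 date like '2026-06-01', got {value!r}"
        ) from error


print(parse_start_date("2026-06-01"))
```

JSON Schema also offers a format keyword (for example "format": "date"), but many validators treat it as an annotation unless format assertion is enabled, so the prose description is still worth writing.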

Use enums for specificity, and describe each value

When an API expects open, pending, or resolved, don't leave the input as a raw string. Define an enum and describe when to use each value, not just what it is:

"invoice_status": {
  "type": "string",
  "enum": ["draft", "approved", "issued"],
  "description": "The lifecycle state of the invoice. Use 'draft' when creating for review. Use 'approved' after internal sign-off. Use 'issued' only when ready to transmit since this triggers an immediate send."
}

When the agent sees a fixed list of behavioral descriptions, it's much less likely to hallucinate a status like "in-progress" and generate an error.
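If an out-of-range value does slip through, an error that lists the allowed values gives the agent something concrete to retry with. A minimal sketch, assuming a hypothetical `enum_error` helper that runs inside your flow or in a pre-flight check:

```python
from typing import Any, Optional


def enum_error(field_name: str, field_schema: dict[str, Any], value: str) -> Optional[str]:
    """Return an actionable error message, or None when the value is allowed."""
    allowed = field_schema.get("enum")
    if allowed is None or value in allowed:
        return None
    options = ", ".join(repr(option) for option in allowed)
    return f"{field_name} must be one of {options}; got {value!r}"


invoice_status = {"type": "string", "enum": ["draft", "approved", "issued"]}
print(enum_error("invoice_status", invoice_status, "in-progress"))
```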

Mark required fields deliberately

Every field in required is a hard dependency. The agent must resolve it before calling the flow. If a field is optional and the flow handles its absence gracefully, leave it out of required. Don't add fields as required to indicate that they're important if the underlying process does not actually require them.
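The hard-dependency idea can be sketched as two small checks: one that lists what a payload still lacks, and one that catches required names that were never defined under properties. Both helper names are hypothetical.

```python
from typing import Any


def missing_required(schema: dict[str, Any], payload: dict[str, Any]) -> list[str]:
    """Return the required field names that the payload does not supply."""
    return [name for name in schema.get("required", []) if name not in payload]


def undeclared_required(schema: dict[str, Any]) -> list[str]:
    """Return required names that have no matching entry under properties."""
    properties = schema.get("properties", {})
    return [name for name in schema.get("required", []) if name not in properties]


schema = {
    "type": "object",
    "properties": {"invoice_id": {"type": "string"}, "memo": {"type": "string"}},
    "required": ["invoice_id"],
}
print(missing_required(schema, {"memo": "Net 30"}))
```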

Describe your result schema too

A useful result schema lets the agent understand and relay the response without having to guess what the fields mean. If your flow returns a status field, describe what each possible value indicates. If it returns an id, specify what that ID identifies and which system it belongs to:

{
  "title": "create-draft-invoice-result",
  "$comment": "Returns the newly created draft invoice record",
  "type": "object",
  "properties": {
    "invoice_id": {
      "type": "string",
      "description": "UUID of the newly created invoice in Acme. Use this ID in subsequent calls to update or finalize the invoice."
    },
    "status": {
      "type": "string",
      "description": "Current invoice status. Will always be 'draft' for responses from this flow."
    },
    "created_at": {
      "type": "string",
      "description": "ISO 8601 timestamp of when the invoice was created."
    }
  }
}

The agent can now tell the user: "I've created a draft invoice with ID a3f7c291...." It's beautifully deterministic.
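To see what a result schema contributes, it can help to pair each returned value with its documented meaning. The sketch below does that with a hypothetical `describe_result` helper; fields with no description are exactly the ones an agent would have to guess about.

```python
from typing import Any


def describe_result(result_schema: dict[str, Any], result: dict[str, Any]) -> list[str]:
    """Pair each returned field with the description from the result schema."""
    properties = result_schema.get("properties", {})
    lines = []
    for field, value in result.items():
        meaning = properties.get(field, {}).get("description", "no description provided")
        lines.append(f"{field} = {value!r}: {meaning}")
    return lines


result_schema = {
    "type": "object",
    "properties": {
        "status": {"type": "string", "description": "Current invoice status."},
    },
}
for line in describe_result(result_schema, {"status": "draft", "extra": 1}):
    print(line)
```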

One flow encompasses one operation

A common pattern is to build a flexible integration flow that accepts a mode or action parameter and branches internally: if action === 'create'... else if action === 'update'... This approach is completely logical, but it can add uncertainty when invoked by an agent.

The agent now has to make two decisions: choose the correct flow, then choose the correct mode. Your schema is now doing double duty as a router, and the description must cover multiple distinct operations without becoming an essay.

Split your multi-mode flows where possible. create-contact-in-acme and update-contact-in-acme should be separate tools. The routing logic lives in the agent's tool selection, not inside your flow. Yes, it means more flows. However, it ensures accurate outcomes. Agents perform more consistently with clearly scoped tools.
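As a starting point for such a split, the routing field can be removed mechanically and one schema emitted per mode. This is a sketch only: `split_by_action` is a hypothetical helper, and each resulting schema still needs its own hand-written $comment and field descriptions.

```python
import copy
from typing import Any


def split_by_action(
    schema: dict[str, Any], action_field: str, object_and_system: str
) -> list[dict[str, Any]]:
    """Derive one single-purpose schema per value of a mode-style enum field."""
    split_schemas = []
    for action in schema["properties"][action_field]["enum"]:
        single = copy.deepcopy(schema)
        # The agent no longer chooses a mode, so drop the routing field.
        del single["properties"][action_field]
        single["required"] = [f for f in schema.get("required", []) if f != action_field]
        single["title"] = f"{action}-{object_and_system}"
        # Left blank on purpose: each split flow needs its own description.
        single["$comment"] = ""
        split_schemas.append(single)
    return split_schemas


multi_mode = {
    "title": "manage-contact",
    "type": "object",
    "properties": {
        "action": {"type": "string", "enum": ["create", "update"]},
        "email": {"type": "string"},
    },
    "required": ["action", "email"],
}
print([s["title"] for s in split_by_action(multi_mode, "action", "contact-in-acme")])
```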

Clearly document idempotency and side effects

If your flow is safe to retry, say so. Agents operating under uncertainty may retry a call after a transient failure. If retrying creates duplicate records or sends duplicate emails, the agent has no way to know. Add it to the description: "This flow is idempotent. If a contact with the provided email already exists, the flow updates the existing record rather than creating a duplicate."

For non-idempotent flows: "This flow is NOT idempotent. Each call sends an email to the customer. Do not retry without confirming that the previous call failed."
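The idempotent case can be sketched as an upsert keyed on email, so that a retried call updates the existing record instead of adding a second one. The in-memory `contacts` dictionary and the `upsert_contact` helper are stand-ins for whatever your target system provides.

```python
from typing import Any


def upsert_contact(contacts: dict[str, dict[str, Any]], email: str, fields: dict[str, Any]) -> str:
    """Create or update a contact keyed by email; returns 'created' or 'updated'."""
    key = email.strip().lower()
    if key in contacts:
        contacts[key].update(fields)
        return "updated"
    contacts[key] = {"email": key, **fields}
    return "created"


contacts: dict[str, dict[str, Any]] = {}
first = upsert_contact(contacts, "pat@example.com", {"first": "Pat"})
retry = upsert_contact(contacts, "Pat@Example.com", {"first": "Pat"})
print(first, retry, len(contacts))
```

Normalizing the key matters here: without the lowercase step, a retry with different capitalization would create the duplicate the description promised to avoid.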

And, if your flow writes to multiple systems, sends a notification, charges a card, or triggers a webhook, document those actions. The agent cannot extrapolate side effects (at least not with total accuracy). It only sees what you put in the schema. If an operation is irreversible, make it plain.

Redundancy is a feature

When working with the UX for an app, repetition feels clunky. In integration metadata (intended for consumption by an agent), repetition improves accuracy.

If your flow name, $comment description, and field descriptions all say the same thing in slightly different ways, that's message reinforcement. An agent that reads "create-draft-invoice" in the name, "The invoice is NOT sent to the customer" in the $comment, and "use only when the invoice needs to be staged for review" in a field description is far less likely to prematurely finalize a transaction.

Keep your flow name, $comment, and field descriptions in sync. They should tell the same story. When you update one, update all three.
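A crude way to catch drift between a flow's name and its description is to check that every meaningful term in the name also appears in the description. The helper name and stopword list below are assumptions, and the substring match is deliberately loose.

```python
# Connector words that carry no meaning on their own.
STOPWORDS = {"in", "to", "via", "and", "for", "from"}


def name_terms_missing_from_description(name: str, description: str) -> list[str]:
    """Return terms from a kebab-case flow name that the description never uses."""
    text = description.lower()
    terms = [term for term in name.lower().split("-") if term not in STOPWORDS]
    # Substring matching lets 'create' match 'Creates'.
    return [term for term in terms if term not in text]


print(name_terms_missing_from_description(
    "create-draft-invoice-in-acme", "Creates an invoice."
))
```

Expect false positives from inflected words (for example "send" versus "sent"); treat the output as a prompt for review, not a verdict.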

Test your metadata as though it matters

You test your flow logic. Test your metadata too.

Prismatic's MCP test client guide walks through connecting Claude or another MCP-compatible agent to your integration's endpoint. Use it to verify that flows run and that agents call them for the right reasons.

Test with intent-driven prompts, not schema-valid inputs

Don't prompt the agent using the exact flow name. Prompt it the way a real user would. See what happens.

"Create a new invoice for Imaginary, $5,357.22, net 30." Does the agent call create-draft-invoice-in-acme? Or does it call finalize-and-send-invoice-in-acme because both descriptions mention invoices?
"Update the shipping address on order #29654." Does it call update-order-shipping-address? Or does it call a generic update-order flow and pass a field: "shipping_address" parameter your flow doesn't support?
"Who is the account owner for Imaginary in Acme?" Does it correctly call a read-only search flow, or does it try to write something?

Run adversarial prompts

Your customers are not deterministic people, and will phrase things in different ways. Try deliberately vague inputs: "Handle billing," "Process the invoice," or "Fix the customer record." See where the agent hesitates, asks a clarifying question, or picks the wrong path. Each failure should help you refine the metadata.

When an agent makes the wrong call, the problem is almost always in the metadata. Read the description through the lens of the incorrect choice it made. You'll find the ambiguity right away. Work through the metadata as if you were debugging, not documenting.
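The prompt checks described in this section can be kept as a small regression suite. In the sketch below, `select_tool` stands for whatever call asks your agent which flow it would invoke for a prompt; it is a hypothetical interface, and `fake_select_tool` is a stub so that the example runs without a model.

```python
from typing import Callable

# Each case pairs a natural-language prompt with the flow the agent should pick.
CASES = [
    ("Create a new invoice for Imaginary, $5,357.22, net 30.", "create-draft-invoice-in-acme"),
    ("Update the shipping address on order #29654.", "update-order-shipping-address"),
]


def find_misrouted_prompts(
    cases: list[tuple[str, str]], select_tool: Callable[[str], str]
) -> list[tuple[str, str, str]]:
    """Return (prompt, expected, chosen) for every prompt routed to the wrong flow."""
    failures = []
    for prompt, expected in cases:
        chosen = select_tool(prompt)
        if chosen != expected:
            failures.append((prompt, expected, chosen))
    return failures


def fake_select_tool(prompt: str) -> str:
    """Stand-in for a real agent call; always picks the riskier invoice flow."""
    return "finalize-and-send-invoice-in-acme"


for prompt, expected, chosen in find_misrouted_prompts(CASES, fake_select_tool):
    print(f"{prompt!r}: expected {expected}, got {chosen}")
```

Because model output can vary between runs, it is reasonable to run each case several times and track the failure rate, not a single pass or fail.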

AX is engineering's responsibility

It can be easy to frame integration flow metadata as a documentation matter, something the tech writer can handle after the flow is built. But that's looking at things from the old perspective, when humans were the only ones who would see those values.

Integration flow metadata is an engineering concern because it directly determines the correctness of agent behavior. Shipping an agentic flow without a proper invocation schema is the equivalent of shipping a public API with no docs. Sure, it's technically functional, but operationally unreliable.

Your customers are deploying AI agents that will interact with the integrations you've built. Those agents will read the metadata you wrote and make consequential decisions based on it.

Check out our free trial to see exactly how Prismatic’s platform interacts with AI agents for the best possible data outcomes.