Connectors can reduce repetitive development work and shift API maintenance to the platform. But Prismatic recently open-sourced its entire connector library because connectors aren't the key. They're table stakes. The proper question for any integration platform is what the platform provides in addition to the connectors.
B2B SaaS companies know they need integrations. The question you may have is how to build them without turning your engineering team into a perpetual maintenance department.
When integration demand first hits, the solution seems obvious: assign a developer, read the API docs, implement the OAuth flow, ship the integration. For integration one, this works fine. By integration fifteen or twenty, engineers are spending more time on infrastructure that doesn't differentiate your product than on the things that do.
Built-in API connectors can help with that. But before getting into what they do, let's consider what they don't do – because integration platform vendors have spent years overstating their importance, and that confusion leads buyers to evaluate platforms with the wrong focus.
Connectors are a commodity
In June 2026, Prismatic open-sourced its entire connector library under the Apache-2.0 license. Not because the connectors were holding customers back. Because we don't believe they're what makes a platform worth buying.
Many embedded iPaaS vendors put connector counts on their homepages (1,200+ connectors!) and treat a large library as a moat. It isn't. Point a capable AI model at an API's documentation and a clear SDK, and it will scaffold a working connector with auth, actions, and triggers in minutes. The marginal cost of "one more connector" is approaching zero. As a result, a giant pre-built library isn't a differentiator. It's now table stakes that anyone, including you, can meet on demand.
Raw count is a vanity metric. A library of 1,200 connectors (where the three your customers need are shallow or stale) is worse than a library of 200, where the relevant ones provide deep functionality and are actively maintained. Openness is a related red herring: "open source" sounds like a trust signal, but it's only meaningful if the connector layer was valuable IP in the first place. We opened ours up because we're confident that our value as a platform doesn't sit with our connectors.
So, when you're evaluating an embedded iPaaS, connector count and openness badges are the wrong lens. They tell you almost nothing about whether the platform will scale your integration program.
Here's what built-in connectors actually do, and what to look at instead.
What connectors do for integrations
A built-in connector (aka component) is a pre-built module that handles all communication with a specific third-party application's API, including authentication, triggers, and data sources. Prismatic currently has 190+ of them covering the apps your customers actually run: CRMs, ERPs, marketing platforms, cloud storage, databases, and more.
The clearest benefit is speed. Without connectors, every integration project starts by covering the same development ground. Built-in connectors eliminate that repetitive work and let your team start with the business logic specific to your product. Depending on the integration, a connector can account for 30% to 90% of the total implementation work.
Connectors can also shift the maintenance burden. Third-party vendors regularly deprecate endpoints, change rate limits, and ship breaking changes (often without warning). When you use our built-in connectors, Prismatic updates the related connector when a third-party API changes, and your integrations inherit the fix.
And connectors enable non-engineers to participate. When every integration requires raw code, engineering becomes a bottleneck. Built-in connectors are visual building blocks that onboarding, customer success, and customers themselves (via a self-serve marketplace) can use without writing code. Prismatic's code-native path lets devs write TypeScript in their own IDEs with full access to AI tooling, version control, and CI/CD, drawing from the same connector library as the low-code designer. Non-devs handle what they can; engineers go deeper as business logic requires.
These benefits are real. But they're execution details, and not why an embedded integration platform exists.
The important part is what happens in addition to the connector
Building a connector (or generating one with AI) is the easy part. The hard part is running thousands of customer-specific integrations in production, for years, without overwhelming your engineering team. That's the capability you need your embedded iPaaS to provide.
- Deploying at scale. One integration needs to become hundreds of customer-specific instances, each with its own credentials, field mappings, and run schedules, without having to recreate each one by hand. "Build once, deploy to every customer" is what makes your integration catalog an asset rather than a maintenance backlog.
- Self-serving via an embedded marketplace. A native, self-serve catalog that your customers use inside your product means customers can activate integrations without a support ticket and go live on day one, without involving your team.
- Logging, monitoring, and alerting. When a customer's integration breaks, can you identify the cause and fix it before they notice? This is where embedded integration programs succeed or fail. It's also where the platform quality shows up after you've shipped.
- Versioning and change management. Shipping an update to an integration without causing issues for customers running the previous version is a harder infrastructure problem than it sounds.
- Two build paths, both accelerated by AI. We have both a low-code designer that non-devs can use and a code-native path for engineers who need the full toolkit. Most platforms support one well and bolt on the other. Prismatic's AI tooling (Prismatic Skills) can generate custom components and integrations that match the structure of existing built-in connectors by referencing production code rather than inferring patterns from documentation.
None of that fits in a connector-count headline. All of it determines whether your integrations will continue to scale successfully or whether you'll hit a wall.
When no pre-built connector covers your use case
No integration catalog covers every application. Niche vertical software, proprietary internal tools, and legacy systems will always leave gaps.
Prismatic's custom component SDK lets your team build custom connectors for any API using the same tooling that powers the built-in library. They behave identically to built-in connectors. They are reusable, composable, accessible to both devs and non-devs, and private to your organization. And because our connector library is now open source, a custom connector built for a niche app can reference Prismatic's production implementations.
A connector built for a specialized app your competitors don't support becomes a real differentiator – not because the connector itself is hard to build, but because having it enables the rest of the platform to do its job.
The right question for an integration platform evaluation
Connectors reduce boilerplate, shift API maintenance to the platform, and create reusable building blocks. That's value, and we don't want to ignore it. But it's the value that lets the harder, more important work happen on the platform.
The question for any embedded iPaaS evaluation isn't "How many connectors do you have?" or "Are they open source?"
Instead, the question is "What happens next (after the connector exists)?"
How are integrations deployed across hundreds of customers, how are they monitored in production, how do non-devs participate without creating an engineering bottleneck, and how does the whole program scale without requiring a growing team of devs to maintain what's already been built?
Connectors help you get into the game. The platform is why you stay.
Ready to see what Prismatic provides beyond the connector? Try out our free trial or get a demo to see the platform in action.
Researchers at Sierra found that agents passing a task once often can't pass it eight times in a row (Yao et al., 2024). They defined pass^k, the probability that an agent succeeds on all k independent attempts at the same task, as a measure of the gap between "worked in a single demo" and "works in general".
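To make the metric concrete, here is a minimal sketch of estimating pass^k from repeated trials. It assumes each task was run n times with c successes and averages the combinatorial estimator C(c, k) / C(n, k) across tasks. The function name, task ids, and trial outcomes are made up for illustration.

```python
from math import comb


def pass_hat_k(results: dict[str, list[bool]], k: int) -> float:
    """Estimate pass^k: the chance that k independent trials of a task all succeed.

    `results` maps a task id to the outcomes of its n trials. For each task,
    comb(c, k) / comb(n, k) is the fraction of k-sized subsets of its trials
    that contain only successes; the estimate is the mean over tasks.
    """
    per_task = []
    for task_id, outcomes in results.items():
        n, c = len(outcomes), sum(outcomes)
        if k > n:
            raise ValueError(f"task {task_id!r} has only {n} trials, need at least {k}")
        per_task.append(comb(c, k) / comb(n, k))
    return sum(per_task) / len(per_task)


# Hypothetical data: two tasks, eight trials each.
trials = {
    "reschedule-meeting": [True] * 8,
    "refund-order": [True, True, False, True, True, True, False, True],
}
print(pass_hat_k(trials, 1))  # single-attempt success rate: 0.875
print(pass_hat_k(trials, 8))  # every one of eight attempts succeeds: 0.5
```

In this made-up data, the second task succeeds on six of eight attempts, which looks fine in a single demo but contributes nothing once all eight attempts must pass.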
So what is the problem? It's the ever-present, classic lack of alignment among the humans. Every stakeholder carries a different mental model of what "good" looks like and what the agent should do, from the macro decisions down to the minutiae of execution.
Sales needs to demo the vision. Customer success needs it to nail each customer's use case. Marketing needs it to wow in a webinar.
When the agent behaves unexpectedly, each group interprets that through its own lens – and reaches for "make it more deterministic" as a proxy for "make it do what I expect." The question feels technical, but it's disguising a people problem.
So who's right? As it turns out, everyone is – at least partially.
Consensus
Each person has a unique perspective on the requirements. But given that generative AI is non-deterministic in practice, how do we achieve alignment within the team on what the AI should be doing?
We've always had the answer to this for conventional software: codify requirements into automated tests and tooling, down to stylistic choices in committed linter configurations. Those certainly behave deterministically, but more importantly, they document the expected use cases objectively and encode the behavior the team has agreed on.
However, simply applying traditional unit testing approaches to a non-deterministic tool raises many challenges. What would you assert? Snapshots of generated code or the chat transcript? Would you mock tool calls and assert that those were called with the expected arguments? These approaches flounder because they try to encode not only what "good" looks like but also how "good" was achieved, at a level of precision that fails far more often than it succeeds.
Jamming determinism into an agent or skill can lead to numerous anti-patterns:
- Setting the temperature to 0 in a vain attempt to make the stochastic parrot say the same thing each time
- Enabling retries of tests in the suite
- Writing precise and ever-expanding prose that forces the AI into a single golden happy path, leaving spontaneity behind
- Continually adding more subagents (SDK or prose) to pull more responsibilities from AI into deterministic code
But all this is like attempting to hold sand with a clenched fist – you end up squeezing the magic and capabilities out of the system. The result is lots of green-checked tests but a much less capable product.
Objectivity over determinism
So, what do you do? I'd suggest starting with an evaluation framework. Eval frameworks are built with AI's non-determinism in mind. They let you codify specific traits of generated solutions rather than asserting against the full final result. The line between evals and unit tests that call an AI SDK or wrap CLI calls can be a bit blurry, since some conventional tests and assertions still have merit.
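As a rough sketch of what codifying traits can look like, the snippet below runs a stand-in agent several times and reports how often each trait holds, instead of comparing against one golden output. Everything here is hypothetical: `fake_agent`, the trait names, and the file layout are placeholders, and a real eval framework would supply its own runner and reporting.

```python
import random
from typing import Callable

# A trait is a named, objective property every acceptable solution should have.
Trait = Callable[[dict], bool]


def fake_agent(prompt: str, rng: random.Random) -> dict:
    """Stand-in for a real agent call; returns a generated project as path -> contents."""
    files = {"package.json": '{"name": "demo"}', "index.js": "console.log('hi')"}
    if rng.random() < 0.3:  # the output varies from run to run
        files["README.md"] = "# demo"
    return files


TRAITS: dict[str, Trait] = {
    "has_package_json": lambda files: "package.json" in files,
    "has_entry_point": lambda files: "index.js" in files,
    "has_readme": lambda files: "README.md" in files,
}


def run_eval(prompt: str, trials: int, seed: int = 0) -> dict[str, float]:
    """Run the agent `trials` times and report the pass rate of each trait."""
    rng = random.Random(seed)
    passes = {name: 0 for name in TRAITS}
    for _ in range(trials):
        output = fake_agent(prompt, rng)
        for name, check in TRAITS.items():
            passes[name] += check(output)
    return {name: count / trials for name, count in passes.items()}


rates = run_eval("Scaffold a Node project", trials=20)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

Reporting a rate per trait, rather than a single pass or fail, lets the team agree on a threshold for each trait separately.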
For example, if your skill or agent is generating a Node project, you can at least expect the project to run. Put another way, an expected trait of every produced solution is that the Node interpreter can run it. You can check this deterministically with a traditional automated testing approach, and you can use the same kind of deterministic assertion inside evals.
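A minimal sketch of such a deterministic check follows. The helper name `runs_cleanly` is an assumption, not part of any eval framework. It only verifies that a command exits with status 0 inside the generated project's directory.

```python
import subprocess
import sys
from pathlib import Path


def runs_cleanly(command: list[str], cwd: Path, timeout_s: float = 30.0) -> bool:
    """Return True if `command` exits with status 0 inside `cwd` before the timeout."""
    try:
        completed = subprocess.run(
            command, cwd=cwd, capture_output=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # A hung process or a missing executable both count as "does not run".
        return False
    return completed.returncode == 0


# For a generated Node project the call would look like
#   runs_cleanly(["node", "index.js"], project_dir)
# The demo below uses the current Python interpreter so it runs anywhere.
print(runs_cleanly([sys.executable, "-c", "print('ok')"], Path.cwd()))
print(runs_cleanly([sys.executable, "-c", "raise SystemExit(1)"], Path.cwd()))
```

The timeout matters for generated code: a project that starts a server and never exits should fail the check rather than hang the eval run.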
But having the project run isn't sufficient for validating what the user asked the AI for in the first place. Does it solve the problem? You'll likely be able to run snapshot tests, but they'll fall flat because of the myriad valid (and invalid) solutions to the user's request.
This is where LLM grading comes into play. It lets you validate the looser characteristics of a solution. For example, if your agent helps schedule meetings across time zones, you'd want to ensure that the proposed times are as reasonable as possible in every zone. Another LLM call acts as a judge of the outcome: how well did the agent handle complex requirements whose expected outcome was loosely specified?
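The sketch below shows one way an LLM-as-judge check could be wired up. It assumes the judge is any function that takes a prompt and returns the model's text reply. The rubric wording, the 1 to 5 scale, the threshold, and `stub_judge` are all illustrative choices, and the stub returns a canned reply so the example runs offline.

```python
import json
from typing import Callable

# Any function that sends a prompt to a model and returns its text reply.
Judge = Callable[[str], str]


def build_prompt(request: str, proposal: str) -> str:
    """Assemble the grading rubric plus the case under review."""
    return (
        "You are grading a meeting time proposed for attendees in several time zones.\n"
        "Score 1 to 5 for how reasonable the local time is for every attendee "
        "(5 means everyone is inside normal working hours).\n"
        'Reply with JSON only, for example {"score": 4, "reason": "..."}\n\n'
        f"Request: {request}\n"
        f"Proposal: {proposal}"
    )


def grade(request: str, proposal: str, judge: Judge, threshold: int = 4) -> tuple[bool, str]:
    """Ask the judge for a score and pass the case if it meets the threshold."""
    reply = judge(build_prompt(request, proposal))
    try:
        verdict = json.loads(reply)
        score = int(verdict["score"])
    except (ValueError, KeyError, TypeError):
        # A reply that cannot be parsed is treated as a failed grade.
        return False, f"unparseable judge reply: {reply!r}"
    return score >= threshold, str(verdict.get("reason", ""))


def stub_judge(prompt: str) -> str:
    """Canned reply so the sketch runs offline; swap in a real model call."""
    return '{"score": 4, "reason": "Late afternoon in Berlin, morning in Chicago."}'


passed, reason = grade(
    "Find a slot for Berlin and Chicago",
    "Tuesday 16:00 CET / 09:00 CST",
    stub_judge,
)
print(passed, reason)
```

Keeping the rubric in the repo next to the test means a disagreement about what "reasonable" means becomes a reviewable change rather than a hallway debate.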
These LLM judgments can also be canonicalized into more deterministic assertions (a form of promotion, if you will). For example, if the judge is grading solutions for excessive indirection, a deterministic test could ensure that the final solution has five or fewer files.
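A promoted check of that kind might look like the sketch below. The five-file budget comes from the example in the text. The helper names and the decision to skip `node_modules` are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

MAX_FILES = 5  # the limit from the example above; tune it to your own consensus


def count_files(project_dir: Path) -> int:
    """Count regular files under the project, skipping installed dependencies."""
    return sum(
        1
        for path in project_dir.rglob("*")
        if path.is_file()
        and "node_modules" not in path.relative_to(project_dir).parts
    )


def within_file_budget(project_dir: Path, limit: int = MAX_FILES) -> bool:
    """Deterministic stand-in for the judge's 'too much indirection' verdict."""
    return count_files(project_dir) <= limit


# Demo against a throwaway directory standing in for generated output.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["package.json", "index.js", "src/handler.js"]:
        target = root / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("// generated\n")
    print(count_files(root), within_file_budget(root))  # 3 True
```

A file count is a crude proxy for indirection, so it makes sense to keep the LLM judge alongside it until the team trusts the cheaper check.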
You should absolutely allow a fair amount of push-and-pull between the layers as you gain confidence in what the skill or agent should be doing and how it should behave.
Reaping the benefits
Evals yield the same benefits as more traditional testing frameworks: tests are committed to the repo, new tests are shipped with subsequent changes, and older tests validate that prior cases continue to work satisfactorily. Even more beautifully, this aids in dealing with the unending stream of changes you do not control:
- New models, even just versions within the same model family
- Agent updates, particularly when you're shipping Skills
- User expectations, both in novel use cases and expanded expectations of existing use cases
Most importantly, it eliminates the very slow, very manual feedback loop of many individuals running small-sample tests and trying to draw broad conclusions. Instead, we can capture the expectations, even those not yet handled by the agent or skill, and codify them into the repo for future work. I'd absolutely recommend not being afraid to add known-failing tests to drive towards objectivity and consensus before diving into resolutions!
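One lightweight way to commit a known-failing expectation is shown below, using Python's built-in `unittest.expectedFailure` marker. The `propose_slot` function and the daylight-saving scenario are hypothetical stand-ins for an agent and a gap the team has agreed to close later.

```python
import unittest


def propose_slot(request: str) -> str:
    """Stand-in for the agent; it does not handle daylight-saving time yet."""
    return "Tuesday 16:00 CET / 09:00 CST"


class SchedulingExpectations(unittest.TestCase):
    def test_names_both_time_zones(self):
        slot = propose_slot("Berlin and Chicago, next Tuesday")
        self.assertIn("CET", slot)
        self.assertIn("CST", slot)

    @unittest.expectedFailure
    def test_uses_daylight_time_in_summer(self):
        # Agreed expectation that is not met yet; recorded so the gap stays visible.
        slot = propose_slot("Berlin and Chicago, second week of July")
        self.assertIn("CEST", slot)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SchedulingExpectations)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(len(result.expectedFailures), result.wasSuccessful())
```

The suite stays green while the gap is documented. Once the agent starts handling the case, the runner reports an unexpected success, which is the cue to remove the marker.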
Evals for improvement
So you now have an agreed-upon rubric by which to evaluate your tool against usage. And you're now even able to track incremental improvement in the precise handling of those use cases.
That enables something fascinating. What if you could leverage the same coding agents you already use to drive improvements to your skill or agent without requiring a human to be within that hot loop? Doing this moves beyond LLM-as-grader into LLM-as-optimizer territory, involving techniques such as TextGrad (Yuksekgonul et al., 2024) and, more recently, GEPA (Agrawal et al., 2025). More to come on this!
At Prismatic, we've been exploring various techniques to measure and improve our own Skills, agents, and MCP flow server. And we're not done.
Building our Skills was definitely the easy part – that's table stakes at this point. Continually improving them without breaking them (from anyone's perspective) is the really hard part. But we are doing it. How? We build consensus, commit it in-repo, make it executable, and enjoy the many payoffs from leveraging that objectivity.
In part 2 of this post, we’ll dig into what should happen once you have scoring in place.




