Data sync integrations
Data sync is the most common integration type built on Prismatic. A sync integration has two phases: an initial backfill that loads existing records, and ongoing incremental updates that keep data current as things change.
Initial sync
Use the Instance Deploy trigger to run a backfill when a customer enables the integration. Because this trigger also fires when upgrading integration versions, your flow must be idempotent: re-running it should not create duplicate records.
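A common way to make a backfill idempotent is to write records keyed by the source system's stable identifier, so a re-run overwrites rather than duplicates. A minimal sketch (the record shape and in-memory destination are illustrative stand-ins for your destination API):

```typescript
interface SourceRecord {
  id: string; // stable ID from the source system
  name: string;
}

// Stand-in for the destination system, keyed by the stable source ID.
const destination = new Map<string, SourceRecord>();

function upsertRecord(record: SourceRecord): void {
  // Writing by stable key makes the operation idempotent:
  // a second run with the same records overwrites rather than duplicates.
  destination.set(record.id, record);
}

const batch: SourceRecord[] = [
  { id: "acct-1", name: "Acme" },
  { id: "acct-2", name: "Globex" },
];

batch.forEach(upsertRecord);
batch.forEach(upsertRecord); // simulate a version upgrade re-running the flow
console.log(destination.size); // still 2 - no duplicates
```

The same property holds whatever the destination is, as long as writes are upserts keyed by a stable identifier rather than blind inserts.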
For large datasets that exceed the 15-minute execution limit, use recursive flows to process data a few pages at a time. Each execution processes a handful of pages, then saves a cursor to Cross-Flow State so the next execution resumes where the previous one left off (this also keeps re-runs idempotent). Configure the initial sync flow to run only one execution at a time to avoid concurrently processing the same set of pages.
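The recursive pattern can be sketched as follows. The state map stands in for Cross-Flow State, `fetchPage` stands in for your paginated source API, and the `while` loop simulates the runner re-invoking the flow until the backfill completes:

```typescript
const PAGES_PER_EXECUTION = 3; // stay well under the execution time limit
const TOTAL_PAGES = 8;

const state = new Map<string, number>(); // stand-in for Cross-Flow State
let totalRecords = 0;

function fetchPage(page: number): string[] {
  // Stand-in for a paginated source API call.
  return [`record-${page}-a`, `record-${page}-b`];
}

function runExecution(): boolean {
  let cursor = state.get("backfillCursor") ?? 0;
  for (let i = 0; i < PAGES_PER_EXECUTION && cursor < TOTAL_PAGES; i++) {
    const records = fetchPage(cursor);
    totalRecords += records.length; // process records here
    cursor++;
    state.set("backfillCursor", cursor); // checkpoint after each page
  }
  return cursor < TOTAL_PAGES; // true = invoke this flow again
}

// Simulate the runner re-invoking the flow until the backfill is done.
let executions = 1;
while (runExecution()) executions++;
console.log(executions, totalRecords); // 3 executions cover all 16 records
```

Because the cursor is checkpointed after each page, a failed execution resumes from the last completed page rather than restarting the whole backfill.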
Incremental updates
Webhooks (preferred). Use webhook triggers for near-real-time updates. Register webhook subscriptions in lifecycle handlers so subscriptions are created and cleaned up automatically.
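The lifecycle pairing can be illustrated with a generic subscription API (the `SubscriptionApi` class and URL below are hypothetical stand-ins for your source system's webhook endpoints):

```typescript
// Stand-in for a source system's webhook subscription API.
class SubscriptionApi {
  private subs = new Map<string, string>();
  create(url: string): string {
    const id = `sub-${this.subs.size + 1}`;
    this.subs.set(id, url);
    return id;
  }
  delete(id: string): void {
    this.subs.delete(id);
  }
  count(): number {
    return this.subs.size;
  }
}

const api = new SubscriptionApi();

// Instance Deploy handler: register the flow's webhook URL.
const subId = api.create("https://hooks.example.com/flow-123");

// Instance Remove handler: clean up so the source stops sending events.
api.delete(subId);

console.log(api.count()); // 0 - no orphaned subscriptions
```

Pairing create and delete in the deploy/remove handlers ensures customers never accumulate orphaned subscriptions that keep firing at dead endpoints.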
Polling (fallback). If webhooks aren't available, use polling triggers. Persist a cursor (last processed timestamp or ID) in Flow State so each execution fetches only records that changed since the previous run.
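A cursor-based poll can be sketched like this, where the map stands in for Flow State and the array stands in for the source API's change feed:

```typescript
interface SyncRecord {
  id: string;
  updatedAt: number; // last-modified timestamp from the source
}

const source: SyncRecord[] = [
  { id: "a", updatedAt: 100 },
  { id: "b", updatedAt: 200 },
  { id: "c", updatedAt: 300 },
];

const flowState = new Map<string, number>(); // stand-in for Flow State

function poll(): SyncRecord[] {
  const cursor = flowState.get("lastUpdatedAt") ?? 0;
  // Fetch only records modified since the last successful poll.
  const changed = source.filter((r) => r.updatedAt > cursor);
  if (changed.length > 0) {
    // Advance the cursor only after the batch is processed.
    flowState.set("lastUpdatedAt", Math.max(...changed.map((r) => r.updatedAt)));
  }
  return changed;
}

console.log(poll().length); // 3 - first run fetches everything
source.push({ id: "d", updatedAt: 400 });
console.log(poll().length); // 1 - only the newly changed record
```

Advancing the cursor only after processing succeeds means a failed run re-fetches the same window on the next poll, trading a little duplicate work for no lost records (which is why idempotent writes matter downstream).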
Best practices
- Idempotency. Use stable record identifiers and track processed keys to avoid duplicates across retries or restarts.
- Checkpoints. Persist cursors in Flow State or Cross-Flow State so the sync resumes cleanly after a failure.
- Rate limiting. Use batching and concurrency controls to respect source and destination API quotas.
- Error handling. Use retries with backoff for transient errors. Route persistent failures to a dead-letter flow for investigation.
- Observability. Log records scanned, updated, skipped, and errored so sync behavior is visible and diagnosable.
- Handle large files. Large files can exhaust the runner's memory - use streaming strategies rather than loading entire files into memory.
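The error-handling bullet above can be sketched as a small retry helper with exponential backoff. The delays, attempt count, and flaky call below are illustrative; tune them to the API you are calling:

```typescript
// Retry a transient-failure-prone call with exponential backoff.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Out of attempts: surface the error (e.g. to a dead-letter flow).
      if (attempt >= attempts - 1) throw err;
      const delayMs = baseDelayMs * 2 ** attempt; // 100ms, 200ms, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Simulated flaky call that fails twice, then succeeds.
let calls = 0;
const flaky = async (): Promise<string> => {
  calls++;
  if (calls < 3) throw new Error("transient");
  return "ok";
};

withRetry(flaky).then((result) => console.log(result, calls)); // ok 3
```

For rate-limited APIs, honoring a `Retry-After` header (when the source provides one) is generally preferable to a fixed backoff schedule.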
Related documentation
- Management triggers - run flows on instance deploy / remove
- Recursive flows - process large datasets across multiple executions
- Persisting state - store cursors and checkpoints between executions
- Handling large files - strategies for file transfers
- Integration runner limits - memory, execution time, and payload constraints