Persisting State between Executions

Integrations often need to remember information between executions, such as tracking which records have been processed or maintaining synchronization state. Prismatic provides built-in state management capabilities to persist data across multiple execution runs.

Sometimes it's useful to save data from one execution of an instance so it can be used in a subsequent execution. For example, imagine you have an integration that pulls down and processes records from a third-party API. Your integration recently processed a record with ID 123, and the next time your integration runs, you want to ensure it processes ID 124 and above.

Prismatic provides components and programmatic access to persisted state, so you can save data in one execution and use it in the next. You can persist 123 using a Save Value action, and then the next time your integration runs, it can use Get Value to know that 123 was the most recently processed record.

Levels of persisted state

There are four levels of persisted data:

Execution state stores state for a single execution of a flow. Data stored are ephemeral and not persisted between executions.

Execution state is generally used as a temporary variable or as an accumulator. For example, if you are looping over an array of records and fetching data for each one, you could use the Execution - Append Value to List action to append each record to a list, which you could load up after your loop in its entirety.

Flow state (programmatically called instanceState for historical reasons) stores persisted data for a single flow. A flow can access its own state but not its sibling flows' states. Each flow has its own state, and two different flows can run concurrently without overwriting one another's flow state.

Flow state is useful if you have a scheduled process that checks for new records in a third-party app. You can use flow state to persist a cursor, so the next time your flow runs, it can pick up where the previous execution left off.

Cross-Flow state is shared between all flows within an instance. A flow can access its sibling flows' cross-flow state. If two flows run concurrently and both change state, the flow that finishes last overwrites the data that the first stored.

Integration state stores persisted data for all flows of all instances of an integration. All instances of an integration deployed to different customers share state.

Integration state can be useful if you're building an integration with an app that only allows you to specify a single inbound webhook URL for all of your customers. In that situation, you could generate a key-value store, matching customers' third-party external IDs to their instance's webhook URLs, allowing you to route requests that arrive at a shared endpoint to the proper instances.
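The routing table described above could be sketched roughly as follows. This is only an illustration: the plain object stands in for context.integrationState, and the key name webhookRoutes, the helper names registerRoute and resolveRoute, and the example URLs are all hypothetical.

```typescript
// Plain object standing in for context.integrationState (shape is an assumption).
type State = Record<string, unknown>;

const ROUTES_KEY = "webhookRoutes"; // hypothetical key name

// Read the routing table, defaulting to an empty table when the key is unset.
function readRoutes(state: State): Record<string, string> {
  return (state[ROUTES_KEY] as Record<string, string> | undefined) ?? {};
}

// Record which instance webhook URL belongs to a customer's external ID.
function registerRoute(state: State, externalId: string, webhookUrl: string): void {
  state[ROUTES_KEY] = { ...readRoutes(state), [externalId]: webhookUrl };
}

// Look up where a request arriving at the shared endpoint should be forwarded.
function resolveRoute(state: State, externalId: string): string | undefined {
  const routes = readRoutes(state);
  return Object.prototype.hasOwnProperty.call(routes, externalId)
    ? routes[externalId]
    : undefined;
}

const integrationState: State = {};
registerRoute(integrationState, "acme-001", "https://example.com/hooks/acme");
registerRoute(integrationState, "globex-002", "https://example.com/hooks/globex");
```

Because every instance of the integration shares this state, the table is subject to the same concurrent-write caveats as any other persisted state.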

How persisted state works in Prismatic

The persisted data lifecycle is straightforward:

When an execution begins, flow state, cross-flow state, and integration state are downloaded and parsed from JSON files. Execution state is initialized to an empty object {}.
Throughout your execution, you may create, update, or delete key-value pairs in one or more of the states. You can either use the Persist Data component or set values programmatically, for example context.crossFlowState["Last Product ID"] = "abc-123".
When the execution completes successfully, execution state disappears. Flow state, cross-flow state, and integration state are compared to their initial values. If their values changed, they are serialized to JSON and written to storage to be loaded in the next execution.

Levels of state are evaluated independently

Flow, cross-flow, and integration state are evaluated independently. If you change cross-flow state but not flow or integration state, only cross-flow state will be persisted at the end of the execution.
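The lifecycle can be modeled in a few lines. The sketch below is a simplified model for illustration, not Prismatic's implementation: the in-memory storage object, the runExecution helper, and the string comparison used to detect changes are all assumptions.

```typescript
type State = Record<string, unknown>;
type Level = "flow" | "crossFlow" | "integration";

// In-memory stand-in for wherever serialized state is stored (an assumption).
const storage: Record<Level, string> = {
  flow: "{}",
  crossFlow: '{"Last Product ID":"abc-122"}',
  integration: "{}",
};

// Model of one execution: load, run, then write back only the levels that changed.
function runExecution(
  step: (states: Record<Level, State>, executionState: State) => void,
): Level[] {
  const levels: Level[] = ["flow", "crossFlow", "integration"];
  const initial: Record<Level, string> = { ...storage };
  const states: Record<Level, State> = {
    flow: JSON.parse(initial.flow),
    crossFlow: JSON.parse(initial.crossFlow),
    integration: JSON.parse(initial.integration),
  };
  const executionState: State = {}; // always starts empty and is never stored

  step(states, executionState); // if this throws, nothing below runs

  const changed: Level[] = [];
  for (const level of levels) {
    const serialized = JSON.stringify(states[level]);
    if (serialized !== initial[level]) {
      storage[level] = serialized; // each level is compared and written on its own
      changed.push(level);
    }
  }
  return changed;
}

const writtenLevels = runExecution((states, executionState) => {
  executionState["scratch"] = [1, 2, 3];
  states.crossFlow["Last Product ID"] = "abc-123";
});
```

In this model only cross-flow state is written back, since flow and integration state were left untouched and execution state is discarded.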

Limitations of persisted state

It's important to know what persisted state is, and more importantly, what it is not. Persisted state is a useful tool to cache small key/value pairs between executions. It is not a database (and certainly not an ACID database).

Generally, either your app or the app you're integrating with should be considered the source of truth.

Concurrent execution limitations

Persisted state is loaded at the start of an execution and written at the end of a successful execution. State is written out in its entirety when it is changed.

Let's look at a few scenarios where you may run concurrent executions:

Suppose you want to keep track of a list of records to process. You have two flows that use cross-flow state (one flow adds items to cross-flow state, and one flow reads and removes items from state).

Suppose that both flows are invoked at the same time with an initial cross-flow state of ["a", "b"]. The first flow adds "c" to the list and finishes first. It writes out ["a", "b", "c"] to persisted state. The second flow reads and removes "a" and "b" from state and writes [] to persisted state.

In this case, the second flow would overwrite the first flow's state ([] would overwrite ["a", "b", "c"]), and item "c" would never be processed. Depending on which flow completes first, you may miss items or double-process items.

When processing items, if order is important, consider leveraging a FIFO queue to ensure that each item is processed exactly once. If order is not important, consider omitting persisted state and processing records in the same flow that you receive them.

Suppose you have a flow that is invoked via webhook and tracks orders that are processed as key-value pairs. Flow state might look like this:
```
{
  "id-abc-123": { "item": "Widgets", "qty": 5 },
  "id-def-456": { "item": "Gadgets", "qty": 10 }
}
```
If two invocations of the same flow occur at the same time, and each attempts to add a key-value pair to flow state, each execution will write out state with different key-value pairs. The execution that finishes last will overwrite (effectively removing) the key that the first execution wrote.

State is written in its entirety
Note that state is written in its entirety (rather than key by key). That means that for two concurrent executions of a flow, if one execution writes a value for instanceState["foo"], and then another execution that finishes later writes a value for instanceState["bar"], the change to "foo" will be overwritten.
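The overwrite can be reproduced with a small simulation. This is a sketch under stated assumptions: the persisted variable stands in for stored flow state, startExecution and finishExecution are hypothetical helpers, and the "id-ghi-789" order is made up for the example.

```typescript
type State = Record<string, unknown>;

// In-memory stand-in for persisted flow state (an assumption, not Prismatic's storage).
let persisted = JSON.stringify({ "id-abc-123": { item: "Widgets", qty: 5 } });

// Each execution loads a private snapshot when it starts...
function startExecution(): State {
  return JSON.parse(persisted);
}

// ...and writes the whole snapshot back when it finishes.
function finishExecution(snapshot: State): void {
  persisted = JSON.stringify(snapshot);
}

// Two executions start before either finishes, so both load the same data.
const first = startExecution();
const second = startExecution();

first["id-def-456"] = { item: "Gadgets", qty: 10 };
second["id-ghi-789"] = { item: "Gizmos", qty: 2 };

finishExecution(first); // storage now holds abc and def
finishExecution(second); // storage now holds abc and ghi; def is gone

const finalKeys = Object.keys(JSON.parse(persisted));
```

After both executions finish, finalKeys contains "id-abc-123" and "id-ghi-789" only. Neither execution did anything wrong on its own; the loss comes purely from the load-at-start, write-at-end timing.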

Generally, Prismatic should be used as the mechanism to move data between systems. The systems (your app and the app you're integrating with) should be the sources of truth where records are stored.

Suppose you have two flows, and one calls another via cross-flow trigger. The first flow writes state for the second flow to read.

This scenario doesn't work, since the second flow starts before the first completes. The second would load state that doesn't contain the first flow's persisted values.

When invoking sibling flows, consider sending the data to the sibling flow via POST request. The cross-flow trigger lets you specify data to send to the sibling flow.

Suppose you have a flow that processes records that are stored in a list. When your flow runs, it loads 50 records from persisted state. After processing 20 records and removing them from the persisted list, the API you're working with throws an error. This causes your flow to throw an error and stop.

In this scenario, the updated list of 30 remaining records would not be persisted, since your flow did not complete successfully. The next execution loads the original 50 records and attempts to process the first 20 a second time.

If processing and removing records is important, consider leveraging a FIFO queue. Alternatively, if you know an API is unreliable and may throw errors, you can configure step-level error handling to ignore errors from records that cannot be processed, or you can send bad records to a dead-letter queue that you can examine later.
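In a code step, one way to approximate the "ignore errors or set bad records aside" approach is to catch failures per record so that the execution still completes and its state changes are persisted. The sketch below assumes hypothetical state keys queue and deadLetters and a hypothetical processRecord handler; it is kept synchronous for brevity, whereas a real API call would normally be asynchronous.

```typescript
type State = Record<string, unknown>;

// Process queued records one at a time, catching failures so the execution can
// still complete and the emptied queue gets persisted.
function drainQueue(
  state: State,
  processRecord: (record: string) => void, // hypothetical per-record handler
): void {
  const queue = (state["queue"] as string[] | undefined) ?? [];
  const deadLetters = [...((state["deadLetters"] as string[] | undefined) ?? [])];

  for (const record of queue) {
    try {
      processRecord(record);
    } catch {
      // Set the bad record aside instead of letting the error stop the flow.
      deadLetters.push(record);
    }
  }

  state["queue"] = []; // every record was either processed or set aside
  state["deadLetters"] = deadLetters;
}

const flowState: State = { queue: ["a", "b", "c"] };
drainQueue(flowState, (record) => {
  if (record === "b") throw new Error("simulated API failure");
});
// flowState is now { queue: [], deadLetters: ["b"] }
```

A list kept in persisted state is still subject to the concurrency caveats above, so an external queue remains the safer choice when records must not be lost.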

Persisted state size limitations

Persisted state is ideal for storing small amounts of data in key-value storage between executions. When serialized to JSON, integration state, cross-flow state, and flow state combined should not exceed 64 MB.

If you attempt to store more than 64 MB of state, you will encounter an error stating Unable to complete execution, persisted state exceeded maximum limit of 67108864 bytes.
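If you are unsure how close you are to the limit, you can estimate the serialized size yourself before writing. This is a rough sketch: the helper names are hypothetical, and it assumes the limit is measured against the UTF-8 bytes of the JSON-serialized state, which may not match exactly how Prismatic measures it.

```typescript
const MAX_STATE_BYTES = 67108864; // limit quoted in the error message above

// Count the UTF-8 bytes a string occupies. Assumes well-formed UTF-16 input,
// which holds for JSON.stringify output in modern runtimes.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // a surrogate pair encodes one 4-byte code point
      i++; // skip the low surrogate
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Sum the serialized size of every state object passed in.
function estimateStateBytes(...states: Array<Record<string, unknown>>): number {
  return states.reduce(
    (total, state) => total + utf8ByteLength(JSON.stringify(state)),
    0,
  );
}

const estimate = estimateStateBytes(
  { cursor: 123 }, // flow state
  { "Last Product ID": "abc-123" }, // cross-flow state
  {}, // integration state
);
const withinLimit = estimate <= MAX_STATE_BYTES;
```

If the estimate approaches the limit, that is usually a sign the data belongs in one of the alternative stores described below.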

When should I use alternative data stores?

If you are attempting to persist large items (like PDFs or images), consider writing the files to a file storage system like Amazon S3 or Google Drive.

If you need to store thousands of key-value pairs, consider a purpose-built key-value store, like Firebase or Amazon DynamoDB.

If you need to process records that you receive in order, consider leveraging a FIFO queue.

The persist data component

Data can be persisted between runs using the Persist Data component. Data are stored in key-value pairs, and values can be strings, numbers, objects, or lists. You can use the Flow - actions to persist data scoped to the current flow, the Cross Flow - actions to persist data that can be shared between flows of an instance, or the Integration - actions to persist data between instances of the same integration (so multiple customers can share a data store).

You can store a key/value pair using the Save Value action, or you can use Persist Data's other actions to append to a persisted list. If you would like to save a timestamp instead, you can use the Save Current Time action to save the current time into a key of your choosing.

Later, in a subsequent run, you can fetch the value you saved using the Get Value action. If a key is not set, Get Value will return null.

You can remove data from an array or remove a key/value pair altogether using Persist Data's other actions.

Accessing persisted data in a code block or custom component

Persisted state is accessible through the context parameter, which can be referenced in custom components and code steps.

Reading persisted state programmatically

The context parameter contains execution state (executionState), flow state (instanceState for historical reasons), cross-flow state (crossFlowState), and integration state (integrationState).

Within a perform function or code step, you can access variables like this:

```
for (const item of context.crossFlowState["My Items"]) {
  // Process each item
}
```

For more information, see Execution, instance, and cross-flow state.

Writing persisted state programmatically

To set new values for persisted state keys, you can either return the new values in your return block or mutate context.* objects directly.

```
return {
  data: "Some Data",
  crossFlowState: { exampleKey: "example value", anotherKey: [1, 2, 3] },
};
```

or

```
context.crossFlowState["exampleKey"] = "example value";
context.crossFlowState["anotherKey"] = [1, 2, 3];
```

To delete a value from state, return a null value for the key you want removed:

```
return {
  data: "Some Data",
  crossFlowState: { exampleKey: null },
};
```
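Putting reading and writing together, the cursor pattern from the start of this page might look something like the sketch below. The FlowContext interface is a minimal stand-in for the real context parameter, and the lastProcessedId key, the SourceRecord type, and the handle callback are hypothetical names chosen for the example.

```typescript
// Minimal stand-in for the part of the context parameter used here (an
// assumption; the real context object has more fields).
interface FlowContext {
  instanceState: Record<string, unknown>;
}

interface SourceRecord {
  id: number;
}

// Process only records newer than the stored cursor, then advance the cursor.
function processNewRecords(
  context: FlowContext,
  records: SourceRecord[],
  handle: (record: SourceRecord) => void, // hypothetical per-record handler
): number {
  const stored = context.instanceState["lastProcessedId"]; // hypothetical key
  let cursor = typeof stored === "number" ? stored : 0; // key is unset on the first run

  const pending = records
    .filter((record) => record.id > cursor)
    .sort((a, b) => a.id - b.id);

  for (const record of pending) {
    handle(record);
    cursor = record.id;
  }

  // Mutating flow state directly; it is written out if the execution succeeds.
  context.instanceState["lastProcessedId"] = cursor;
  return cursor;
}

const fakeContext: FlowContext = { instanceState: { lastProcessedId: 123 } };
const handled: number[] = [];
processNewRecords(fakeContext, [{ id: 122 }, { id: 125 }, { id: 124 }], (record) => {
  handled.push(record.id);
});
// handled is [124, 125] and fakeContext.instanceState.lastProcessedId is 125
```

Checking the type of the stored value before using it keeps the first run, where the key has never been set, from failing.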

Persisted data in code-native integrations

Persisted data can be accessed in code-native integrations using the same context parameter, just as you do for custom connectors. See Code-Native Flows.