Replaying Failed Executions

Errors are almost inevitable in software integrations.

A third-party API might be unavailable when your instance tries to access it. That might be remedied through instance retry, but retry doesn't handle the scenario that the API is down for hours or days at a time.
A third-party might start sending you data in a new format that you didn't expect.
Your instance may encounter some other edge-case that it doesn't handle gracefully.

Whatever the reason for the error, it's handy to be able to re-run your instance with the exact same incoming data when the third-party API is back up and running, or after you've fixed your integration to better handle the data coming in.

Replay allows you to run a previous execution's data through your instance again, and it's easy to do through Prismatic's GraphQL API.

Querying for failed executions

First, let's query an instance for failed executions. You can find an instance's ID in your browser's URL bar when you view the instance, or you can query for it via the API.

We'll search only for executions that have error_Isnull: false (in other words, it had an error). We also only want original executions (and not replays), so we'll include replayForExecution_Isnull: true. Finally, let's fetch replays for those original executions that have since succeeded with replays(error_Isnull: true):

If we have multiple pages of executions (more than 100), we can use the endCursor we get back as the startCursor to get the next page of results.

The GraphQL API will respond with any failed executions for that instance along with any replays that have succeeded since the original execution failed:

{
  "data": {
    "instance": {
      "executionResults": {
        "nodes": [
          {
            "id": "SW5zdGFuY2VFeGVjdXRpb25SZXN1bHQ6NjBkZDliOWMtOGIyOS00NDQyLWFkNDctMjZkZTg5Y2NlNWM5",
            "startedAt": "2023-07-26T17:18:15.886806+00:00",
            "replays": {
              "nodes": []
            },
            "error": "Unable to connect to API"
          },
          {
            "id": "SW5zdGFuY2VFeGVjdXRpb25SZXN1bHQ6MmYxNTcxZTktNDVmOS00Mzc2LTg2OGUtMTJkNjZkNDhiNzRl",
            "startedAt": "2023-07-26T17:18:13.800335+00:00",
            "replays": {
              "nodes": [
                {
                  "id": "SW5zdGFuY2VFeGVjdXRpb25SZXN1bHQ6N2U1MDBkZTgtY2ZmYS00NWY5LWI0OGYtNGU1YjU2YWMzMzFh",
                  "startedAt": "2023-07-26T17:28:10.003443+00:00"
                }
              ]
            },
            "error": "Unable to connect to API"
          }
        ]
      }
    }
  }
}

Replaying failed executions programmatically

Now, with the IDs of the executions that failed in hand, we can issue a replayExecution mutation for each one that does not have a replay that succeeded:

That mutation will return the ID of the new execution that occurs. You can the query the API with that new execution to verify that it runs successfully, or do further debugging if it does not.

For an example script that wraps the above GraphQL calls, see our examples GitHub repo.

For more information on the Prismatic API, see the API docs.

Querying for failed executions​

Replaying failed executions programmatically​

Querying for failed executions

Replaying failed executions programmatically