Monitoring and Alerting with Prismatic

This post refers to an earlier version of Prismatic. Consult our latest docs on monitoring and alerting or contact us if you'd like help with any of the topics addressed in this post.

Do you know what's worse than a PagerDuty alert that comes in at 02:00 AM? Fielding a call from an irate customer at 03:05 AM who is yelling that their integration has been down for over an hour. When I was a DevOps engineer, I was on the receiving end of those calls, and always pined for a system that proactively – or at least quickly reactively – alerted me to problems.

Integrations sometimes break. It's often not your fault – a third party system or API that your integration relies on may be down – but regardless of what is causing problems, it's important that your teams be alerted to problems and be able to diagnose and resolve issues quickly.

In addition to alerting when things go wrong, you may also want to be alerted when things go right. Your project management team might want to be alerted when a new integration instance is enabled for a customer, or your customer may want to be alerted when an instance runs successfully.

It's understandable why many B2B software teams' integration alerting processes leave a lot to be desired. Reinventing the monitoring/alerting wheel for each new customer integration is time-consuming and prone to errors, and it's nearly impossible to set aside the big chunk of time it would take to build a good integration monitoring and alerting framework – especially one that ties in with the alerting systems and processes you already use for your core product.

Prismatic's built-in monitoring and alerting capabilities make it easy to keep tabs on all of your customers' integrations and alert your teams about whatever is important to you: if an instance is enabled for the first time, fails to run, begins to take too long, is unexpectedly disabled, and so on. Alerts can be configured for any team – dev, DevOps, support, onboarding, whatever makes sense in your organization – and can be sent via text, email, or to your existing incident management or messaging systems.

In this post I want to outline how Prismatic enables you to be notified of failures or other noteworthy things going on with your customers' integrations. We'll look at a few use cases for alert monitors, then create an alert monitor that notifies the proper groups when an instance fails to run properly. After that, we'll look at how you can leverage webhooks to post alerts to systems like PagerDuty and Slack.

Adding an alert monitor to an instance

Let's walk through adding an alert monitor to an integration instance. In this example, we'll alert our DevOps team when an instance that's deployed to one of our customers fails to run to completion, or when it takes more than five minutes to run. (It could just as well be the support team or whatever group makes sense in your organization!) Let's suppose this is a mission-critical integration that our customer cares a lot about, so we'll configure it to notify the customer too.

First, we'll create an alert group. An alert group is a set of users to notify (by email or SMS – their choice) when an instance does something noteworthy or unexpected. Our DevOps team currently consists of three people – Samantha, Ed, and Kristin – so we'll create an alert group that includes them:

Screenshot of DevOps Alert Group in Prismatic

Alert groups are reusable for multiple instances, and can be modified later. This is handy, since we can add and remove people from our DevOps alert group in one place as our team changes without having to modify individual monitors.

Next, we'll create an alert monitor for an instance that we deployed to one of our customers. An alert monitor is attached to an instance and defines who should be notified, and when.

First, when. Under the Triggers card of our alert monitor we'll list the alert triggers that will cause this alert monitor to fire. In this case, our alert monitor will fire when an instance fails to run, or takes more than 300 seconds to run:

Screenshot of Monitors and Triggers in Prismatic
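
To make the "when" concrete, here is a minimal Python sketch of the two conditions configured above. It is only an illustration of the logic, not Prismatic's implementation: the `Execution` record, the trigger names, and the `fired_triggers` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Threshold matching the 300-second trigger configured above.
MAX_RUNTIME_SECONDS = 300


@dataclass
class Execution:
    """Illustrative record of one instance run (not a Prismatic type)."""
    succeeded: bool
    duration_seconds: float


def fired_triggers(execution: Execution,
                   max_runtime: float = MAX_RUNTIME_SECONDS) -> List[str]:
    """Return the (hypothetical) names of the triggers this run would set off."""
    fired = []
    if not execution.succeeded:
        fired.append("execution_failed")
    if execution.duration_seconds > max_runtime:
        fired.append("execution_too_long")
    return fired
```

A run can set off both triggers at once, for example when an instance runs for seven minutes and then fails; the monitor fires if the list is non-empty.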

Now, who. In addition to notifying the DevOps alert group we created, we'll also alert our customer user, Brenda, by adding her user under the alert monitor's Notifications card.

Types of alert triggers

An alert monitor can be triggered by a variety of events, both good and bad. We can configure an alert monitor to be triggered:

  • When an instance is enabled, disabled, or removed.
  • When an instance starts, runs successfully, or fails to run.
  • When an instance takes too long to run. For example, if we expect an instance to take a minute to run and it takes 10, we'll want to know.
  • When an instance hasn't run in a while. We might expect our instance to run every 30 minutes, but want to know if it hasn't run in over an hour.
  • When logs of our instance include warn, error, or fatal lines.

We can add one or many triggers to an alert monitor. Our support team, for example, might want to be alerted if an instance is enabled or disabled so they can communicate proactively with customers. Both triggers can be added to their alert monitor.
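
Two of the triggers listed above, "hasn't run in a while" and "logs include warn, error, or fatal lines", boil down to simple checks. The Python sketch below shows roughly what each one tests; the function names and the one-hour default are assumptions made for illustration, not Prismatic's API.

```python
from datetime import datetime, timedelta
from typing import Iterable

# Log levels that the trigger list above treats as alert-worthy.
ALERTING_LEVELS = {"warn", "error", "fatal"}


def is_overdue(last_run: datetime, now: datetime,
               max_gap: timedelta = timedelta(hours=1)) -> bool:
    """True if more than max_gap has passed since the instance last ran."""
    return now - last_run > max_gap


def has_alerting_log_line(levels: Iterable[str]) -> bool:
    """True if any log level in the run is warn, error, or fatal."""
    return any(level.lower() in ALERTING_LEVELS for level in levels)
```

For an instance scheduled every 30 minutes, a one-hour gap leaves room for one missed run before anyone is notified.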

Responding to trigger events

When an alert monitor has been triggered, team members in the alert group receive an email or SMS message with a link to the alert monitor. Our team can follow that link to quickly get to a screen where we can see relevant logs:

Screenshot of trigger alerts in Prismatic

From the triggered alert monitor, we can even "clear" the event, signaling to the rest of our team that the issue has been acknowledged.

Sending alerts to your existing systems

Many dev, DevOps, and support teams employ incident management systems like PagerDuty, or notify their teams of issues and outages via Slack. Since it's so important for your integration workflows to fit smoothly into your existing tools, we made it easy to send Prismatic alerts to PagerDuty or Slack using webhooks. Similar approaches can be used to integrate Prismatic's alerting functionality with other incident management or messaging systems.

We can add alert webhooks within the Settings screen, under the Alert Webhooks tab. For PagerDuty, we can configure Prismatic to send a POST request to their incident event API endpoint, with a payload that looks like this:

  {
    "routing_key": "d4685282example0aa5464e8fexample",
    "event_action": "trigger",
    "links": [{ "href": "$URL", "text": "Link to Prismatic alert monitor" }],
    "payload": {
      "summary": "$NAME triggered - $INSTANCE failed to run.",
      "severity": "error",
      "source": "$SUBJECT"
    }
  }

The message sent to PagerDuty, then, will be filled in with the $NAME of the alert monitor, the name of the $INSTANCE, and a $URL where responders can go to get more information about the incident. Additional fields can be sent to PagerDuty, and are outlined in their docs.
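
To preview what a filled-in payload might look like, the substitution can be approximated locally. The sketch below uses Python's `string.Template`, which happens to share the `$NAME` placeholder style; how Prismatic actually performs the substitution is not covered in this post, and the sample values are made up.

```python
import json
from string import Template

# Alert webhook payload template, copied from the example above.
PAGERDUTY_TEMPLATE = {
    "routing_key": "d4685282example0aa5464e8fexample",
    "event_action": "trigger",
    "links": [{"href": "$URL", "text": "Link to Prismatic alert monitor"}],
    "payload": {
        "summary": "$NAME triggered - $INSTANCE failed to run.",
        "severity": "error",
        "source": "$SUBJECT",
    },
}


def render(node, variables):
    """Recursively fill $PLACEHOLDER variables in every string of the payload."""
    if isinstance(node, str):
        # safe_substitute leaves unknown placeholders untouched.
        return Template(node).safe_substitute(variables)
    if isinstance(node, dict):
        return {key: render(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [render(item, variables) for item in node]
    return node


# Made-up sample values, for illustration only.
SAMPLE_VARIABLES = {
    "NAME": "DevOps monitor",
    "INSTANCE": "Acme inventory sync",
    "URL": "https://example.com/alerts/1",
    "SUBJECT": "Acme inventory sync alert",
}

rendered = render(PAGERDUTY_TEMPLATE, SAMPLE_VARIABLES)
print(json.dumps(rendered, indent=2))
```

Substituting into each string after the JSON is parsed, rather than into the raw JSON text, keeps the payload valid even if an instance name contains quotes.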

Let's attach our alert webhook to our alert monitor within the alert monitor's Webhooks card:

Screenshot of webhook and alert monitor in Prismatic

Now, the next time our alert monitor fires we will see an incident within PagerDuty:

Screenshot of PagerDuty alert triggered

A Slack integration works in a very similar way. We can create an incoming Slack webhook, and use that Slack webhook URL to post a message to a Slack channel from a Prismatic alert monitor. The payload that we send to Slack can be a simple line of text:

Screenshot of Post to Slack webhook for alerts in Prismatic
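
For reference, a Slack incoming webhook accepts a JSON body with a `text` field. The Python sketch below builds, but does not send, the kind of POST request that would deliver such a message; the webhook URL is a placeholder and the helper name is hypothetical.

```python
import json
import urllib.request

# Placeholder; a real URL comes from Slack when the incoming webhook is created.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (without sending) a POST request carrying a one-line Slack message."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_slack_request(
    SLACK_WEBHOOK_URL,
    "DevOps monitor triggered - Acme inventory sync failed to run.",
)
# Passing `request` to urllib.request.urlopen() would post the message.
```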

When our alert monitor fires, a message will pop up in Slack and our team will be able to respond to the incident:

Screenshot of Slack of DevOps Alert from Prismatic

With just a couple minutes' work, our alert monitors can begin shuttling messages about incidents to our existing incident management systems. Modern APIs are pretty great, huh?

More about alerting

We covered a good deal in this post, from adding monitoring to instances to shipping Prismatic alerts to your existing incident management tools. For more information on monitoring and alerting in Prismatic, check out our docs.

About Prismatic

Prismatic is the integration platform for B2B software companies. It's the quickest way to build integrations to the other apps your customers use and to add a native integration marketplace to your product. A complete embedded iPaaS solution that empowers your whole organization, Prismatic encompasses an intuitive integration designer, embedded integration marketplace, integration deployment and support, and a purpose-built cloud infrastructure. Prismatic was built in a way developers love and provides the tools to make it perfectly fit the way you build software.
