In this tutorial we will build an integration that downloads and processes files stored in Google Cloud Storage.
For this integration, assume that some third-party service writes a timestamped file to a Google Cloud Platform (GCP) storage bucket whenever a user authenticates against one of their services. Our integration will examine the files written to the bucket, and will announce via Slack who logged in, and when.
Our integration will take advantage of the loop component to process files one by one.
We'll configure our integration to run every five minutes, and our integration will do the following:
- Look for files in the unprocessed/ directory of our GCP Storage bucket
- Loop over each file that we found:
  - Download the file
  - Deserialize the JSON contained in the file
  - Use a code component to format a Slack message based on the deserialized contents
  - Post the generated message to Slack
  - Move the file out of the unprocessed/ directory
If you would like to view the YAML definition of this example integration, it's available on GitHub.
Our integration is going to interact with Google Cloud Storage and Slack. To make the integration more configurable, let's create three required config variables:
- bucketName will represent the name of the GCP bucket where files are stored.
- gcpProjectId will represent the ID of the GCP project that owns the GCP bucket.
- slackWebhookUrl will represent a Slack webhook URL (see generating a Slack webhook URL).
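As a rough illustration of what these three config variables amount to, the following TypeScript sketch models them as a typed object and reports any that are missing or blank. The IntegrationConfig type and the missingConfigKeys helper are hypothetical names used only for illustration; they are not part of the integration platform.

```typescript
// Hypothetical shape of the three required config variables.
interface IntegrationConfig {
  bucketName: string;
  gcpProjectId: string;
  slackWebhookUrl: string;
}

// Returns the names of any required config variables that are missing or blank.
function missingConfigKeys(config: Partial<IntegrationConfig>): string[] {
  const required: (keyof IntegrationConfig)[] = [
    "bucketName",
    "gcpProjectId",
    "slackWebhookUrl",
  ];
  return required.filter((key) => {
    const value = config[key];
    return value === undefined || value.trim() === "";
  });
}

// Example with made-up values: gcpProjectId is absent and slackWebhookUrl is blank.
const missing = missingConfigKeys({ bucketName: "my-bucket", slackWebhookUrl: " " });
```

A check like this is only useful outside the platform, for example when testing helper code locally; marking the config variables as required covers the same ground inside the integration.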
The first two steps we'll add to our integration will (1) list files in the GCP storage bucket, and (2) create an empty loop iterating over those files.
First, we'll add a step to list files in our GCP storage bucket.
I've already created a Google Cloud Storage bucket and service account to test with, following GCP's documentation.
We'll configure our action to point to our bucket and account, and under prefix we'll enter unprocessed/ so we only loop over files under that prefix.
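To make the effect of the prefix concrete, here is a small TypeScript sketch of prefix filtering over a list of object names. The object names are made up for illustration, and filterByPrefix is a hypothetical helper; the list step performs this filtering for us.

```typescript
// Hypothetical helper: keep only object names that start with the given prefix.
function filterByPrefix(objectNames: string[], prefix: string): string[] {
  return objectNames.filter(
    (objectName) => objectName.slice(0, prefix.length) === prefix
  );
}

// Made-up object names, for illustration only.
const allObjectNames = [
  "unprocessed/2020-10-22T15-30-55.521Z",
  "other/readme.txt",
  "unprocessed/2020-10-22T15-35-10.101Z",
];

// Only the two names under "unprocessed/" remain.
const toProcess = filterByPrefix(allObjectNames, "unprocessed/");
```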
Next, we'll add a loop step. Under items we will reference the list of files our previous step output:
Our loop is now configured to run once for each file that was found in the unprocessed/ directory in our GCP bucket.
Our loop will contain five steps:
First, we'll download the file we're currently looping over.
The item that we're currently processing from our items is accessible using the loop's currentItem key. For example, if there's a file named unprocessed/2020-10-22T15-30-55.521Z in our bucket, loopOverEachFile.currentItem would be equal to that file name while the loop is processing that file.
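Conceptually, the loop behaves like a for...of loop over the items, with currentItem bound to one element per iteration. The sketch below mirrors that behavior in plain TypeScript. The runLoop function and the second file name are hypothetical; the real loop step is configured in the integration rather than written by hand.

```typescript
// Hypothetical stand-in for the loop step: call `body` once per item,
// exposing the item under a `currentItem` key.
function runLoop<T, R>(
  items: T[],
  body: (loop: { currentItem: T }) => R
): R[] {
  const results: R[] = [];
  for (const item of items) {
    results.push(body({ currentItem: item }));
  }
  return results;
}

// Record the value of currentItem on each iteration.
const visited = runLoop(
  [
    "unprocessed/2020-10-22T15-30-55.521Z",
    "unprocessed/2020-10-22T15-35-10.101Z",
  ],
  (loopOverEachFile) => loopOverEachFile.currentItem
);
```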
Next, we'll add a Deserialize JSON step to process the file we pulled down. The file we downloaded is a text file containing some JSON:
This step will make those JSON keys accessible for subsequent steps.
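As an illustration of what deserializing does, the sketch below parses a JSON string into an object whose keys can then be read directly. It assumes a hypothetical payload with user and timestamp keys; the real file's keys may differ, and both the LoginRecord type and the deserializeLoginRecord helper are names invented for this example.

```typescript
// Hypothetical shape of a login record; the real file's keys may differ.
interface LoginRecord {
  user: string;
  timestamp: string;
}

// Parse the downloaded file's text and check that the assumed keys are present.
function deserializeLoginRecord(fileContents: string): LoginRecord {
  const parsed = JSON.parse(fileContents);
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    typeof parsed.user !== "string" ||
    typeof parsed.timestamp !== "string"
  ) {
    throw new Error("File does not contain the expected login record keys");
  }
  return { user: parsed.user, timestamp: parsed.timestamp };
}

// Example file contents, made up for illustration.
const record = deserializeLoginRecord(
  '{"user": "jane.doe@example.com", "timestamp": "2020-10-22T15:30:55.521Z"}'
);
```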
Next, we need a helper function to generate two things:
- The Slack message we're going to send
- The path where we're going to move the log file after processing (in the
Let's add a code component and enter this code to execute:
Note that this returns an object with two values: slackMessage, which contains the message to send, and outFileName, which contains the path the file will be moved to after processing.
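A helper along these lines might look like the following sketch. It rests on several assumptions that are not confirmed by this tutorial: the deserialized file is assumed to have user and timestamp keys, processed files are assumed to be moved under a processed/ prefix, and buildSlackMessageAndPath is a hypothetical function name. The code component's actual function signature is not shown here, so the logic is written as a plain function.

```typescript
// Hypothetical shape of the deserialized file; the real keys may differ.
interface LoginRecord {
  user: string;
  timestamp: string;
}

const SOURCE_PREFIX = "unprocessed/";
// Assumption: processed files are moved under a "processed/" prefix.
const DESTINATION_PREFIX = "processed/";

function buildSlackMessageAndPath(
  record: LoginRecord,
  currentFileName: string
): { slackMessage: string; outFileName: string } {
  // Strip the source prefix if present, then prepend the destination prefix.
  const hasPrefix =
    currentFileName.slice(0, SOURCE_PREFIX.length) === SOURCE_PREFIX;
  const baseName = hasPrefix
    ? currentFileName.slice(SOURCE_PREFIX.length)
    : currentFileName;
  return {
    slackMessage: `${record.user} logged in at ${record.timestamp}`,
    outFileName: `${DESTINATION_PREFIX}${baseName}`,
  };
}

// Example inputs, made up for illustration.
const result = buildSlackMessageAndPath(
  { user: "jane.doe@example.com", timestamp: "2020-10-22T15:30:55.521Z" },
  "unprocessed/2020-10-22T15-30-55.521Z"
);
// result.outFileName is "processed/2020-10-22T15-30-55.521Z"
```

Returning both values from a single step keeps the later Slack and move steps simple, since each only needs to reference one key of this step's output.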
Next, we'll send the Slack message that we generated in the previous step. We'll do that by adding a Slack - Send Message step to our integration. If you do not have a Slack channel, you can try sending an email instead with SendGrid or an SMS with Twilio.
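The Send Message step handles the HTTP call for us. For reference, a Slack incoming webhook accepts a POST request whose JSON body has a text field. The sketch below only builds that request and does not send it; buildSlackWebhookRequest is a hypothetical helper, and the message is made up for illustration.

```typescript
// Build the HTTP request a Slack incoming webhook expects:
// a POST with a JSON body containing a "text" field.
function buildSlackWebhookRequest(slackMessage: string): {
  method: string;
  headers: { [headerName: string]: string };
  body: string;
} {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: slackMessage }),
  };
}

// Example message, made up for illustration. The request would be sent
// to the URL stored in the slackWebhookUrl config variable.
const webhookRequest = buildSlackWebhookRequest(
  "jane.doe@example.com logged in at 2020-10-22T15:30:55.521Z"
);
```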
Finally, we'll move the file that we downloaded out of the unprocessed/ directory. To do that we'll add a GCP - Move File step to our integration.
For the source file name, we'll reference our loop's currentItem key again.
For the destination file name, we'll reference our code component's outFileName value.
That's it! At this point we have an integration that loops over files in a directory, processes them, and sends alerts based on their contents. This integration can be published, and instances of this integration can be configured and deployed to customers.