AWS Glue Component
Manage AWS Glue crawlers, jobs and triggers
Component key: aws-glue
Description
AWS Glue is a serverless data integration service from Amazon Web Services. The AWS Glue component allows you to interact with jobs, triggers, and crawlers in your AWS Glue account.
Connections
AWS Glue Access Key and Secret
An AWS IAM access key pair is required to interact with AWS Glue. Make sure that the key pair you generate in AWS has the permissions required for the AWS Glue resources you want to access. Read more about Glue IAM actions in the AWS docs.
Input | Notes | Example |
---|---|---|
Access Key ID string / Required accessKeyId | An AWS IAM Access Key ID | AKIAIOSFODNN7EXAMPLE |
Secret Access Key password / Required secretAccessKey | An AWS IAM Secret Access Key | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
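As a rough illustration of the permissions a key pair might need, the following Python sketch builds an IAM policy document covering the Glue API calls that this component's actions appear to correspond to. The mapping from component actions to IAM actions (for example, Stop Job Run to `glue:BatchStopJobRun`) is an assumption inferred from the action names and example outputs on this page, and `build_glue_policy` is a hypothetical helper, so verify the list against the AWS docs before using it.

```python
import json

# Assumed mapping from this component's actions to Glue IAM actions.
# This list is inferred, not taken from the component's source.
GLUE_ACTIONS = [
    "glue:GetJobRun",
    "glue:ListCrawlers",
    "glue:ListJobs",
    "glue:ListTriggers",
    "glue:StartCrawler",
    "glue:StartJobRun",
    "glue:StartTrigger",
    "glue:StopCrawler",
    "glue:BatchStopJobRun",
    "glue:StopTrigger",
]


def build_glue_policy(resource: str = "*") -> dict:
    """Return an IAM policy document that allows the Glue actions above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": GLUE_ACTIONS, "Resource": resource}
        ],
    }


if __name__ == "__main__":
    # Print the policy so it can be pasted into the IAM policy editor.
    print(json.dumps(build_glue_policy(), indent=2))
```

The resource is left as `"*"` for simplicity. If you only use a few of the actions, trimming the list keeps the key pair closer to least privilege.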
AWS Role ARN
To enable IAM role authentication, log in to the AWS Console and navigate to Identity and Access Management (IAM).
To create an IAM user and generate credentials:
- Navigate to Users and select Create User.
- Provide a User name and, if needed, check the box that gives the user access to the AWS Management Console.
- Once the user is created, copy the ARN shown in the summary for a later step.
- To obtain the ARN for an existing user, click the username on the Users page; the ARN is shown in the summary section.
- From the summary section, select Create access key.
- Select Third-party service as the access key type and select Next.
- Set a description and select Create access key.
- Copy the Access Key and Secret access key and enter them into the connection configuration of your integration, along with the ARN.
To create a role and assign it to the user:
- Navigate to Roles and select Create Role.
- Select Custom Trust Policy for the Trusted entity type.
- Copy the following statement into the statement console, making sure to replace the ARN with the user's actual ARN from the previous section:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "ARN"
},
"Action": "sts:AssumeRole"
}
]
}
- When adding permissions, provide the AWSGlueConsoleFullAccess permission.
- Complete the remaining steps and select Create Role.
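The trust policy from the steps above can also be generated rather than edited by hand, which avoids leaving the literal `ARN` placeholder in place. The following Python sketch is a minimal illustration; `build_trust_policy` is a hypothetical helper and the account ID and user name in the example are made up.

```python
import json


def build_trust_policy(user_arn: str) -> dict:
    """Return a trust policy that lets the given IAM user assume the role."""
    # Light sanity check only: catches an unreplaced placeholder such as "ARN".
    if not user_arn.startswith("arn:") or ":user/" not in user_arn:
        raise ValueError(f"Expected an IAM user ARN, got: {user_arn!r}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": user_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARN, for illustration only.
    example_arn = "arn:aws:iam::123456789012:user/example-user"
    print(json.dumps(build_trust_policy(example_arn), indent=2))
```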
Input | Notes | Example |
---|---|---|
Access Key ID string / Required accessKeyId | An AWS IAM Access Key ID | AKIAIOSFODNN7EXAMPLE |
Role ARN string / Required roleARN | An AWS IAM Role ARN | arn:aws:iam::OtherAccount-ID:role/assumed-role-name |
Secret Access Key password / Required secretAccessKey | An AWS IAM Secret Access Key | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
Actions
Get Job Run
Retrieves the metadata for a given job run. | key: getJobRun
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Name string / Required name | Provide a string value for the name (NOT the ARN). | |
Run Id string / Required Value List runId | Provide a string value for the run ID. | |
{
"data": {
"JobRun": ""
}
}
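A common use of this action is to check whether a job run has finished. The sketch below assumes the action passes through the `JobRun` object returned by the AWS Glue GetJobRun API, which carries a `JobRunState` field; the example payload above shows only an empty placeholder, so treat the field access as an assumption. The helper names are hypothetical.

```python
from typing import Optional

# Job run states after which AWS Glue performs no further work on the run.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def job_run_state(result: dict) -> Optional[str]:
    """Extract JobRunState from a Get Job Run result, or None if absent."""
    job_run = result.get("data", {}).get("JobRun")
    if not isinstance(job_run, dict):
        # Covers the empty-string placeholder shown in the example payload.
        return None
    return job_run.get("JobRunState")


def is_finished(result: dict) -> bool:
    """Return True when the job run has reached a terminal state."""
    return job_run_state(result) in TERMINAL_STATES
```

In a flow, `is_finished` could gate a loop that calls Get Job Run again after a delay until the run completes.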
List Crawlers
Lists the crawlers available in AWS Glue. | key: listCrawlers
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Marker string marker | Specify the pagination token that's returned by a previous request to retrieve the next page of results | lslTXFcbLQKkb0vP9Kgh5hy0Y0OnC7Z9ZPHPwPmMnxSk3eiDRMkct7D8E |
Max Items string maxItems | Provide an integer value for the maximum number of items to return. Provide a value from 1 to 50. | 20 |
{
"data": {
"NextToken": "",
"CrawlerNames": [
"crawler-1",
"crawler-2"
]
}
}
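Because each call returns at most one page, collecting every name means feeding the returned `NextToken` back in as the Marker input until no token comes back. The Python sketch below shows that loop in isolation. `fetch_page` stands in for whatever invokes the list action, `fake_list_crawlers` is an in-memory stand-in used only for demonstration, and treating an empty `NextToken` as the last page is an assumption based on the example payload above.

```python
from typing import Callable, Iterator, Optional


def iterate_names(
    fetch_page: Callable[[Optional[str]], dict], names_key: str
) -> Iterator[str]:
    """Yield every name across pages, following NextToken as the next Marker."""
    marker: Optional[str] = None
    while True:
        page = fetch_page(marker)["data"]
        yield from page.get(names_key, [])
        # An empty or missing token is assumed to mean there are no more pages.
        marker = page.get("NextToken") or None
        if marker is None:
            break


def fake_list_crawlers(marker: Optional[str]) -> dict:
    """Hypothetical stand-in for the List Crawlers action (two fixed pages)."""
    pages = {
        None: {"data": {"CrawlerNames": ["crawler-1", "crawler-2"], "NextToken": "page-2"}},
        "page-2": {"data": {"CrawlerNames": ["crawler-3"], "NextToken": ""}},
    }
    return pages[marker]


if __name__ == "__main__":
    # prints ['crawler-1', 'crawler-2', 'crawler-3']
    print(list(iterate_names(fake_list_crawlers, "CrawlerNames")))
```

The same loop should apply to List Jobs and List Triggers by passing `"JobNames"` or `"TriggerNames"` as `names_key`.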
List Jobs
Lists the jobs available in AWS Glue. | key: listJobs
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Marker string marker | Specify the pagination token that's returned by a previous request to retrieve the next page of results | lslTXFcbLQKkb0vP9Kgh5hy0Y0OnC7Z9ZPHPwPmMnxSk3eiDRMkct7D8E |
Max Items string maxItems | Provide an integer value for the maximum number of items to return. Provide a value from 1 to 50. | 20 |
{
"data": {
"JobNames": [
"job1",
"job2"
],
"NextToken": ""
}
}
List Triggers
List the names of all triggers in the account. | key: listTriggers
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Marker string marker | Specify the pagination token that's returned by a previous request to retrieve the next page of results | lslTXFcbLQKkb0vP9Kgh5hy0Y0OnC7Z9ZPHPwPmMnxSk3eiDRMkct7D8E |
Max Items string maxItems | Provide an integer value for the maximum number of items to return. Provide a value from 1 to 50. | 20 |
{
"data": {
"NextToken": "",
"TriggerNames": [
"trigger-1",
"trigger-2"
]
}
}
Start Crawler
Starts an existing crawler in AWS Glue. | key: startCrawler
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Name string / Required name | Provide a string value for the name (NOT the ARN). | |
{
"data": {
"Name": "exampleCrawlerName"
}
}
Start Job Run
Starts a job run using an AWS Glue job definition. | key: startJobRun
Input | Notes | Example |
---|---|---|
args string Key Value List args | Optional key value parameters to pass into a job. | |
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Allocated Capacity string capacity | The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. If this is omitted, Glue will use the default number of DPUs configured for your job. | 10 |
Name string / Required name | Provide a string value for the name (NOT the ARN). | |
Security Configuration string security | The name of the SecurityConfiguration structure to be used with this job. This can be left blank if you do not have a security configuration. | |
{
"data": {
"Name": "exampleJobRunName"
}
}
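Glue job scripts conventionally read job arguments whose names begin with `--`. This page does not say whether the component adds that prefix for you, so the Python sketch below simply normalizes a set of key/value pairs before they are supplied as args. Treat it as an illustration under that assumption; `to_glue_arguments` is a hypothetical helper and the keys and values are made up.

```python
def to_glue_arguments(pairs: dict) -> dict:
    """Return job arguments with '--'-prefixed names and string values."""
    arguments = {}
    for key, value in pairs.items():
        # Keep keys that already carry the prefix; add it otherwise.
        name = key if key.startswith("--") else f"--{key}"
        # Glue passes job arguments as strings, so coerce other types.
        arguments[name] = str(value)
    return arguments


if __name__ == "__main__":
    print(to_glue_arguments({"input_path": "s3://example-bucket/in", "--day": 7}))
```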
Start Trigger
Starts an existing trigger in AWS Glue. | key: startTrigger
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Name string / Required name | Provide a string value for the name (NOT the ARN). | |
{
"data": {
"Name": "exampleTriggerName"
}
}
Stop Crawler
Stops the specified crawler if it is running. | key: stopCrawler
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Name string / Required name | Provide a string value for the name (NOT the ARN). | |
{
"data": {
"Name": "exampleCrawlerName"
}
}
Stop Job Run
Stops one or more job runs for a specified job definition. | key: stopJobRun
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Job Run Ids string / Required Value List jobRunIds | Provide a list of job run IDs. | |
Name string / Required name | Provide a string value for the name (NOT the ARN). | |
{
"data": {
"SuccessfulSubmissions": [
""
],
"Errors": [
""
]
}
}
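Since a single call can stop some runs and fail on others, it can help to check both lists before treating the step as successful. The Python sketch below summarizes a result shaped like the example above. It assumes only that `SuccessfulSubmissions` and `Errors` are lists and makes no assumption about what each entry contains; `summarize_stop_result` is a hypothetical helper.

```python
def summarize_stop_result(result: dict) -> dict:
    """Count stopped and failed job runs in a Stop Job Run result."""
    data = result.get("data", {})
    # Drop empty placeholders such as the "" entries in the example payload.
    stopped = [entry for entry in data.get("SuccessfulSubmissions", []) if entry]
    errors = [entry for entry in data.get("Errors", []) if entry]
    return {"stopped": len(stopped), "failed": len(errors), "ok": not errors}


if __name__ == "__main__":
    example = {"data": {"SuccessfulSubmissions": [""], "Errors": [""]}}
    # prints {'stopped': 0, 'failed': 0, 'ok': True}
    print(summarize_stop_result(example))
```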
Stop Trigger
Stops a specified trigger. | key: stopTrigger
Input | Notes | Example |
---|---|---|
Connection connection / Required awsConnection | ||
AWS Region string awsRegion | AWS provides services in multiple regions, like us-west-2 or eu-west-1. | us-east-1 |
Name string / Required name | Provide a string value for the name (NOT the ARN). | |
{
"data": {
"Name": "exampleTriggerName"
}
}