AWS Glue: get job_id from within the script using pyspark

Zeitgeist

I am trying to access the AWS Glue ETL job run ID from within that job's script. This is the Run ID shown in the first column of the AWS Glue console, something like jr_5fc6d4ecf0248150067f2. How do I get it programmatically with PySpark?

Brett

I haven't found this documented anywhere, but it's passed in to the script as a command-line argument.

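import sys
from awsglue.utils import getResolvedOptions

# Only JOB_NAME is requested explicitly; the returned dict also appears to
# carry JOB_RUN_ID, since Glue passes it on the command line.
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
job_run_id = args['JOB_RUN_ID']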
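
Since the run ID arrives as a command-line argument, it can also be read straight from sys.argv, which may be handy when awsglue is not importable (for example, in a local test). The following is only a minimal sketch: it assumes the argument is passed as `--JOB_RUN_ID <value>` (or `--JOB_RUN_ID=<value>`), and the helper name `find_job_run_id` is hypothetical, not part of any Glue library.

```python
import sys


def find_job_run_id(argv):
    """Return the value of --JOB_RUN_ID from argv, or None if it is absent.

    Assumption: the flag is spelled --JOB_RUN_ID and its value is either the
    next argument or attached with '='.
    """
    flag = "--JOB_RUN_ID"
    for i, arg in enumerate(argv):
        # Form 1: "--JOB_RUN_ID jr_..." (value is the following argument)
        if arg == flag and i + 1 < len(argv):
            return argv[i + 1]
        # Form 2: "--JOB_RUN_ID=jr_..."
        if arg.startswith(flag + "="):
            return arg.split("=", 1)[1]
    return None


# Inside a running job this would hold the run ID; elsewhere it is None.
job_run_id = find_job_run_id(sys.argv)
```

Returning None instead of raising lets the same script run outside Glue, where no run ID is supplied.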
