Running an AWS Glue Job in Scala Locally and Connecting to Redshift

Dec 10, 2022

While I was developing a data warehouse project, I found it hard to write code in the AWS Glue UI, so I started looking for a way to write it locally. The solution I found was to add the AWS Glue dependencies to the Scala data warehouse project, which let me use the AWS Glue functions locally.
This makes it much easier to write the code on my own machine :)

Today, I'm going to show you how I set it up and used it.

The topics, in order:
- Create a new project in IntelliJ IDEA
- Configure dependencies & modules
- Use AWS Glue Functions

Create a new project in IntelliJ IDEA

Go to File -> New Project and choose the following options.

After creating the project, right-click on the project name -> New -> Module and add a new module.

Configure dependencies & modules

Add the following Gradle file to the main project, then refresh Gradle.

Add the following Gradle file to the job_scripts module, then refresh Gradle.

We have now added the Redshift, AWS Glue, and Scala libraries to the project, so we can use the AWS Glue functions and connect to Redshift.

As you can see, we don't have a scala directory yet.
Right-click on main -> New -> Directory.

Choose the scala directory.

Use AWS Glue Functions

As you may remember, I previously wrote about how to create a crawler, transform data, and so on.
For this example, assume we already have a database created by an AWS Glue Crawler, as described in my earlier post:

How Can We Transform JSON / CSV files to Parquet through Aws Glue?

AWS Glue is a fully managed ETL (extract, transform, and load) service that makes it simple and cost-effective to…

hkdemircan.medium.com

Let’s read the code

As you can see in the code, we can:
- use the Glue functions
- read data from a Glue database table
- load the data into Redshift.
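
The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the exact script from the project: the database, table, Glue connection, and S3 temp directory names are hypothetical placeholders, and it assumes the AWSGlueETL library is on the classpath and that AWS credentials are available on the local machine.

```scala
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.util.JsonOptions
import org.apache.spark.{SparkConf, SparkContext}

object LocalGlueJob {
  def main(args: Array[String]): Unit = {
    // Run Spark in local mode so the job can start from the IDE.
    val conf = new SparkConf().setAppName("local-glue-job").setMaster("local[*]")
    val sparkContext = new SparkContext(conf)
    val glueContext = new GlueContext(sparkContext)

    // Read a table registered in the Glue Data Catalog (for example by a crawler).
    // "my_database" and "my_table" are hypothetical names.
    val source = glueContext
      .getCatalogSource(database = "my_database", tableName = "my_table")
      .getDynamicFrame()

    // Write the frame to Redshift through a Glue catalog connection.
    // The connection name, target table, target database, and S3 temp dir are
    // all assumptions for illustration.
    glueContext
      .getJDBCSink(
        catalogConnection = "my_redshift_connection",
        options = JsonOptions("""{"dbtable": "public.my_table", "database": "dev"}"""),
        redshiftTmpDir = "s3://my-temp-bucket/redshift-tmp/"
      )
      .writeDynamicFrame(source)

    sparkContext.stop()
  }
}
```

Running this locally still talks to the real Glue Data Catalog and Redshift cluster, so the machine needs network access to both.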

If you want to connect to Redshift over a plain JDBC connection instead:
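
A direct JDBC connection might look something like the sketch below. It assumes the Redshift JDBC driver is on the classpath and that the host, database, user, and password come from environment variables; those variable names and the `jdbcUrl` helper are hypothetical, chosen only for illustration.

```scala
import java.sql.DriverManager
import java.util.Properties

object RedshiftJdbcExample {
  // Builds a Redshift JDBC URL of the form jdbc:redshift://host:port/database.
  def jdbcUrl(host: String, port: Int, database: String): String =
    s"jdbc:redshift://$host:$port/$database"

  def main(args: Array[String]): Unit = {
    // Hypothetical environment variables; 5439 is the default Redshift port.
    val url = jdbcUrl(sys.env("REDSHIFT_HOST"), 5439, sys.env("REDSHIFT_DB"))

    val props = new Properties()
    props.setProperty("user", sys.env("REDSHIFT_USER"))
    props.setProperty("password", sys.env("REDSHIFT_PASSWORD"))

    val connection = DriverManager.getConnection(url, props)
    try {
      // Simple smoke test to confirm that the connection works.
      val resultSet = connection.createStatement().executeQuery("SELECT 1")
      while (resultSet.next()) println(resultSet.getInt(1))
    } finally {
      connection.close()
    }
  }
}
```

Keeping the credentials in environment variables rather than in the source avoids committing them to the repository.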

GitLab Link: Here
For more information: Here