@aazam-gh
Contributor

@aazam-gh aazam-gh commented Oct 18, 2025

What changes are being made and why?


This PR introduces AWS EMR Serverless support to the plugin-aws repository.
It extends Kestra’s existing EMR integration beyond cluster-based execution, enabling users to build, run, and manage EMR Serverless workflows natively in Kestra.
Closes #579

How the changes have been QAed?


Setup Instructions


Contributor Checklist ✅

  • PR title and commits follow conventional commits.
  • Add a closes #ISSUE_ID or fixes #ISSUE_ID in the description if the PR relates to an opened issue.
  • Documentation updated (plugin docs from @Schema for properties and outputs, @Plugin with examples, README.md file with basic knowledge and specifics).
  • Setup instructions included if needed (API keys, accounts, etc.).
  • Prefix all rendered properties with r, not rendered (e.g. rHost).
  • Use runContext.logger() to log important information where it is needed, at the appropriate level (DEBUG, INFO, WARN or ERROR).

⚙️ Properties

  • Properties are declared with Property<T> carrier type, do not use @PluginProperty.
  • Mandatory properties must be annotated with @NotNull and checked during the rendering.
  • You can model a JSON object with a simple Property<Map<String, Object>>.

🌐 HTTP

  • Must use Kestra’s internal HTTP client from io.kestra.core.http.client

📦 JSON

  • If you are deserializing a response from an external API, you may have to add @JsonIgnoreProperties(ignoreUnknown = true) at the mapped class level, so that the plugin does not crash if the provider suddenly adds a new field.
  • Must use Jackson mappers provided by core (io.kestra.core.serializers)

New plugins / subplugins

  • Make sure your new plugin is configured as described here.
  • Add a package-info.java under each sub package respecting this format and choosing the right category.
  • Icons added in src/main/resources/icons in SVG format and not in thumbnail (keep it big):
    • plugin-icon.svg
    • One icon per package, e.g. io.kestra.plugin.aws.svg
    • For subpackages, e.g. io.kestra.plugin.aws.s3, add io.kestra.plugin.aws.s3.svg
      See example here.
  • Use "{{ secret('YOUR_SECRET') }}" in the examples for sensitive information such as an API key.
  • If you are fetching data (one, many or too many), you must add a Property<FetchType> fetchType to be able to use FETCH_ONE, FETCH and even STORE to store large amounts of data in the internal storage.
  • Align the closing """ of example blocks with the flow id.

🧪 Tests

  • Unit Tests added or updated to cover the change (using the RunContext to actually run tasks).
  • Add sanity checks if possible with a YAML flow inside src/test/resources/flows.
  • Avoid disabling tests for CI. Instead, configure a local environment whenever possible with .github/setup-unit.sh (which can be executed locally and in the CI), along with a new docker-compose-ci.yml file (do not edit the existing docker-compose.yml).
  • Provide screenshots from your QA / tests locally in the PR description. The goal here is to use the JAR of the plugin and directly test it locally in Kestra UI to ensure it integrates well.

📤 Outputs

  • Do not send back as outputs the same information you already have in your properties.
  • If you do not have any output, use VoidOutput.
  • Do not output the same information twice (e.g. a status code and an error code saying the same thing).

@github-project-automation github-project-automation bot moved this to To review in Pull Requests Oct 18, 2025
@MilosPaunovic MilosPaunovic requested review from a team and Malaydewangan09 October 18, 2025 05:45
@aazam-gh
Contributor Author

This PR is currently paused while I wait for an increased EMR Serverless instance quota for testing.

@MilosPaunovic MilosPaunovic added the kind/external and area/plugin labels Oct 22, 2025
@fdelbrayelle
Contributor

Any news on this @aazam-gh ?

@aazam-gh
Contributor Author

I followed up with the AWS support team again. Are there any alternatives for testing?

@Malaydewangan09
Member

Malaydewangan09 commented Oct 29, 2025

Hey @aazam-gh 👋, did you try using LocalStack for testing?

@aazam-gh
Contributor Author

@Malaydewangan09 Not yet, I will try that now. Thanks!

@MilosPaunovic
Member

Hey @aazam-gh, are there any updates on this?

@aazam-gh
Contributor Author

Apparently LocalStack does not support EMR Serverless application features yet.
I also received an update from the AWS support team: my service quota request is still being processed ):

Member

@Malaydewangan09 Malaydewangan09 left a comment

@aazam-gh, could you please at least write some unit tests for these new tasks, so that we can check on our side too? Thanks!

@fdelbrayelle
Contributor

Any news from AWS Support about your quota, @aazam-gh?

@fdelbrayelle fdelbrayelle self-assigned this Nov 26, 2025
@fdelbrayelle fdelbrayelle marked this pull request as ready for review November 26, 2025 08:56
@fdelbrayelle
Contributor

fdelbrayelle commented Nov 26, 2025

QA

Create and Start a Serverless Application

Flow:

id: create_and_run_emr_serverless
namespace: company.team

tasks:
  - id: create_and_run
    type: io.kestra.plugin.aws.emr.CreateServerlessApplicationAndStartJob
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-north-1"
    releaseLabel: "emr-7.12.0"
    applicationType: "SPARK"
    executionRoleArn: "arn:aws:iam::707969873520:role/EMRServerlessRole"
    jobName: "example-job"
    entryPoint: "s3://my-bucket/jobs/script.py"

Execution Gantt result:

image

The app is created and visible in EMR Studio:

image

Delete a Serverless Application

Here are the 2 existing apps we will delete:

image

Flow:

id: aws_emrserverless_delete_app
namespace: company.team

tasks:
  - id: delete_app
    type: io.kestra.plugin.aws.emr.DeleteServerlessApplication
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-north-1"
    applicationIds:
        - 00g1dvc093m5tj1d
        - 00g1dvbn2429t01d

Execution Gantt result:

image

The 2 apps were deleted from EMR Studio:

image

Start an existing Serverless Application

Let's take this stopped app:

image

Flow:

id: start_emr_job
namespace: company.team

tasks:
  - id: start_job
    type: io.kestra.plugin.aws.emr.StartServerlessJobRun
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: "eu-north-1"
    applicationId: "00g1dvclhbhl781d"
    executionRoleArn: "arn:aws:iam::707969873520:role/EMRServerlessRole"
    jobName: "example-job"
    entryPoint: "s3://my-bucket/scripts/spark-app.py"

Execution Gantt result:

image

The application now appears as "started" in EMR Studio:

image

@fdelbrayelle
Copy link
Contributor

Hello @aazam-gh 👋 I added a bunch of unit tests and ran a QA, which passed ✔️ So congratulations and thank you! I'll merge and do a release!

@fdelbrayelle fdelbrayelle merged commit 1e37390 into kestra-io:main Nov 26, 2025
1 check passed
@github-project-automation github-project-automation bot moved this from To review to Done in Pull Requests Nov 26, 2025

Development

Successfully merging this pull request may close these issues.

Extend the AWS EMR task to support AWS EMR Serverless

4 participants