SankarPatnaik/etl-framework-icebreg

Generic ETL Framework with Apache Iceberg & Python

This ETL framework is designed to extract data from 100+ sources, transform it with Pandas/Arrow, and load it into Apache Iceberg tables on AWS S3.

🔧 Features

  • Source connectors: MySQL, Oracle, SQL Server, MongoDB, .dat files
  • Modular, config-driven pipeline
  • Pandas → Arrow → Iceberg on S3
  • Prefect-based orchestration
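
As a rough illustration of what a modular, config-driven pipeline can look like, here is a minimal sketch. It is not the repository's actual code: `EXTRACTORS`, `register_extractor`, `run_job`, and the config keys are hypothetical names. Plain lists of dicts stand in for Pandas DataFrames, and the Iceberg load step is omitted so the sketch runs with the standard library only.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Extractor = Callable[[dict], List[Row]]

# Hypothetical registry mapping a source type (from config) to an extractor.
EXTRACTORS: Dict[str, Extractor] = {}


def register_extractor(source_type: str) -> Callable[[Extractor], Extractor]:
    """Decorator that registers an extractor under a source type name."""
    def decorator(func: Extractor) -> Extractor:
        EXTRACTORS[source_type] = func
        return func
    return decorator


@register_extractor("mysql")
def extract_mysql(job: dict) -> List[Row]:
    # Stand-in for a real database query; returns fixed rows for illustration.
    return [{"id": 1, "name": " Ada "}, {"id": 2, "name": "Grace"}]


def transform(rows: List[Row]) -> List[Row]:
    # Example transform: trim surrounding whitespace from string values.
    return [
        {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        for row in rows
    ]


def run_job(job_name: str, config: Dict[str, dict]) -> List[Row]:
    """Look up a job by name, then extract and transform its rows."""
    job = config[job_name]
    extractor = EXTRACTORS[job["source_type"]]
    return transform(extractor(job))


# Assumed config shape; the real config/etl_config.yaml may differ.
CONFIG = {"mysql_customers": {"source_type": "mysql", "table": "customers"}}
result = run_job("mysql_customers", CONFIG)
```

With this shape, supporting a new source means registering one more extractor and adding a config entry; `run_job` itself stays unchanged.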

🛠️ Usage

python main.py mysql_customers
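
The command above passes a job name as its only argument. A minimal sketch of how an entry point could handle that argument is shown below. The names `KNOWN_JOBS`, `parse_args`, and `main` are assumptions for illustration, and the repository's real `main.py` may be organized differently.

```python
import argparse
from typing import Dict, List, Optional

# Hypothetical job table; real jobs would be read from config/etl_config.yaml.
KNOWN_JOBS: Dict[str, dict] = {"mysql_customers": {"source_type": "mysql"}}


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the single positional argument: the name of the job to run."""
    parser = argparse.ArgumentParser(description="Run one ETL job by name.")
    parser.add_argument(
        "job_name",
        choices=sorted(KNOWN_JOBS),
        help="job key as defined in the ETL config",
    )
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> int:
    """Resolve the job and report what would run; returns an exit code."""
    args = parse_args(argv)
    job = KNOWN_JOBS[args.job_name]
    print(f"Running job {args.job_name!r} (source type: {job['source_type']})")
    return 0


# Equivalent to running: python main.py mysql_customers
exit_code = main(["mysql_customers"])
```

Restricting `job_name` to known keys makes a mistyped job name fail immediately with a usage message instead of partway through a run.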

Or trigger via Prefect UI/CLI:

prefect deployment build flows/etl_flow.py:etl_flow -n etl-deployment
prefect deployment apply etl_flow-deployment.yaml
prefect agent start -q default

Config Example (YAML)

See config/etl_config.yaml

Environment

Define AWS credentials in a .env file or via environment variables.
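
One way to resolve credentials from both places is sketched below using only the standard library. The helper names `parse_env_file` and `load_aws_settings` are hypothetical. Letting environment variables take precedence over the .env file is an assumption, not something the repository states. The variable names are the standard AWS ones.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# Standard AWS environment variable names.
AWS_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_aws_settings(
    env_path: str = ".env",
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Merge .env values with environment variables (assumed: environment wins)."""
    environ = os.environ if environ is None else environ
    path = Path(env_path)
    file_values = parse_env_file(path.read_text()) if path.is_file() else {}
    merged: Dict[str, str] = {}
    for key in AWS_KEYS:
        value = environ.get(key, file_values.get(key))
        if value is not None:
            merged[key] = value
    return merged


# Placeholder values only; never commit real credentials.
sample = 'AWS_ACCESS_KEY_ID=example-key\n# comment\nAWS_DEFAULT_REGION="us-east-1"\n'
parsed = parse_env_file(sample)
```

In practice a library such as python-dotenv can replace the hand-written parser; the sketch only shows the lookup order.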
