dsillman2000/tf-gcs-spark-cluster-nb

google-spark-cluster-notebook

Requirements

| Name | Version |
|------|---------|
| terraform | >=1.10, <2.0 |
| google | >=6.21, <7.0 |
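
The version constraints above would typically be declared in a `terraform` block. The following is a minimal sketch of how that might look; the `hashicorp/google` source address is an assumption, since the table names the provider only as `google`.

```hcl
terraform {
  # Terraform CLI versions accepted, per the Requirements table.
  required_version = ">= 1.10, < 2.0"

  required_providers {
    google = {
      # Assumption: the provider comes from the default HashiCorp namespace.
      source  = "hashicorp/google"
      version = ">= 6.21, < 7.0"
    }
  }
}
```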

Providers

| Name | Version |
|------|---------|
| google | 6.21.0 |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| google_dataproc_cluster.dataproc_cluster | resource |
| google_storage_bucket.gcs_bucket | resource |
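
This README does not show the resource definitions themselves. As a rough sketch of how the two resources could be wired together from the documented inputs, something like the following might apply. The cluster name `spark-cluster`, the use of the bucket as the staging bucket, and the Jupyter and component gateway settings are assumptions for illustration, not the repository's confirmed configuration. The fragment also expects the variables listed under Inputs to be declared elsewhere.

```hcl
resource "google_storage_bucket" "gcs_bucket" {
  project  = var.gcp_project_id
  name     = var.gcs_bucket_name
  location = var.gcp_region
}

resource "google_dataproc_cluster" "dataproc_cluster" {
  project = var.gcp_project_id
  name    = "spark-cluster" # hypothetical name, not taken from the repository
  region  = var.gcp_region

  cluster_config {
    # Assumption: the bucket above backs the cluster's staging storage.
    staging_bucket = google_storage_bucket.gcs_bucket.name

    master_config {
      num_instances = 1
      machine_type  = var.gcp_dataproc_master_machine_type
    }

    worker_config {
      num_instances = var.gcp_dataproc_worker_count
      machine_type  = var.gcp_dataproc_worker_machine_type
    }

    software_config {
      image_version = var.gcp_dataproc_image_version
      # Assumption: Jupyter is enabled as a Dataproc optional component.
      optional_components = ["JUPYTER"]
    }

    # Assumption: the component gateway exposes the notebook UI over HTTPS.
    endpoint_config {
      enable_http_port_access = true
    }
  }
}
```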

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| gcp_dataproc_image_version | The Dataproc image version to use for the cluster. | `string` | `"2.2-debian12"` | no |
| gcp_dataproc_master_machine_type | The machine type to use for the Dataproc master node. | `string` | `"n1-standard-2"` | no |
| gcp_dataproc_worker_count | The number of worker nodes to create in the Dataproc cluster. | `number` | `2` | no |
| gcp_dataproc_worker_machine_type | The machine type to use for the Dataproc worker nodes. | `string` | `"n1-standard-2"` | no |
| gcp_project_id | The ID of the GCP project in which to create the bucket and Dataproc cluster. The project must already exist. | `string` | n/a | yes |
| gcp_region | The GCP region in which to create the bucket and Dataproc cluster. | `string` | `"us-east1"` | no |
| gcs_bucket_name | The name of the GCS bucket to create as the underlying Hadoop storage for Dataproc. | `string` | `"gcp-spark-storage-bucket"` | no |
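
Only `gcp_project_id` is required, so a small `terraform.tfvars` file is enough to get started. The values below are placeholders for illustration, not values from the repository. Because GCS bucket names are globally unique, the default `gcs_bucket_name` will probably need to be overridden as well.

```hcl
# terraform.tfvars (all values here are hypothetical placeholders)
gcp_project_id = "my-example-project"

# Optional overrides; omit any of these to use the defaults from the table.
gcp_region                = "us-east1"
gcp_dataproc_worker_count = 3
gcs_bucket_name           = "my-example-spark-bucket"
```

With this file in the project root, `terraform init` followed by `terraform apply` should pick the values up automatically.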

Outputs

| Name | Description |
|------|-------------|
| gcp_dataproc_cluster_id | The ID of the Dataproc cluster. |
| gcp_dataproc_cluster_name | The name of the Dataproc cluster. |
| gcs_bucket_name | The name of the GCS bucket. |
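
For reference, outputs like these are usually thin wrappers around resource attributes. This sketch shows one plausible set of definitions based on the resource addresses listed under Resources; the exact expressions used in the repository are not shown in this README, so treat the `value` lines as assumptions.

```hcl
output "gcp_dataproc_cluster_id" {
  description = "The ID of the Dataproc cluster."
  value       = google_dataproc_cluster.dataproc_cluster.id
}

output "gcp_dataproc_cluster_name" {
  description = "The name of the Dataproc cluster."
  value       = google_dataproc_cluster.dataproc_cluster.name
}

output "gcs_bucket_name" {
  description = "The name of the GCS bucket."
  value       = google_storage_bucket.gcs_bucket.name
}
```

After an apply, `terraform output gcs_bucket_name` prints a single value.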

About

Example Terraform project that uses GCP to provision an Apache Spark cluster with a Jupyter Notebook interface.
