Commit 3a19559

docs: exemplify kubectl output
docs: remove duplicate Deploy section and clarify test status

docs: credential setup instructions with OVH Manager links
- workflows/README: explain secret purposes (event ingestion, storage, API auth)
- workflows/README: add direct OVH Manager links for kubeconfig and S3 credentials
- README: delegate setup to workflows/README
- Separate operator usage (root README) from deployment setup (workflows/README)

docs: update README and gitignore
1 parent 6283d63 commit 3a19559

File tree

4 files changed

+125
-167
lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -62,3 +62,5 @@ Thumbs.db
6262
*.zarr
6363
out/
6464
reports/
65+
*.pyc
66+
pipeline_utils.py

README.md

Lines changed: 19 additions & 72 deletions
@@ -1,6 +1,6 @@
11
# EOPF GeoZarr Data Pipeline
22

3-
**Kubernetes pipeline: Sentinel CPM Zarr → Cloud-Optimized GeoZarr + STAC Registration**
3+
**Kubernetes pipeline: Sentinel Zarr → Cloud-Optimized GeoZarr + STAC Registration**
44

55
Automated pipeline for converting Sentinel-1/2 Zarr datasets to cloud-optimized GeoZarr format with STAC catalog integration and interactive visualization.
66

@@ -56,33 +56,21 @@ Transforms Sentinel-1/2 satellite data into web-ready visualizations:
5656
- Sentinel-1 GRD (SAR backscatter)
5757

5858

59-
## Requirements & Setup
59+
## Setup
6060

61-
### Prerequisites
61+
**Prerequisites:**
62+
- Kubernetes cluster with [platform-deploy](https://github.com/EOPF-Explorer/platform-deploy) (Argo Workflows, RabbitMQ, STAC API, TiTiler)
63+
- Python 3.13+ with `uv`
64+
- `kubectl` configured
6265

63-
- **Kubernetes cluster** with [platform-deploy](https://github.com/EOPF-Explorer/platform-deploy) infrastructure
64-
- Argo Workflows (pipeline orchestration)
65-
- RabbitMQ (event-driven automation)
66-
- STAC API & TiTiler (catalog & visualization)
67-
- **Python 3.13+** with `uv` package manager
68-
- **S3 storage** credentials (OVH de region)
69-
- **Kubeconfig** in `.work/kubeconfig`
66+
**📖 Complete setup guide:** See [workflows/README.md](workflows/README.md) for:
67+
- kubectl configuration (OVH Manager kubeconfig download)
68+
- Required secrets (RabbitMQ, S3, STAC API)
69+
- Workflow deployment (`kubectl apply -k`)
7070

71-
Verify infrastructure:
71+
**Quick verification:**
7272
```bash
73-
export KUBECONFIG=$(pwd)/.work/kubeconfig
74-
kubectl get pods -n core -l app.kubernetes.io/name=argo-workflows
75-
kubectl get pods -n core -l app.kubernetes.io/name=rabbitmq
76-
```
77-
78-
### Deploy Workflows
79-
80-
```bash
81-
# Apply to staging
82-
kubectl apply -k workflows/overlays/staging
83-
84-
# Apply to production
85-
kubectl apply -k workflows/overlays/production
73+
kubectl get wf,sensor,eventsource -n devseed-staging
8674
```
8775

8876
---
@@ -208,60 +196,19 @@ docker/Dockerfile # Pipeline image
208196
tools/submit_burst.py # RabbitMQ burst submission tool
209197
```
210198

211-
Tests are available in `tests/` directory (unit and integration tests using pytest).
212-
213-
---
214-
215-
## Deploy
216-
217-
```bash
218-
# Apply to staging
219-
kubectl apply -k workflows/overlays/staging
220-
221-
# Apply to production
222-
kubectl apply -k workflows/overlays/production
223-
```
224-
225-
**Config:** Image version, S3 endpoints, STAC API URLs, RabbitMQ exchanges configured via kustomize overlays.
199+
Tests are planned for the `tests/` directory (the structure exists; test files are still to be added).
226200

227201
---
228202

229203
## Configuration
230204

231-
### S3 Storage
232-
233-
```bash
234-
kubectl create secret generic geozarr-s3-credentials -n devseed-staging \
235-
--from-literal=AWS_ACCESS_KEY_ID="<your-key>" \
236-
--from-literal=AWS_SECRET_ACCESS_KEY="<your-secret>"
237-
```
238-
239-
| Setting | Value |
240-
|---------|-------|
241-
| **Endpoint** | `https://s3.de.io.cloud.ovh.net` |
242-
| **Bucket** | `esa-zarr-sentinel-explorer-fra` |
243-
| **Region** | `de` |
205+
**📖 Full configuration:** See [workflows/README.md](workflows/README.md) for secrets setup and parameters.
244206

245-
### RabbitMQ
246-
247-
Get password:
248-
```bash
249-
kubectl get secret rabbitmq-password -n core -o jsonpath='{.data.rabbitmq-password}' | base64 -d
250-
```
251-
252-
| Setting | Value |
253-
|---------|-------|
254-
| **URL** | `amqp://user:[email protected]:5672/` |
255-
| **Exchange** | `geozarr-staging` |
256-
| **Routing key** | `eopf.items.test` |
257-
258-
**Message format:**
259-
```json
260-
{
261-
"source_url": "https://stac.core.eopf.eodc.eu/collections/sentinel-2-l2a/items/...",
262-
"collection": "sentinel-2-l2a-dp-test"
263-
}
264-
```
207+
**Quick reference:**
208+
- S3: `s3.de.io.cloud.ovh.net` / `esa-zarr-sentinel-explorer-fra`
209+
- Staging collection: `sentinel-2-l2a-dp-test`
210+
- Production collection: `sentinel-2-l2a`
211+
- **Enable debug logs:** `export LOG_LEVEL=DEBUG` (or add to workflow env)
265212

266213
---
267214

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -145,6 +145,7 @@ warn_no_return = true
145145
strict_equality = true
146146
exclude = ["examples/"]
147147

148+
# Relax type checking for test files (structure exists, tests to be added)
148149
[[tool.mypy.overrides]]
149150
module = "tests.*"
150151
disallow_untyped_defs = false

workflows/README.md

Lines changed: 103 additions & 95 deletions
@@ -1,142 +1,150 @@
11
# Workflows
22

3-
Argo Workflows configuration using Kustomize for environment management.
3+
Event-driven Argo Workflows for Sentinel-2 GeoZarr conversion and STAC registration.
44

5-
## Purpose
5+
**Architecture**: RabbitMQ messages → Sensor → WorkflowTemplate (convert → register) → S3 + STAC API
66

7-
Event-driven pipeline orchestration for Sentinel-2 GeoZarr conversion and STAC registration. RabbitMQ messages trigger workflows that run a 2-step DAG: **convert → register**.
7+
---
88

9-
## Structure
9+
## Quick Setup
1010

11-
```
12-
workflows/
13-
├── base/ # Core resources (namespace-agnostic)
14-
│ ├── kustomization.yaml # References all resources
15-
│ ├── workflowtemplate.yaml # 2-step pipeline DAG
16-
│ ├── sensor.yaml # RabbitMQ → Workflow trigger
17-
│ ├── eventsource.yaml # RabbitMQ connection config
18-
│ └── rbac.yaml # ServiceAccount + permissions
19-
└── overlays/
20-
├── staging/
21-
│ └── kustomization.yaml # devseed-staging namespace patches
22-
└── production/
23-
└── kustomization.yaml # devseed namespace patches
24-
```
11+
### 1. Configure kubectl
2512

26-
## Apply to Cluster
13+
Download kubeconfig from [OVH Manager → Kubernetes](https://www.ovh.com/manager/#/public-cloud/pci/projects/bcc5927763514f499be7dff5af781d57/kubernetes/f5f25708-bd15-45b9-864e-602a769a5fcf/service) (**Access and Security** tab).
2714

28-
**Staging (devseed-staging):**
2915
```bash
30-
kubectl apply -k workflows/overlays/staging
16+
mv ~/Downloads/kubeconfig-*.yml .work/kubeconfig
17+
export KUBECONFIG=$(pwd)/.work/kubeconfig
18+
kubectl get nodes # Verify: should list 3-5 nodes
3119
```
3220

33-
**Production (devseed):**
21+
### 2. Create Required Secrets
22+
23+
The pipeline needs three secrets, one each for **event ingestion** (RabbitMQ), **output storage** (S3), and **STAC registration** (API auth).
24+
25+
**RabbitMQ credentials** (receives workflow trigger events):
3426
```bash
35-
kubectl apply -k workflows/overlays/production
27+
# Get password from cluster-managed secret
28+
RABBITMQ_PASS=$(kubectl get secret rabbitmq-password -n core -o jsonpath='{.data.rabbitmq-password}' | base64 -d)
29+
30+
kubectl create secret generic rabbitmq-credentials -n devseed-staging \
31+
--from-literal=username=user \
32+
--from-literal=password="$RABBITMQ_PASS"
3633
```
3734

38-
**Verify deployment:**
35+
**S3 credentials** (writes converted GeoZarr files):
3936
```bash
40-
# Check resources (expected output shows 1 of each)
41-
kubectl get workflowtemplate,sensor,eventsource,sa -n devseed-staging
37+
# Get from OVH Manager → Users & Roles → OpenStack credentials
38+
# https://www.ovh.com/manager/#/public-cloud/pci/projects/bcc5927763514f499be7dff5af781d57/users
4239

43-
# Example output:
44-
# NAME AGE
45-
# workflowtemplate.argoproj.io/geozarr-pipeline 5m
46-
#
47-
# NAME AGE
48-
# sensor.argoproj.io/geozarr-sensor 5m
49-
#
50-
# NAME AGE
51-
# eventsource.argoproj.io/rabbitmq-geozarr 5m
52-
#
53-
# NAME SECRETS AGE
54-
# serviceaccount/operate-workflow-sa 0 5m
55-
56-
# Watch for workflows (should show Running/Succeeded/Failed)
57-
kubectl get wf -n devseed-staging --watch
40+
kubectl create secret generic geozarr-s3-credentials -n devseed-staging \
41+
--from-literal=AWS_ACCESS_KEY_ID="<your-ovh-access-key>" \
42+
--from-literal=AWS_SECRET_ACCESS_KEY="<your-ovh-secret-key>"
5843
```
5944

60-
## Required Secrets
45+
**STAC API token** (registers items, optional if API is public):
46+
```bash
47+
kubectl create secret generic stac-api-token -n devseed-staging \
48+
--from-literal=token="<bearer-token>"
49+
```
6150

62-
The pipeline requires these Kubernetes secrets in the target namespace:
51+
### 3. Deploy Workflows
6352

64-
### 1. `rabbitmq-credentials`
65-
RabbitMQ authentication for EventSource:
53+
```bash
54+
kubectl apply -k workflows/overlays/staging # Staging (devseed-staging)
55+
kubectl apply -k workflows/overlays/production # Production (devseed)
56+
```
6657

58+
**Verify deployment:**
6759
```bash
68-
kubectl create secret generic rabbitmq-credentials \
69-
--from-literal=username=<rabbitmq-user> \
70-
--from-literal=password=<rabbitmq-password> \
71-
-n devseed-staging
60+
kubectl get workflowtemplate,sensor,eventsource,sa -n devseed-staging
61+
# Expected: 1 WorkflowTemplate, 1 Sensor, 1 EventSource, 1 ServiceAccount
7262
```
7363

74-
### 2. `geozarr-s3-credentials`
75-
S3 credentials for GeoZarr output:
64+
---
65+
66+
## Structure
7667

77-
```bash
78-
kubectl create secret generic geozarr-s3-credentials \
79-
--from-literal=AWS_ACCESS_KEY_ID=<access-key> \
80-
--from-literal=AWS_SECRET_ACCESS_KEY=<secret-key> \
81-
-n devseed-staging
8268
```
69+
workflows/
70+
├── base/ # Core resources (namespace-agnostic)
71+
│ ├── workflowtemplate.yaml # 2-step DAG: convert → register
72+
│ ├── sensor.yaml # RabbitMQ trigger
73+
│ ├── eventsource.yaml # RabbitMQ connection
74+
│ ├── rbac.yaml # Permissions
75+
│ └── kustomization.yaml
76+
└── overlays/
77+
├── staging/ # devseed-staging namespace
78+
└── production/ # devseed namespace
79+
```
80+
81+
---
8382

84-
### 3. `stac-api-token` (optional)
85-
Bearer token for STAC API authentication (if required):
83+
## Monitoring
8684

85+
**Watch workflows:**
8786
```bash
88-
kubectl create secret generic stac-api-token \
89-
--from-literal=token=<bearer-token> \
90-
-n devseed-staging
87+
kubectl get wf -n devseed-staging --watch
88+
```
89+
90+
**Example output:**
91+
```
92+
NAME STATUS AGE
93+
geozarr-79jmg Running 5m
94+
geozarr-95rgx Succeeded 9h
95+
geozarr-jflnj Failed 10h
9196
```
9297

93-
## WorkflowTemplate Parameters
98+
---
99+
100+
## Configuration
101+
102+
### S3 Storage
103+
104+
- **Endpoint**: `https://s3.de.io.cloud.ovh.net` (OVH Frankfurt)
105+
- **Bucket**: `esa-zarr-sentinel-explorer-fra`
106+
- **Paths**: `tests-output/` (staging), `geozarr/` (production)
94107

95-
See main [README.md](../README.md) for complete parameter reference.
108+
### Workflow Parameters
96109

97-
| Parameter | Default | Description |
98-
|-----------|---------|-------------|
99-
| `source_url` | - | STAC item URL or direct Zarr URL |
100-
| `register_collection` | sentinel-2-l2a-dp-test | STAC collection ID |
101-
| `stac_api_url` | https://api... | STAC API endpoint |
102-
| `raster_api_url` | https://api... | TiTiler endpoint |
103-
| `s3_output_bucket` | esa-zarr... | S3 output bucket |
104-
| `pipeline_image_version` | fix-unit-tests | Docker image tag |
110+
Key parameters (see [../README.md](../README.md) for full reference):
105111

106-
## Resource Configuration
112+
- `source_url`: STAC item URL or Zarr URL
113+
- `register_collection`: Target STAC collection (default: `sentinel-2-l2a-dp-test`)
114+
- `s3_output_bucket`: Output bucket
115+
- `pipeline_image_version`: Docker image tag
107116

108-
To adjust CPU/memory limits, edit `workflows/base/workflowtemplate.yaml`:
117+
### Resource Tuning
118+
119+
Edit `workflows/base/workflowtemplate.yaml`:
109120

110121
```yaml
111-
- name: convert-geozarr
112-
resources:
113-
requests:
114-
memory: 4Gi # Increase for larger datasets
115-
cpu: '1'
116-
limits:
117-
memory: 8Gi
118-
cpu: '2'
122+
resources:
123+
requests: { memory: 4Gi, cpu: '1' }
124+
limits: { memory: 8Gi, cpu: '2' } # Increase for larger datasets
119125
```
120126
127+
---
128+
121129
## Troubleshooting
122130
123-
**Kustomize build fails:**
131+
**Workflow not triggered:**
124132
```bash
125-
# Validate structure
126-
kubectl kustomize workflows/overlays/staging
133+
kubectl logs -n devseed-staging -l eventsource-name=rabbitmq # Check RabbitMQ connection
134+
kubectl get sensor -n devseed-staging geozarr-trigger -o yaml # Check sensor status
135+
```
127136

128-
# Check for duplicate resources
129-
find workflows -name "*.yaml" -not -path "*/base/*" -not -path "*/overlays/*"
137+
**Workflow fails:**
138+
```bash
139+
kubectl logs -n devseed-staging <workflow-pod-name> # View logs
140+
kubectl get secret -n devseed-staging # Verify secrets exist
130141
```
131142

132-
**Workflow not triggered:**
133-
- Check EventSource connection: `kubectl logs -n devseed-staging -l eventsource-name=rabbitmq`
134-
- Check Sensor status: `kubectl get sensor -n devseed-staging geozarr-trigger -o yaml`
135-
- Verify RabbitMQ port-forward or service access
143+
**Kustomize validation:**
144+
```bash
145+
kubectl kustomize workflows/overlays/staging # Validate YAML
146+
```
136147

137-
**Workflow fails:**
138-
- Check pod logs: `kubectl logs -n devseed-staging <workflow-pod-name>`
139-
- Verify secrets exist: `kubectl get secret -n devseed-staging geozarr-s3-credentials stac-api-token`
140-
- Check RBAC: `kubectl auth can-i create workflows --as=system:serviceaccount:devseed-staging:operate-workflow-sa`
148+
---
141149

142-
For full pipeline documentation, see [../README.md](../README.md).
150+
For complete documentation, see [../README.md](../README.md).
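
The "Verify secrets exist" step from the Troubleshooting section could be scripted roughly as follows. This is a minimal sketch, not part of the commit: the function name `check_secrets` and the `KUBECTL` override variable are hypothetical, and the secret names are taken from the setup instructions in `workflows/README.md`. It treats `stac-api-token` as optional, as the README describes for a public API.

```bash
#!/usr/bin/env bash
# Sketch only: report which pipeline secrets exist in a namespace.
# KUBECTL and check_secrets are illustrative names, not part of the repository.
KUBECTL="${KUBECTL:-kubectl}"

# Secret names as documented in workflows/README.md.
REQUIRED_SECRETS="rabbitmq-credentials geozarr-s3-credentials"
OPTIONAL_SECRETS="stac-api-token"   # only needed when the STAC API requires auth

check_secrets() {
  local namespace="$1"
  local missing=0
  local name

  for name in $REQUIRED_SECRETS; do
    if "$KUBECTL" get secret "$name" -n "$namespace" >/dev/null 2>&1; then
      echo "ok: $name"
    else
      echo "MISSING: $name"
      missing=1
    fi
  done

  for name in $OPTIONAL_SECRETS; do
    if "$KUBECTL" get secret "$name" -n "$namespace" >/dev/null 2>&1; then
      echo "ok: $name"
    else
      echo "not found (optional): $name"
    fi
  done

  # Non-zero exit status when any required secret is absent.
  return "$missing"
}

# Usage (requires cluster access): check_secrets devseed-staging
```

Because the function returns a non-zero status when a required secret is absent, it could also gate a deployment step, for example `check_secrets devseed-staging && kubectl apply -k workflows/overlays/staging`.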
