Address PR review comments for REST catalog documentation
- Fix description to remove S3 buckets reference (not relevant for this guide)
- Fix docker-compose YAML network configuration to avoid duplication
- Add step-by-step setup instructions for better clarity
- Add troubleshooting guidance for users who don't see expected tables
- Include note about sample data loading requirements
Addresses feedback from PR #4031 review comments
@@ -44,15 +44,15 @@ For local development and testing, you can use a containerized REST catalog setu
You can use various containerized REST catalog implementations such as **[Databricks docker-spark-iceberg](https://github.com/databricks/docker-spark-iceberg/blob/main/docker-compose.yml?ref=blog.min.io)**, which provides a complete Spark + Iceberg + REST catalog environment with docker-compose, making it ideal for testing Iceberg integrations.

**Step 1:** Clone or download the docker-compose setup from the Databricks repository.

**Step 2:** Add ClickHouse as a service to your docker-compose.yml file:

```yaml
clickhouse:
  image: clickhouse/clickhouse-server:main
  container_name: clickhouse
  user: '0:0' # Ensures root permissions
  ports:
    - "8123:8123"
    - "9002:9000"
@@ -68,6 +68,30 @@ clickhouse:
    - CLICKHOUSE_PASSWORD=
```

**Step 3:** Ensure your docker-compose.yml includes the necessary network configuration:

```yaml
networks:
  iceberg_net:
    driver: bridge
```
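
The relationship between Steps 2 and 3 can be sketched as plain data: in a typical docker-compose.yml, each service *references* the networks it joins by name, while each network is *declared* exactly once at the top level — declaring it again inside a service block is the duplication this commit removes. The dict below is an illustrative sketch, not a complete compose file; only the `clickhouse` service and `iceberg_net` network names come from the snippets above.

```python
# Illustrative shape of docker-compose.yml after Steps 2-3, as plain dicts.
# Only "clickhouse" and "iceberg_net" are taken from this guide; the rest
# is a minimal sketch.
compose = {
    "services": {
        "clickhouse": {
            "image": "clickhouse/clickhouse-server:main",
            # a service references networks by name...
            "networks": ["iceberg_net"],
        },
    },
    "networks": {
        # ...but each network is declared once, at the top level
        "iceberg_net": {"driver": "bridge"},
    },
}

# Sanity check: every network a service joins has a top-level declaration.
for name, service in compose["services"].items():
    for net in service.get("networks", []):
        assert net in compose["networks"], f"{name} joins undeclared network {net!r}"
```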

**Step 4:** Start the entire stack:

```bash
docker-compose up -d
```

**Step 5:** Wait for all services to be ready. You can check the logs:

```bash
docker-compose logs -f
```
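
Rather than watching the logs by hand, readiness can also be polled programmatically. The helper below is a generic sketch; pointing it at `http://localhost:8123/ping` assumes the port mapping from Step 2 and relies on ClickHouse's HTTP `/ping` endpoint, which answers with HTTP 200 once the server is up.

```python
import time
import urllib.error
import urllib.request

def wait_for_http(url: str, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns HTTP 200, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not accepting connections yet; retry shortly
        time.sleep(interval)
    return False
```

For the stack above, `wait_for_http("http://localhost:8123/ping")` should return `True` once the `clickhouse` service is accepting connections.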

:::note
The REST catalog setup requires that sample data be loaded into the Iceberg tables first. Make sure the Spark environment has created and populated the tables before attempting to query them through ClickHouse. The availability of tables depends on the specific docker-compose setup and sample data loading scripts.
:::

### Connecting to Local REST Catalog {#connecting-to-local-rest-catalog}

Connect to your ClickHouse container:
@@ -97,13 +121,27 @@ USE demo;
SHOW TABLES;
```

If your setup includes sample data (such as the taxi dataset), you should see tables like:

```sql title="Response"
┌─name──────────┐
│ default.taxis │
└───────────────┘
```
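
ClickHouse's HTTP interface (the 8123 port mapped in Step 2) also accepts a query as the POST body, which is convenient for scripting this check. A minimal sketch, assuming the stack above is running with the default user and the empty password set in the compose file:

```python
import urllib.request

def run_query(base_url: str, sql: str) -> list[str]:
    """POST a SQL statement to a ClickHouse HTTP endpoint and return the
    response body as a list of lines (one per result row)."""
    req = urllib.request.Request(base_url, data=sql.encode("utf-8"))
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8").splitlines()

# Hypothetical usage against the local stack (not run here); note that the
# HTTP interface is stateless, so qualify the database instead of using USE:
# run_query("http://localhost:8123/", "SHOW TABLES FROM demo")
```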

:::note
If you don't see any tables, this usually means:
1. The Spark environment hasn't created the sample tables yet
2. The REST catalog service isn't fully initialized
3. The sample data loading process hasn't completed

You can check the Spark logs to see the table creation progress:
```bash
docker-compose logs spark
```
:::

To query a table (if available):

```sql
SELECT count(*) FROM `default.taxis`;
```
@@ -190,4 +228,4 @@ Then load the data from your REST catalog table via an `INSERT INTO SELECT`: