3 changes: 3 additions & 0 deletions .gitignore
@@ -100,3 +100,6 @@ hs_err_pid*

# macOS
*.DS_Store

# VSCode
.vscode/settings.json
2 changes: 1 addition & 1 deletion gradle/libs.versions.toml
@@ -49,7 +49,7 @@ commons-lang3 = { module = "org.apache.commons:commons-lang3", version = "3.18.0
commons-text = { module = "org.apache.commons:commons-text", version = "1.14.0" }
eclipselink = { module = "org.eclipse.persistence:eclipselink", version = "4.0.7" }
errorprone = { module = "com.google.errorprone:error_prone_core", version = "2.41.0" }
-google-cloud-storage-bom = { module = "com.google.cloud:google-cloud-storage-bom", version = "2.55.0" }
+google-cloud-libraries-bom = { module = "com.google.cloud:libraries-bom", version = "26.64.0" }
Contributor

We will need to change the License file for this change, somewhere like here:

Group: com.google.api.grpc Name: proto-google-cloud-storage-v2 Version: 2.53.0

cc @jbonofre

Contributor

Good point about LICENSE updates. However, using a BOM does not necessarily require LICENSE changes; only real dependencies need to be mentioned. IMHO, that can be done later (we have to double-check dependencies for every release anyway).

Contributor

I'm OK with a followup PR.

Contributor

Just wondering: is libraries-bom updated as often as its upstream artifacts are published?

Author

I believe most of the Cloud SDKs, libraries-bom included, are on a roughly two-week cadence. The advantage of using the BOM (coming from the Beam experience with the Cloud Java SDKs) is that it keeps the various support libraries synchronized across specific SDKs. Otherwise you get version drift in shared components like Protobuf or the gRPC core libraries, which can be really hard to spot.

guava = { module = "com.google.guava:guava", version = "33.4.8-jre" }
h2 = { module = "com.h2database:h2", version = "2.3.232" }
dnsjava = { module = "dnsjava:dnsjava", version = "3.6.3" }
1 change: 1 addition & 0 deletions gradle/projects.main.properties
@@ -34,6 +34,7 @@ polaris-runtime-common=runtime/common
polaris-runtime-test-common=runtime/test-common
polaris-eclipselink=persistence/eclipselink
polaris-relational-jdbc=persistence/relational-jdbc
polaris-google-cloud-spanner=persistence/google-cloud-spanner
polaris-tests=integration-tests
aggregated-license-report=aggregated-license-report
polaris-immutables=tools/immutables
43 changes: 43 additions & 0 deletions persistence/google-cloud-spanner/build.gradle.kts
@@ -0,0 +1,43 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

plugins {
id("polaris-server")
alias(libs.plugins.jandex)
}

dependencies {
implementation(project(":polaris-core"))
implementation(libs.slf4j.api)
implementation(libs.guava)

implementation(platform(libs.google.cloud.libraries.bom))
implementation("com.google.cloud:google-cloud-spanner")

compileOnly(libs.jakarta.annotation.api)
compileOnly(libs.jakarta.enterprise.cdi.api)
compileOnly(libs.jakarta.inject.api)

compileOnly(libs.smallrye.common.annotation) // @Identifier
compileOnly(libs.smallrye.config.core) // @ConfigMapping

testImplementation(libs.mockito.junit.jupiter)
testImplementation(libs.h2)
testImplementation(testFixtures(project(":polaris-core")))
}
Contributor

Should the package really be org.apache.polaris.persistence.relational.spanner? Or org.apache.polaris.persistence.spanner? I thought relational was meant for the JDBC metastore, or RDBMSs more generally.

Author

I don't have any preference (personally I think relational vs. not is just an implementation detail), though if you have to pick, Spanner in 2025 really fits better in the relational model than the NoSQL model.

Contributor

Yeah, so this is what I'm saying: I think the package name is a bit misleading. It is really for relational-jdbc; it's not saying that everything in this package uses a relational model (and, implicitly, that everything outside of it does not).

Author

I think what mostly happened is that when I started this, y'all had a relational package for eclipselink and jdbc, but somewhere along the way you moved things up a level in the project structure. Probably the best way to resolve that would be to merge the PRs and then do a package name refactor, since the refactoring tool could do it all in one go. That also makes it clear what's happening in the commit chain.

Author

Eh, changed my mind and just started moving things now. It's a little more work building the follow-on PRs, but maybe clearer for folks.

@@ -0,0 +1,25 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.polaris.persistence.relational.spanner;

import com.google.cloud.spanner.DatabaseAdminClient;
import java.util.function.Supplier;

public interface DatabaseAdminClientSupplier extends Supplier<DatabaseAdminClient> {}
@@ -0,0 +1,25 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.polaris.persistence.relational.spanner;

import com.google.cloud.spanner.DatabaseClient;
import java.util.function.Supplier;

public interface DatabaseClientSupplier extends Supplier<DatabaseClient> {}
@@ -0,0 +1,37 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.polaris.persistence.relational.spanner;

import io.smallrye.config.ConfigMapping;
import java.util.Optional;

@ConfigMapping(prefix = "polaris.persistence.spanner")
public interface GoogleCloudSpannerConfiguration {
Comment on lines 25 to 26
Contributor

nit: Since we're calling the config spanner, I wonder if we need the GoogleCloud- prefix everywhere? It's making things quite long. For similar cloud-specific types like S3StorageLocation or GcpStorageConfigInfo we are not quite as verbose

Author

No real preference since my editor autocompletes everything for me... It's already implemented as GoogleCloudSpanner* so it would be a fair amount of toil to rename for what amounts to an aesthetic decision though.

Contributor

Sorry, where is it already implemented? Autocompleting or writing long type names is not an issue, but reading them and dealing with line-length constraints can be.

It is an aesthetic decision, but GoogleCloudSpannerDatabaseClientLifecycleManager.java would set a new record for the longest .java filename in the project :)

Author

The full implementation is at https://github.com/byronellis/polaris/tree/spanner-persistence; I just didn't want to drop all of that on y'all in one go.

I'm going to leave the name as-is for now. If someone feels really strongly about it, they can file a PR to change all the call sites later, if that works for you.


public Optional<String> quotaProjectId();

public Optional<String> projectId();

public Optional<String> instanceId();

public Optional<String> databaseId();

public Optional<String> emulatorHost();
}
@@ -0,0 +1,145 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.polaris.persistence.relational.spanner;

import com.google.cloud.spanner.Database;
import com.google.cloud.spanner.DatabaseAdminClient;
import com.google.cloud.spanner.DatabaseId;
import com.google.cloud.spanner.Dialect;
import com.google.cloud.spanner.Spanner;
import com.google.cloud.spanner.SpannerException;
import com.google.common.collect.ImmutableList;
import jakarta.annotation.PostConstruct;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.enterprise.inject.Produces;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.function.Consumer;
import org.apache.polaris.core.context.RealmContext;
import org.apache.polaris.core.persistence.bootstrap.SchemaOptions;
import org.apache.polaris.persistence.relational.spanner.model.Realm;
import org.apache.polaris.persistence.relational.spanner.util.SpannerUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@ApplicationScoped
public class GoogleCloudSpannerDatabaseClientLifecycleManager {

private static Logger LOGGER =
LoggerFactory.getLogger(GoogleCloudSpannerDatabaseClientLifecycleManager.class);

protected final GoogleCloudSpannerConfiguration spannerConfiguration;

public GoogleCloudSpannerDatabaseClientLifecycleManager(
GoogleCloudSpannerConfiguration spannerConfiguration) {
this.spannerConfiguration = spannerConfiguration;
}

protected Spanner spanner;
protected DatabaseId databaseId;

@PostConstruct
protected void init() {
Contributor

just wondering: why not do the init work in the constructor?
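
For illustration only, here is a minimal sketch of what constructor-based initialization could look like. `Config`, `Manager`, and the `resolver` function are hypothetical stand-ins (they are not types from this PR), and the sketch does not settle whether the constructor or `@PostConstruct` is the better place here: normal-scoped CDI beans are accessed through client proxies, and how the container instantiates them is not discussed in this thread.

```java
import java.util.Objects;
import java.util.function.Function;

public final class ConstructorInitExample {

  private ConstructorInitExample() {}

  /** Hypothetical stand-in for GoogleCloudSpannerConfiguration. */
  public interface Config {
    String databaseId();
  }

  /** Hypothetical stand-in for the lifecycle manager class in this diff. */
  public static final class Manager {
    // Assigned exactly once in the constructor, so the field can be final.
    private final String databaseId;

    public Manager(Config config, Function<Config, String> resolver) {
      Objects.requireNonNull(config, "config");
      // The work done in init() above would move here.
      this.databaseId = resolver.apply(config);
    }

    public String databaseId() {
      return databaseId;
    }
  }
}
```

The visible benefit in this sketch is that the derived state becomes `final` and is never observed half-initialized.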

spanner = SpannerUtil.spannerFromConfiguration(spannerConfiguration);
databaseId = SpannerUtil.databaseFromConfiguration(spannerConfiguration);
}

protected List<String> getSpannerDatabaseDdl(SchemaOptions options) {
Contributor

I think we could make this method static.

final InputStream schemaStream;
if (options.schemaFile() != null) {
try {
schemaStream = new FileInputStream(options.schemaFile());
} catch (IOException e) {
throw new IllegalArgumentException("Unable to load file " + options.schemaFile(), e);
}
} else {
if (options.schemaVersion() == null || options.schemaVersion() == 1) {
schemaStream =
getClass().getResourceAsStream("/org/apache/polaris/persistence/spanner/schema-v1.sql");
} else {
throw new IllegalArgumentException("Unknown schema version " + options.schemaVersion());
}
}
try (schemaStream) {
String schema = new String(schemaStream.readAllBytes(), Charset.forName("UTF-8"));
List<String> lines = new ArrayList<>();
for (String s : schema.split("\n")) {
s = s.trim();
if (s.startsWith("--") || s.length() == 0) {
Contributor

Do we also need to check if the line ends with ;? Later we split on that

Author

Added a check for lines only containing ';'

continue;
}
lines.add(s);
Contributor

Suggested change
- if (s.startsWith("--") || s.length() == 0) {
- continue;
- }
- lines.add(s);
+ if (!s.startsWith("--") && s.length() > 0) {
+ lines.add(s);
+ }

Author

I think I'm OK? I wanted to remove lines that are comments or just blank. I added a check for lines containing only ';' in case some monster does that. That said, it's not like we're sending arbitrary SQL through this thing, so we don't need to be super careful.
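
The filtering discussed in this thread can be sketched as a small standalone helper. This is a hedged illustration, not the PR's code: the class name `DdlSplitter` and the method `split` are made up for the example, and instead of checking for lines that contain only ';' it drops empty fragments after the split, which covers the same case. It does not handle semicolons inside string literals or trailing inline comments.

```java
import java.util.ArrayList;
import java.util.List;

public final class DdlSplitter {

  private DdlSplitter() {}

  /** Splits a schema script into individual DDL statements. */
  public static List<String> split(String schema) {
    List<String> lines = new ArrayList<>();
    for (String raw : schema.split("\n")) {
      String s = raw.trim();
      // Drop blank lines and whole-line "--" comments.
      if (s.isEmpty() || s.startsWith("--")) {
        continue;
      }
      lines.add(s);
    }
    List<String> statements = new ArrayList<>();
    // Join the remaining lines, then split on the statement terminator.
    for (String fragment : String.join(" ", lines).split(";")) {
      String statement = fragment.trim();
      // Skip empty fragments, e.g. those left by a line holding only ";".
      if (!statement.isEmpty()) {
        statements.add(statement);
      }
    }
    return statements;
  }
}
```

Because the helper takes the schema text as a `String` and uses no instance state, it can be `static` and tested without a Spanner client.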

}
return List.of(String.join(" ", lines).split(";"));
} catch (IOException e) {
throw new RuntimeException("Unable to retrieve DDL statements", e);
}
}

@Produces
public SchemaInitializer getSchemaInitializer() {
Contributor

Same here. The schema options are only used while bootstrapping, and bootstrapping is done by the Admin tool. It doesn't seem necessary to me to have a SchemaInitializer bean for any dynamic options inside the Polaris server. A normal function that takes schema options should be good enough.

return (options) -> {
List<String> ddlStatements = getSpannerDatabaseDdl(options);
LOGGER.info(
"Attempting to initialize Spanner database DDL with {} statements,",
ddlStatements.size());
DatabaseAdminClient client = spanner.getDatabaseAdminClient();
Database dbInfo =
client.newDatabaseBuilder(databaseId).setDialect(Dialect.GOOGLE_STANDARD_SQL).build();
try {
spanner.getDatabaseAdminClient().updateDatabaseDdl(dbInfo, ddlStatements, null).get();
LOGGER.info("Successfully applied DDL update.");
} catch (InterruptedException | ExecutionException e) {
LOGGER.error("Unable to update Spanner DDL.", e);
throw new RuntimeException(
"Unable to update Spanner DDL. Please disable this option for this database configuration.",
e);
}
};
}

@Produces
public Consumer<RealmContext> getRealmInitializer() {
return (realmContext) -> {
try {
spanner
.getDatabaseClient(databaseId)
.write(ImmutableList.of(Realm.upsert(realmContext.getRealmIdentifier())));
Contributor

I wonder if this should be done under the "bootstrap" call path as opposed to on observing new realm IDs at runtime. The difference would be delegating realm initialization to the "admin" user / admin tool. Cf. #2196

Contributor

Also, RealmContext CDI beans may come and go very frequently at runtime (once per request at least).

Contributor

Agreed with @dimas-b. Realm initialization doesn't happen very often. It only happens when we bootstrap a new realm, https://polaris.apache.org/in-dev/unreleased/admin-tool/#bootstrapping-realms-and-principal-credentials. Producing a bean here doesn't seem necessary to me, as the Polaris server will never use it for realm initialization. Here is the reference code path in the JDBC implementation: https://github.com/polaris-catalog/polaris/blob/main/persistence/relational-jdbc/src/main/java/org/apache/polaris/persistence/relational/jdbc/JdbcMetaStoreManagerFactory.java#L142-L142

Author

Fair enough, moving this to the bootstrap code.

} catch (SpannerException e) {
LOGGER.error("Unable to initialize realm " + realmContext.getRealmIdentifier(), e);
Contributor

Throw a runtime exception instead of logging an error? So that the stack trace will show the complete call chain.
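
A minimal sketch of the suggestion, under stated assumptions: `RealmWriter` is a hypothetical stand-in for the Spanner write call, and the sketch catches `RuntimeException` where the real code catches `SpannerException`. The choice of `IllegalStateException` as the wrapper is also an assumption made for the example.

```java
public final class RealmInitExample {

  private RealmInitExample() {}

  /** Hypothetical stand-in for the Spanner write; not part of the PR. */
  public interface RealmWriter {
    void write(String realmId);
  }

  public static void initializeRealm(RealmWriter writer, String realmId) {
    try {
      writer.write(realmId);
    } catch (RuntimeException e) {
      // Rethrow with the original exception as the cause, so the stack trace
      // shows the complete call chain instead of ending at a log line.
      throw new IllegalStateException("Unable to initialize realm " + realmId, e);
    }
  }
}
```

With this shape the caller decides whether to log, retry, or fail, and the failure is not silently swallowed after an error log.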

}
};
}

@Produces
public DatabaseClientSupplier getDatabaseClientSupplier() {
return () -> spanner.getDatabaseClient(databaseId);
}

@Produces
public DatabaseAdminClientSupplier getDatabaseAdminClientSupplier() {
return () -> spanner.getDatabaseAdminClient();
}
}
@@ -0,0 +1,25 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.polaris.persistence.relational.spanner;

import java.util.function.Consumer;
import org.apache.polaris.core.context.RealmContext;

public interface RealmInitializer extends Consumer<RealmContext> {}
@@ -0,0 +1,25 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.polaris.persistence.relational.spanner;

import java.util.function.Consumer;
import org.apache.polaris.core.persistence.bootstrap.SchemaOptions;

public interface SchemaInitializer extends Consumer<SchemaOptions> {}
@@ -0,0 +1,36 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.polaris.persistence.relational.spanner.model;

import com.google.cloud.spanner.Key;
import com.google.cloud.spanner.Mutation;

public final class Realm {

public static String TABLE_NAME = "Realms";

public static Mutation upsert(String realmId) {
return Mutation.newInsertOrUpdateBuilder(TABLE_NAME).set("RealmId").to(realmId).build();
}

public static Mutation delete(String realmId) {
return Mutation.delete(TABLE_NAME, Key.of(realmId));
}
}