Add catalog SPI modules and refactor coral-common to SPI dispatch #590

Open
wmoustafa wants to merge 1 commit into linkedin:master from wmoustafa:wmoustaf/catalog-spi-refactor

@wmoustafa
Contributor

What changes are proposed in this pull request, and why are they necessary?

This PR introduces a 3-module catalog SPI architecture that decouples Hive and Iceberg implementations from the core classpath. Currently, coral-common bundles hive-metastore, hadoop-common, and iceberg-* as api dependencies, which means every consumer transitively pulls all of them. This refactor isolates catalog-specific code behind an SPI boundary.

New modules:

  • coral-catalog-spi — Pure interfaces (CoralCatalog, CoralTable, TableType), the com.linkedin.coral.common.types package, and new SPI contracts (CoralCalciteTableAdapterFactory, CoralCalciteTableAdapterRegistry). Zero Hive/Iceberg dependencies.
  • coral-catalog-hive — Hive catalog implementations (HiveTable, HiveCalciteTableAdapter, HiveCalciteViewAdapter, HiveToCoralTypeConverter, TypeConverter) with HiveCalciteTableAdapterFactory registered via META-INF/services.
  • coral-catalog-iceberg — Iceberg catalog implementations (IcebergTable, IcebergCalciteTableAdapter, IcebergToCoralTypeConverter, IcebergHiveTableConverter) with IcebergCalciteTableAdapterFactory registered via META-INF/services.

coral-common changes:

  • CoralDatabaseSchema.getTable() now uses CoralCalciteTableAdapterRegistry for SPI-based dispatch instead of instanceof checks.
  • 9 implementation classes converted to @Deprecated wrappers delegating to the new modules.
  • Hive/Hadoop/Iceberg dependencies demoted from api to implementation scope.
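
The registry-based dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `SketchTable`, `SketchAdapter`, `SketchAdapterFactory`, and `SketchAdapterRegistry` are hypothetical stand-ins, and the method signatures (including `priority()`) are assumptions based only on the names mentioned in this conversation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.ServiceLoader;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-ins for CoralTable and the Calcite adapter type.
interface SketchTable {
  String name();
}

interface SketchAdapter {
  String describe();
}

// Assumed factory contract: supports()/createAdapter() plus a priority hint.
interface SketchAdapterFactory {
  boolean supports(SketchTable table);

  SketchAdapter createAdapter(SketchTable table);

  default int priority() {
    return 0;
  }
}

final class SketchAdapterRegistry {
  // Factories added programmatically, merged with ServiceLoader results on each lookup.
  private static final List<SketchAdapterFactory> REGISTERED = new CopyOnWriteArrayList<>();

  private SketchAdapterRegistry() {}

  static void register(SketchAdapterFactory factory) {
    REGISTERED.add(factory);
  }

  static SketchAdapter createAdapter(SketchTable table) {
    List<SketchAdapterFactory> candidates = new ArrayList<>(REGISTERED);
    // Picks up any provider listed under META-INF/services for this interface.
    for (SketchAdapterFactory factory : ServiceLoader.load(SketchAdapterFactory.class)) {
      candidates.add(factory);
    }
    // Highest priority first; List.sort is stable, so ties keep insertion order.
    candidates.sort(Comparator.comparingInt(SketchAdapterFactory::priority).reversed());
    for (SketchAdapterFactory factory : candidates) {
      if (factory.supports(table)) {
        return factory.createAdapter(table);
      }
    }
    // No instanceof chain and no null fallback: an unmatched table is an error.
    throw new UnsupportedOperationException("No adapter factory supports table " + table.name());
  }
}
```

In this sketch the caller never names a concrete table class; a new catalog type would only need to ship a factory and a services entry. Whether the real registry caches the ServiceLoader result or breaks priority ties differently is not something this sketch tries to reflect.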

Backward compatibility: All original FQCNs still resolve — interfaces/types kept the same package, and implementation classes have @Deprecated wrappers at the old locations. Consumers relying on transitive api exposure of hive-metastore or iceberg-* from coral-common may need to add explicit dependencies.
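
One way such a wrapper could look is sketched below, assuming the wrapper subclasses the relocated class (the PR text only says the wrappers delegate, so subclassing is an assumption). Both class names are hypothetical; in the real layout the two classes would live in different packages and modules.

```java
// Stand-in for an implementation class that moved to a new module.
class RelocatedTableSketch {
  private final String qualifiedName;

  RelocatedTableSketch(String qualifiedName) {
    this.qualifiedName = qualifiedName;
  }

  String qualifiedName() {
    return qualifiedName;
  }
}

/**
 * Stand-in for the class left at the old location.
 *
 * @deprecated use {@link RelocatedTableSketch}; kept so existing references still compile.
 */
@Deprecated
class LegacyTableSketch extends RelocatedTableSketch {
  LegacyTableSketch(String qualifiedName) {
    // All behavior comes from the relocated class; the wrapper adds nothing.
    super(qualifiedName);
  }
}
```

With this shape, code that constructs the old class keeps compiling (with a deprecation warning), and the resulting instance can be passed anywhere the relocated type is expected.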

Known limitations: ToRelConverter.toSqlNode() still couples the core to Hive's Table type (issue linkedin#575).

How was this patch tested?

  • ./gradlew compileJava — all modules compile successfully
  • ./gradlew :coral-common:test — existing tests pass
  • Verified ServiceLoader discovers both HiveCalciteTableAdapterFactory and IcebergCalciteTableAdapterFactory
  • Confirmed coral-catalog-spi has no Hive/Iceberg on compile classpath
  • Verified deprecated wrapper classes at old FQCNs still resolve and delegate correctly
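
A ServiceLoader discovery check like the one listed above could be automated along these lines. This is only a sketch: `ProviderDiscoverySketch` and `discover` are hypothetical names, and the helper is generic because the actual factory interface is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class ProviderDiscoverySketch {
  private ProviderDiscoverySketch() {}

  /** Returns the class names of every provider ServiceLoader finds for the given service type. */
  static <S> List<String> discover(Class<S> service) {
    List<String> names = new ArrayList<>();
    for (S provider : ServiceLoader.load(service)) {
      names.add(provider.getClass().getName());
    }
    return names;
  }
}
```

With coral-catalog-hive and coral-catalog-iceberg on the classpath, calling `discover` with the factory interface would be expected to list both factory classes; a test could assert on that list instead of relying on a manual check.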

Introduce a 3-module catalog SPI architecture that decouples Hive and
Iceberg implementations from the core classpath:

- coral-catalog-spi: Pure interfaces (CoralCatalog, CoralTable, TableType),
  type system classes, and new SPI contracts (CoralCalciteTableAdapterFactory,
  CoralCalciteTableAdapterRegistry) with zero Hive/Iceberg dependencies.

- coral-catalog-hive: Hive catalog implementations (HiveTable,
  HiveCalciteTableAdapter, HiveCalciteViewAdapter, TypeConverter) with
  SPI factory registration via META-INF/services.

- coral-catalog-iceberg: Iceberg catalog implementations (IcebergTable,
  IcebergCalciteTableAdapter, IcebergHiveTableConverter) with SPI factory
  registration via META-INF/services.

coral-common is refactored to use CoralCalciteTableAdapterRegistry for
SPI-based dispatch instead of instanceof checks. Original implementation
classes are retained as @Deprecated wrappers for backward compatibility.
Hive/Hadoop/Iceberg dependencies are demoted from api to implementation
scope.

Known limitation: ToRelConverter.toSqlNode() still couples the core to
Hive's Table type (issue linkedin#575).
Contributor

@ruolin59 left a comment

This appears to be mostly moving things around and refactoring, but I have a few questions about the module rewiring. It would also be good to add unit tests covering the new items introduced in this PR, including:

  • CoralCalciteTableAdapterRegistryTest: priority ordering, register() merging with ServiceLoader, UnsupportedOperationException on no-match.
  • One smoke test per new factory confirming supports() / createAdapter() return the expected adapter class for a representative CoralTable.
  • One test per deprecated wrapper confirming new (...) produces an instance assignable to the new type and exposes the same public surface.

}

return null;
return CoralCalciteTableAdapterRegistry.createAdapter(coralTable,
Contributor

This method throws UnsupportedOperationException. I'd just like to confirm that this is the new behavior we want (so no catching and returning null, correct?). If so, it may be a good idea to at least update the Javadoc.

Comment thread coral-common/build.gradle
}

api(deps.'hive'.'hive-metastore') {
// New catalog SPI and implementation modules
Contributor

coral-catalog-hive and coral-catalog-iceberg appear to still have api dependencies on hive-metastore and iceberg-core respectively, so consumers would still pull these deps transitively, right? To clarify:

  1. Was the intent to demote coral-common's project deps to implementation, or is keeping them as api a deliberate compatibility choice?
  2. If it's deliberate, which downstream consumers are expected to migrate to depending on coral-catalog-spi directly?
  3. Can the redundant implementation block at lines 21-34 be dropped?
