| 1 | +# PIP-433: Optimize the conflicts of the replication and automatic creation mechanisms, including the automatic creation of topics and schemas |
| 2 | + |
| 3 | +# Background knowledge |
| 4 | + |
| 5 | +#### Topic auto-creation by REST API if you have enabled Geo Replication. |
| 6 | + |
| 7 | +If namespace-level Geo-Replication is already enabled, the source broker copies REST requests that create a partitioned topic to the remote cluster. |
| 8 | + |
| 9 | +When you enable topic-level Geo-Replication for a topic whose namespace does not have namespace-level Geo-Replication enabled, the source broker does the following: |
| 10 | +- The source broker checks whether the partitioned topic exists on the remote cluster. |
| 11 | +- If it exists, the source broker compares the partition counts of the two clusters. It enables topic-level replication if the counts match; otherwise, you get a bad request error. |
| 12 | +- If it does not exist, the source broker automatically creates the partitioned topic on the remote cluster with the same partition count. |
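The topic-level check described above can be summarized as a small decision function. This is a minimal Python sketch of the described behavior, not broker code: the function name `plan_topic_level_replication`, the returned action strings, and the use of `None` to mean "the topic does not exist on the remote cluster" are hypothetical choices made for illustration.

```python
from typing import Optional


def plan_topic_level_replication(local_partitions: int,
                                 remote_partitions: Optional[int]) -> str:
    """Model the source broker's decision when topic-level replication is enabled.

    remote_partitions is None when the partitioned topic does not exist on
    the remote cluster (a convention assumed for this sketch).
    """
    if remote_partitions is None:
        # Remote topic is missing: it is created with the same partition count.
        return f"create-remote-topic:{local_partitions}"
    if remote_partitions == local_partitions:
        # Partition counts match: topic-level replication can be enabled.
        return "enable-replication"
    # Partition counts differ: the request is rejected.
    return "bad-request"
```

For example, a local topic with 2 partitions and a remote topic with 4 partitions yields `"bad-request"`, while a missing remote topic yields a creation action with 2 partitions.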
| 13 | + |
| 14 | +#### Topic auto-creation by clients if you have enabled Geo Replication. |
| 15 | +- `Client of source cluster`: start a consumer/producer for a topic |
| 16 | +- `Client of source cluster`: Get partitions for topic |
| 17 | + - At this step, the client will try to get the partitioned metadata of the topic. If the topic does not exist, the broker will create the partitioned metadata automatically. |
| 18 | +- `Client of source cluster`: Start internal consumers/producers for each partition |
| 19 | + - At this step, the client will try to connect to the partition. If the partition does not exist, the broker will create the partitioned metadata automatically. |
| 20 | +- `Source broker`: starts the geo replicator when a partition is loaded. |
| 21 | + - The geo replicator maintains an internal producer for the topic on the remote cluster. |
| 22 | + - The internal producer is a single-partition producer; it will not trigger a partitioned metadata auto-creation. |
| 23 | + - When the internal producer starts, it verifies that the target topic on the remote cluster is a non-partitioned topic or a partition. |
| 24 | + - **(Highlight)** Otherwise, it prints error logs and stops. |
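The check performed by the replicator's internal producer can be modeled as follows. This is a simplified Python sketch under stated assumptions: the remote cluster is represented by a plain dictionary mapping partitioned topic names to partition counts, and the helper names `partition_index` and `replicator_producer_can_start` are hypothetical. It only models the rule stated above (the target must be a non-partitioned topic or a partition), not the real broker lookup.

```python
import re
from typing import Dict, Optional

# Partition names in this document follow the "<topic>-partition-<index>" pattern.
_PARTITION_SUFFIX = re.compile(r"-partition-(\d+)$")


def partition_index(topic: str) -> Optional[int]:
    """Return the partition index encoded in the topic name, or None."""
    match = _PARTITION_SUFFIX.search(topic)
    return int(match.group(1)) if match else None


def replicator_producer_can_start(target_topic: str,
                                  remote_partitioned_topics: Dict[str, int]) -> bool:
    """Return False when the target name is itself a partitioned topic remotely.

    remote_partitioned_topics maps partitioned topic name -> partition count
    and stands in for the remote cluster's partitioned metadata (assumption).
    """
    # A single-partition producer cannot attach to a partitioned topic;
    # in that case the replicator prints error logs and stops.
    return target_topic not in remote_partitioned_topics
```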
| 25 | + |
| 26 | +#### Schema replication if you have enabled Geo Replication. |
| 27 | +The internal producer of the geo replicator starts with an `auto-produce` schema and copies each new schema it reads from the source cluster to the remote cluster. It gets stuck once a schema is incompatible with the remote cluster. |
| 28 | + |
| 29 | +# Motivation |
| 30 | + |
| 31 | +#### Issue 1: conflicting topic creation when Geo-Replication is enabled |
| 32 | +**Steps to reproduce the issue** |
| 33 | +- Configurations of both the source cluster and the remote cluster |
| 34 | + - `allowAutoTopicCreation`: `true` |
| 35 | + - `allowAutoTopicCreationType`: `partitioned` |
| 36 | + - `defaultNumPartitions`: `2` |
| 37 | +- The namespace `public/default` exists, but you have not enabled Geo-Replication for the namespace yet. |
| 38 | +- Start a producer for a topic `persistent://public/default/topic` on the source cluster. |
| 39 | + - This triggers the creation of a partitioned topic with `2` partitions. |
| 40 | +- Enable namespace-level Geo-Replication for the namespace `public/default`. |
| 41 | +- The geo replicator starts for the existing partition `public/default/topic-partition-0`. |
| 42 | + - Without [PIP-414: Enforce topic consistency check](https://github.com/apache/pulsar/pull/24213), the geo replicator triggers the creation of a non-partitioned topic named `public/default/topic-partition-0`. |
| 43 | + - With [PIP-414: Enforce topic consistency check](https://github.com/apache/pulsar/pull/24213), the geo replicator gets a denied error. |
| 44 | +- However, the user wants to allow the replicator to copy messages to the remote cluster. |
| 45 | + |
| 46 | +#### Issue 2: Replication is stuck because the remote side does not allow schema updates |
| 47 | +**Steps to reproduce the issue** |
| 48 | +- The topic `public/default/topic` has Geo-Replication enabled. |
| 49 | +- Users control the topic schema manually and do not allow consumers/producers to auto-update the schema on either cluster. |
| 50 | +- The internal producer is stuck after the user sets a new schema on the source cluster. |
| 51 | +- However, the user wants to allow the replicator to copy the schema to the remote cluster. |
| 52 | + |
| 53 | +# Goals |
| 54 | +1. Add an option that always allows the replicator to register compatible schemas, even if users set `set-is-allow-auto-update-schema` to false. |
| 55 | +2. For auto-creation triggered by replication, add an admin API client to the replicator and use it to create topics, instead of the original solution that triggers auto-creation when the replicator's internal producer starts. |
| 56 | +3. Check compatibility between the two clusters when enabling namespace-level replication, which includes the following: |
| 57 | + - Existing topics with the same name on both clusters should have the same number of partitions, including `__change_events`. |
| 58 | + - Auto-creation policies of both clusters should be the same at both the broker level and the namespace level. |
| 59 | + |
| 60 | +# Detailed Design |
| 61 | + |
| 62 | +### Implementation overview |
| 63 | + |
| 64 | +**Regarding goal 2: uses admin API client to trigger topic creation on the remote side** |
| 65 | +- Replicators will use an admin API client in addition to the Pulsar client, which was previously the only client they used. |
| 66 | + - Do not trigger a topic auto-creation if the value of the configuration `createTopicToRemoteClusterForReplication` is `false`, which keeps the previous behavior. For more details, see [PIP-370: configurable remote topic creation in geo-replication](https://github.com/apache/pulsar/pull/23124). |
| 67 | + - Create a topic if it does not exist on the remote side, ignoring the auto-creation policies that were defined at the broker level and namespace level. |
| 68 | + - Difference from the previous behavior: previously, producer creation events triggered topic auto-creation. |
| 69 | + - Print error logs if the partition counts of the source and remote clusters differ, which keeps the previous behavior. |
| 70 | +- Remove the mechanisms that copy REST API requests creating a partitioned topic to the remote cluster, listed below. The check logic will not be removed. |
| 71 | + - Broker replicates the topic creation REST API request if the namespace-level replication is enabled. |
| 72 | + - Broker triggers a topic creation request to the remote cluster when enabling a topic-level Geo Replication. |
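The replicator-side flow for goal 2 can be sketched as below. This is an illustrative Python model, not the proposed Java implementation: the function `sync_remote_topic`, its return values, and the dictionary standing in for the remote cluster (as it would be seen through an admin API client) are hypothetical. The order of the checks is also an assumption, since the text above does not fix it.

```python
from typing import Dict, List


def sync_remote_topic(topic: str,
                      local_partitions: int,
                      remote_topics: Dict[str, int],
                      create_topic_to_remote: bool,
                      error_log: List[str]) -> str:
    """Model how a replicator could prepare the remote topic before replicating.

    remote_topics maps topic name -> partition count and stands in for the
    remote cluster. create_topic_to_remote mirrors the configuration
    createTopicToRemoteClusterForReplication.
    """
    remote_partitions = remote_topics.get(topic)
    if remote_partitions is None:
        if not create_topic_to_remote:
            # Remote creation is disabled: keep the previous behavior.
            return "skipped"
        # Create the topic explicitly, regardless of auto-creation policies.
        remote_topics[topic] = local_partitions
        return "created"
    if remote_partitions != local_partitions:
        # Partition counts differ: only an error is logged.
        error_log.append(
            f"partition mismatch for {topic}: "
            f"local={local_partitions}, remote={remote_partitions}"
        )
        return "mismatch"
    return "ok"
```

Note that the broker-level and namespace-level auto-creation settings do not appear as inputs at all, which reflects the point that they no longer influence replication.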
| 73 | + |
| 74 | +We skip the implementation overview for goals 1 and 3, since they are simple and clear enough. |
| 75 | + |
| 76 | +### Public API |
| 77 | + |
| 78 | +#### Regarding Goal 1: "always allow the replicator to register schemas if compatible" |
| 79 | + |
| 80 | +**The original design of `pulsar-admin namespaces set-schema-autoupdate-strategy`** |
| 81 | + |
| 82 | +``` |
| 83 | +pulsar-admin namespaces set-schema-autoupdate-strategy |
| 84 | + -c, --compatibility=<strategyParam> |
| 85 | + Compatibility level required for new schemas created |
| 86 | + via a Producer. Possible values (Full, Backward, |
| 87 | + Forward). |
| 88 | + -d, --disabled Disable automatic schema updates |
| 89 | +``` |
| 90 | + |
| 91 | +Add a new parameter `--enable-for-replicator`, which always allows the replicator to register a new schema if it is compatible. The default value is `true`. |
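The intended effect of `--enable-for-replicator` can be expressed as a small predicate. This is a hedged Python sketch of the rule described above; the function `may_register_schema` and its parameter names are hypothetical, and the compatibility result is passed in as a boolean rather than computed.

```python
def may_register_schema(is_replicator: bool,
                        schema_compatible: bool,
                        allow_auto_update_schema: bool,
                        enable_for_replicator: bool = True) -> bool:
    """Decide whether a new schema may be registered on the remote cluster.

    enable_for_replicator models the proposed --enable-for-replicator
    parameter, whose default value is true.
    """
    if not schema_compatible:
        # Incompatible schemas are never registered.
        return False
    if allow_auto_update_schema:
        # Automatic schema updates are allowed for every producer.
        return True
    # Auto-update is disabled: only the replicator may still register,
    # and only when the new option is enabled.
    return is_replicator and enable_for_replicator
```

With auto-update disabled, a regular producer is still rejected, while the replicator succeeds as long as the schema is compatible and the option keeps its default.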
| 92 | + |
| 93 | + |
| 94 | +#### Regarding Goal 2: "uses admin API client to trigger topic creation on the remote side" |
| 95 | + |
| 96 | +**`pulsar-admin topics create-partitioned-topic <topic name>`** |
| 97 | +- Previous behavior: |
| 98 | + - It copies the creation request to the remote cluster if the topic does not exist on the remote cluster. |
| 99 | + - It creates a partitioned topic on the local cluster. |
| 100 | +- Behavior with the proposal: |
| 101 | + - It creates a partitioned topic on the local cluster. |
| 102 | + |
| 103 | +**`pulsar-admin topics set-replication-clusters -c <clusters> <topic name>`** |
| 104 | +- Previous behavior: |
| 105 | + - It confirms that the partitioned topics on the two clusters have the same number of partitions. |
| 106 | + - It copies the creation request to the remote cluster if the topic does not exist on the remote cluster. |
| 107 | + - It sets the policy. |
| 108 | +- Behavior with the proposal: |
| 109 | + - It confirms that the partitioned topics on the two clusters have the same number of partitions. |
| 110 | + - It sets the policy. |
| 111 | + |
| 112 | +#### Regarding Goal 3: "checks compatibility between two clusters when enabling namespace-level replication" |
| 113 | + |
| 114 | +Brokers will perform the following additional checks when you call `pulsar-admin namespaces set-clusters -c <clusters> <namespace>`: |
| 115 | +- Auto-creation policies must be the same at both the broker level and the namespace level. |
| 116 | +- Existing topics that have the same name on both clusters must have the same number of partitions, including `__change_events`. |
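The two checks above can be sketched as a single validation routine. This is a minimal Python illustration, assuming each cluster's auto-creation policy is available as a dictionary (using the setting names from the reproduction steps of Issue 1) and each cluster's topics as a mapping from topic name to partition count. The function name `check_namespace_replication_compat` and the message formats are hypothetical.

```python
from typing import Dict, List


def check_namespace_replication_compat(local_policy: Dict[str, object],
                                       remote_policy: Dict[str, object],
                                       local_topics: Dict[str, int],
                                       remote_topics: Dict[str, int]) -> List[str]:
    """Return a list of problems; an empty list means the clusters are compatible."""
    problems: List[str] = []
    if local_policy != remote_policy:
        # Auto-creation policies must be identical on both clusters.
        problems.append("auto-creation policies differ")
    # Only topics that exist on both clusters are compared.
    for name in sorted(local_topics.keys() & remote_topics.keys()):
        if local_topics[name] != remote_topics[name]:
            problems.append(
                f"partition mismatch for {name}: "
                f"local={local_topics[name]}, remote={remote_topics[name]}"
            )
    return problems
```

A caller would reject the `set-clusters` request when the returned list is non-empty and report the collected problems to the user.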
| 117 | + |
| 118 | +### Configuration |
| 119 | + |
| 120 | +The following configurations will no longer limit the behavior of replication, since replicators now use an admin API client to trigger topic creation. |
| 121 | +- `broker.conf -> allowAutoTopicCreation` |
| 122 | +- `broker.conf -> isAllowAutoUpdateSchemaEnabled` |
| 123 | +- `namespace level policy: auto-topic-creation` |
| 124 | + |
| 125 | +### Metrics |
| 126 | + |
| 127 | +Nothing. |
| 128 | + |
| 129 | +# Monitoring |
| 130 | +Nothing. |
| 131 | + |
| 132 | +# Security Considerations |
| 133 | +Nothing. |
| 134 | + |
| 135 | +# Backward & Forward Compatibility |
| 136 | + |
| 137 | +There's nothing that needs to be done. |
| 138 | + |
| 139 | +## Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations |
| 140 | +Nothing. |
| 141 | + |
| 142 | +# Alternatives |
| 143 | +Nothing. |
| 144 | + |
| 145 | +# Links |
| 146 | + |
| 147 | +<!-- |
| 148 | +Updated afterwards |
| 149 | +--> |
| 150 | +* Mailing List discussion thread: https://lists.apache.org/thread/p16gwhfx6rkxdp8dm9pckn43o5875o1s |
| 151 | +* Mailing List voting thread: https://lists.apache.org/thread/3nwsbqlkgorswr1oynjwmcz6blkkl5vm |