Description
Describe the bug
We upgraded Tile38 from version 1.29.1 to 1.33.4 by adding 1.33.4 nodes to the existing cluster and removing the old nodes one at a time, making sure the new nodes were caught up before each removal. To replace the master, we triggered a failover from Sentinel and then removed the old master. After the nodes had been stable and caught up for two hours, we noticed that latency had increased significantly and that the follower nodes were reporting the error “follow: Protocol error: invalid bulk line ending”. We had to bring down the follower nodes to stabilize the cluster. Every time we try to bring the follower nodes back up, we get the same error. If I copy the AOF to a new cluster and add followers there, the error does not occur. Are there any steps we can take to allow following on the existing cluster?
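For context, the wording of the error suggests that the follower's RESP reader received a bulk string whose declared length was not followed by CRLF, meaning the bytes streamed from the leader did not line up with the length prefix. The sketch below illustrates that check under the assumption of standard RESP bulk-string framing (`$<len>\r\n<payload>\r\n`). It is not Tile38's actual parser; `read_bulk` and `ProtocolError` are hypothetical names used only for illustration.

```python
import io


class ProtocolError(Exception):
    """Raised when the byte stream violates RESP framing."""


def read_bulk(stream):
    """Read one RESP bulk string ($<len>\\r\\n<payload>\\r\\n) from a binary stream."""
    header = stream.readline()
    if not header.startswith(b"$") or not header.endswith(b"\r\n"):
        raise ProtocolError("invalid bulk header")
    try:
        length = int(header[1:-2])  # digits between '$' and CRLF
    except ValueError:
        raise ProtocolError("invalid bulk length") from None
    if length < 0:
        return None  # negative length: treated here as the RESP null bulk string ($-1)
    # Read the payload plus the two terminator bytes that must follow it.
    body = stream.read(length + 2)
    if len(body) < length + 2:
        raise ProtocolError("unexpected end of stream")
    if body[-2:] != b"\r\n":
        # The declared length does not match where the payload really ends.
        raise ProtocolError("invalid bulk line ending")
    return body[:-2]


# A well-formed bulk string parses cleanly.
print(read_bulk(io.BytesIO(b"$5\r\nhello\r\n")))  # b'hello'

# A payload one byte longer than its declared length trips the check.
try:
    read_bulk(io.BytesIO(b"$5\r\nhello!\r\n"))
except ProtocolError as err:
    print(err)  # invalid bulk line ending
```

If this reading is right, the follower is failing at a point where a length prefix and the data after it disagree, which may help narrow down where in the replication stream the mismatch occurs.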
To Reproduce
Steps to reproduce the behavior:
- Start with a 1.29.1 cluster
- Add all the 1.33.4 nodes, making sure they’re caught up
- Remove the old 1.29.1 followers
- Fail over the master to a new node and remove the old master node
- Confirm everything is caught up
- 2 hours later, the errors start
Expected behavior
When follower nodes are added, they should connect to the master and catch up.
