Bulk line ending issue upgrading from tile38 1.29.1 to 1.33.4 #756

@ssajnani

Describe the bug
We upgraded from tile38 1.29.1 to 1.33.4 by adding 1.33.4 nodes to the existing cluster and removing the old nodes one at a time, making sure the new nodes were caught up before each old node was removed. To replace the master, we did a failover from Sentinel and then removed the old master. After two hours of the nodes being stable and caught up, we noticed that latency had increased significantly, and we saw the error “follow: Protocol error: invalid bulk line ending” on the follower nodes. We had to bring the follower nodes down to stabilize the cluster. Every time we try to bring follower nodes back up, we get the same error. If I copy the AOF to a new cluster and add followers there, the error does not occur. Are there any steps we can take to allow following on the existing cluster?
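
For anyone triaging this: the wording of the error suggests a RESP framing problem, where a bulk string declared as `$<len>` is not followed by `\r\n` after exactly `len` payload bytes. The sketch below is not Tile38's code. It is a minimal Python illustration, assuming the follower stream uses standard RESP arrays of bulk strings, of how such a check fails and at which byte offset. The names `RespError` and `parse_commands` are hypothetical.

```python
class RespError(Exception):
    """Framing error with the byte offset where it was detected."""

    def __init__(self, offset, message):
        super().__init__(f"offset {offset}: {message}")
        self.offset = offset
        self.message = message


def _read_line(buf, pos):
    # Read one CRLF-terminated header line starting at pos.
    end = buf.find(b"\r\n", pos)
    if end == -1:
        raise RespError(pos, "missing CRLF after header")
    return buf[pos:end], end + 2


def parse_commands(buf):
    """Parse a byte string of RESP arrays of bulk strings.

    Returns a list of commands, each a list of bytes arguments.
    Raises RespError at the first framing problem.
    """
    commands = []
    pos = 0
    while pos < len(buf):
        header, nxt = _read_line(buf, pos)
        if not header.startswith(b"*"):
            raise RespError(pos, "expected array header '*'")
        count = int(header[1:])
        pos = nxt
        args = []
        for _ in range(count):
            header, nxt = _read_line(buf, pos)
            if not header.startswith(b"$"):
                raise RespError(pos, "expected bulk header '$'")
            length = int(header[1:])
            start, end = nxt, nxt + length
            # The payload must be followed by exactly CRLF.
            if buf[end:end + 2] != b"\r\n":
                raise RespError(end, "invalid bulk line ending")
            args.append(buf[start:end])
            pos = end + 2
        commands.append(args)
    return commands


if __name__ == "__main__":
    # Declared length 5, but the payload "fleetX" is 6 bytes long.
    bad = b"*2\r\n$3\r\nGET\r\n$5\r\nfleetX\r\n"
    try:
        parse_commands(bad)
    except RespError as err:
        print(err)
```

A check like this fails when the declared length and the actual payload disagree, or when the reader starts in the middle of a command. Whether either applies here is not known from the information in this report.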

To Reproduce
Steps to reproduce the behavior:

  1. Have a 1.29.1 cluster
  2. Add the 1.33.4 nodes, making sure they’re caught up
  3. Remove the old 1.29.1 followers
  4. Fail over the master to a new node and remove the old master node
  5. Confirm everything is caught up
  6. Two hours later, the errors start

Expected behavior
When follower nodes are added, they should connect to the master and catch up.

Logs
(Screenshot attachment: IMG_8995)
