Commit bfcf533

lllamnyp, hiddenmarten, and coderabbitai[bot] authored
Create a design-document for the controller (#181)
# Motivation

I started some "R'n'D" (scare quotes intended) for implementing scale up, scale down, self-healing and so on, and quickly realized that coding the member add, member remove and similar steps is the more trivial part of the undertaking. The difficult part is coming up with a working algorithm that can correctly deduce the cluster's state and execute the necessary actions at the right time.

To better reason about the controller's algorithm now, and to better develop it going forward, I feel it is important to have good documentation of the current design and the intended next steps, so I started by trying to document the current state of the code.

# Results

This document contains a mermaid flowchart that outlines the reconciliation loop. It is better viewed in [rendered form](https://github.com/aenix-io/etcd-operator/blob/docs/design/docs/DESIGN.md).

Going forward, I envision this document to have at least three purposes:

* Let the developers spot flaws and prompt them to open issues.
* Act as a more detailed form of documentation for advanced users.
* Be a blueprint for implementing anything non-trivial.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

- **Documentation**
  - Updated the design document for the `EtcdCluster` custom resources with a detailed flowchart illustrating the reconciliation process and lifecycle management within a Kubernetes environment.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Hidden Marten <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
1 parent 40373b6 commit bfcf533

2 files changed: +85 -0 lines changed
docs/DESIGN.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
# Design

This document describes the interaction between `EtcdCluster` custom resources and other Kubernetes
primitives and gives an overview of the underlying implementation.

## Reconciliation flowchart

```mermaid
flowchart TD
Start(Start) --> A[Ensure service.]
A --> AA{Are there any\nendpoints?}
AA --> |Yes| AAA[Connect to the cluster\nand fetch all statuses.]
AAA --> |Got some response| AAAA{All reachable\nmembers have the\nsame cluster ID?}
AAAA --> |Yes| AAAAA{Is cluster\nin quorum?}
AAAAA --> |Yes| AAAAAA{Are all members \nmanaged by the operator?}
AAAAAA --> |Yes| AAAAAAA["`
Promote any learners.
Ensure ConfigMap with initial cluster matching existing members and cluster state=existing.
Ensure StatefulSet with replicas = max member ordinal + 1
`"]
AAAAAAA --> |OK| AAAAAAAA{Are all\nmembers healthy?}
AAAAAAAA --> |Yes| AAAAAAAAA{Are all STS pods present\nin the member list?}
AAAAAAAAA --> |Yes| AAAAAAAAAA{Is the\nEtcdCluster\nsize equal to the\nStatefulSet\nsize?}
AAAAAAAAAA -->|Yes| AAAAAAAAAAA[Set cluster\nstatus to ready.]
AAAAAAAAAAA --> HappyStop([Stop])

AAAAAAAAAA --> |No, desired\nsize larger| AAAAAAAAAAB[Ensure ConfigMap with\ninitial cluster state existing\nand initial cluster URLs\nequal to current cluster\nplus one member, do\n'member add' API call and\nincrement StatefulSet size.]
AAAAAAAAAAB --> ScaleUpStop([Stop])

AAAAAAAAAA --> |No, desired\nsize smaller| AAAAAAAAAAC[Member remove API\ncall, then decrement\nStatefulSet size\nthen delete PVC.]
AAAAAAAAAAC --> ScaleDownStop([Stop])

AAAAAAAAAA --> |Etcd replicas=0\nSTS replicas=1| AAAAAAAAAAD[Decrement\nSTS to zero]
AAAAAAAAAAD --> ScaleToZeroStop([Stop])

AAAAAAAA --> |No| AAAAAAAAB1[On timeout evict member.]
AAAAAAAAB1 --> AAAAAAAAB2[Delete PVC, ensure ConfigMap with\nmembers + this one and delete pod.]

AAAAAAAAA --> |No| AAAAAAAAB2

AAAAAAA -->|Error| AAAAAAAB([Requeue])

AAAAAA --> |No| AAAAAAB([Not implemented,\nstop.])

AAAAA --> |No| AAAAAB([Quorum Loss Detected:
1. Check for temporary issues:
- Network partitions
- Pod scheduling problems
2. If temporary, wait for recovery
3. If permanent:
- Alert operators
- Document disaster recovery steps
- Consider backup restoration])

AAAA --> |No| AAAAB[Cluster is in\nsplit-brain. Set\nerror status.]
AAAAB --> AAAABStop([Stop])

AAA --> |No members\nreached| AAAB{Is the STS\npresent?}
AAAB --> |Yes| AAABA{"`Does it have the correct pod spec?`"}
AAABA --> |Yes| AAABAA(["`The StatefulSet cannot be ready, as the readiness and liveness probes must be failing. Hope it becomes ready or wait for user intervention.`"])
AAABA --> |No| AAABAB["`Patch the podspec`"]

AAAB --> |No| AAABB(["`Looks like it was deleted with cascade=orphan. Create it again and see what happens`"])

AA --> |No| AAB{Is the STS\npresent?}
AAB --> |Yes| AABA{Does it have the\ncorrect pod spec?}
AABA --> |Yes| AABAA{Is it\nready?}
AABAA --> |Yes| AABAAA{Then it must have\nspec.replicas==0\n Is EtcdCluster\n.spec.replicas==0?}
AABAAA --> |Yes| AABAAAA([Cluster successfully\nscaled to zero, stop.])
AABAAA --> |No| AABAAAB["`
Ensure ConfigMap with initial cluster = new,
initial cluster peers with single member name-0,
increment STS size.
`"]

AABAA --> |No| AABAAB([Stop and wait, either\nit will turn ready soon\nand the next reconcile\nwill move things along,\nor user intervention is\nneeded])

AABA --> |No| AABAB[Patch the podspec]

AAB --> |No| AABB[Create ConfigMap, initial state new\ninitial cluster according to spec.\nreplicas, create StatefulSet.]
```
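
The main branch of the flowchart can also be read as a pure decision function: given what one reconcile pass has observed, pick exactly one next step. The Go sketch below is only an illustration of that reading, not the operator's actual code. The `Observed` struct, the `Action` names and the `Decide` function are all hypothetical, and the side-effecting "promote learners / ensure ConfigMap / ensure StatefulSet" step (with its requeue on error) is deliberately left out.

```go
package main

import "fmt"

// Observed is a hypothetical snapshot of what one reconcile pass has learned.
// None of these field names come from the operator's code base.
type Observed struct {
	HasEndpoints      bool  // the service has at least one endpoint
	MembersReached    int   // members that answered the status call
	SameClusterID     bool  // all reachable members report one cluster ID
	HasQuorum         bool
	AllManaged        bool  // every member is managed by the operator
	AllHealthy        bool
	AllPodsAreMembers bool  // every STS pod appears in the member list
	DesiredReplicas   int32 // EtcdCluster .spec.replicas
	STSReplicas       int32 // StatefulSet .spec.replicas
}

// Action names the single next step chosen for this pass (illustrative names).
type Action string

const (
	BootstrapOrWait    Action = "bootstrap-or-wait"     // "No endpoints" subtree
	RepairStatefulSet  Action = "repair-statefulset"    // "No members reached" subtree
	SetSplitBrainError Action = "set-split-brain-error" // differing cluster IDs
	HandleQuorumLoss   Action = "handle-quorum-loss"
	NotImplemented     Action = "not-implemented" // unmanaged members present
	EvictOnTimeout     Action = "evict-on-timeout"
	RecreateMember     Action = "recreate-member" // delete PVC, fix ConfigMap, delete pod
	ScaleToZero        Action = "scale-to-zero"
	ScaleUp            Action = "scale-up"
	ScaleDown          Action = "scale-down"
	MarkReady          Action = "mark-ready"
)

// Decide walks the checks in the same order as the flowchart, top to bottom.
func Decide(o Observed) Action {
	switch {
	case !o.HasEndpoints:
		return BootstrapOrWait
	case o.MembersReached == 0:
		return RepairStatefulSet
	case !o.SameClusterID:
		return SetSplitBrainError
	case !o.HasQuorum:
		return HandleQuorumLoss
	case !o.AllManaged:
		return NotImplemented
	case !o.AllHealthy:
		return EvictOnTimeout
	case !o.AllPodsAreMembers:
		return RecreateMember
	}
	// Size comparison. The scale-to-zero edge is tested first because
	// desired=0, sts=1 would otherwise match the generic scale-down case.
	switch {
	case o.DesiredReplicas == 0 && o.STSReplicas == 1:
		return ScaleToZero
	case o.DesiredReplicas > o.STSReplicas:
		return ScaleUp
	case o.DesiredReplicas < o.STSReplicas:
		return ScaleDown
	default:
		return MarkReady
	}
}

func main() {
	o := Observed{
		HasEndpoints: true, MembersReached: 3, SameClusterID: true,
		HasQuorum: true, AllManaged: true, AllHealthy: true,
		AllPodsAreMembers: true, DesiredReplicas: 5, STSReplicas: 3,
	}
	fmt.Println(Decide(o))
}
```

Keeping the decision free of API calls, as in this sketch, would let each edge of the flowchart be covered by a table-driven unit test without a running cluster. Whether the real controller should be structured this way is a design choice the document leaves open.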
