Add support for distributed SR-SIM #239
// For grouped nodes (distributed systems like SR-SIM), it creates a single sub-topology
// containing all nodes in the group so they can be deployed in the same pod and share
// the network namespace.
// The secondaryNodes map is used to resolve tunnel destinations - if a remote node is a
// secondary, the tunnel should point to its primary's service instead.
func (p *containerlabDefinitionProcessor) processConfigForNodeGroup(
	containerlabConfig *clabernetesutilcontainerlab.Config,
	nodeName string,
	primaryNodeName string,
	group *nodeGroup,
	secondaryNodes map[string]string,
	defaultsYAML []byte,
	removeTopologyPrefix bool,
) error {
Originally, I added my changes inside this function, but the linter failed with:
controllers/topology/definitioncontainerlab.go:423:1: cognitive complexity 34 of func `(*containerlabDefinitionProcessor).processConfigForNodeGroup` is high (> 30) (gocognit)
So I split the functionality across several smaller functions.
I also noticed that some of these functions are similar, so I may do more refactoring along these lines.
yeah, this stuff has gotten a bit out of hand (not you, just in general I mean!) -- I think the general flow in this mr looks chill to me though!
So far, I have tested with all the sr-sim labs in containerlab. Clabverter seems to work with it too. Let me know if you have more corner cases or a different approach in mind, and I can take another look at the code.
carlmontanari left a comment
LGTM, @hellt you wanna do a quick looksie too before we merge since this is a pretty big one?!
func buildGroupNodesList(
	primaryNodeName string,
	group *nodeGroup,
) (groupNodeNames []string, groupNodesSet map[string]struct{}) {
we've got a basic set implementation in util already. may shave some lines and just be more consistent. https://github.com/bayars/clabernetes/blob/853328b8b1d42b14cbca76fa5d714be77e1e299b/util/sets.go#L14
Hey, thank you
I updated the code to use util.StringSet instead of map[string]struct{}.
I also changed buildGroupNodesList to use NewStringSetWithValues(groupNodeNames...) which replaces the manual map creation and loop with a single line.
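For readers following along, here is a minimal, self-contained sketch of what that change amounts to. The `StringSet` type below is a hypothetical stand-in written for this example; the real type in the clabernetes util package may expose different methods. `buildGroupNodesListSketch` and its slice parameter are also assumptions, since the real function takes a `*nodeGroup` whose fields are not shown in this thread.

```go
package main

import (
	"fmt"
	"sort"
)

// StringSet is a hypothetical stand-in for the set type in the util package.
type StringSet map[string]struct{}

// NewStringSetWithValues builds a set from the given values in one call,
// replacing a hand-written "make map + loop" at each call site.
func NewStringSetWithValues(values ...string) StringSet {
	s := make(StringSet, len(values))
	for _, v := range values {
		s[v] = struct{}{}
	}
	return s
}

// Contains reports whether v is a member of the set.
func (s StringSet) Contains(v string) bool {
	_, ok := s[v]
	return ok
}

// buildGroupNodesListSketch (hypothetical) returns the primary followed by the
// sorted secondaries, plus a set of the same names for membership checks.
func buildGroupNodesListSketch(primary string, secondaries []string) ([]string, StringSet) {
	sorted := append([]string(nil), secondaries...) // copy so the caller's slice is untouched
	sort.Strings(sorted)

	names := make([]string, 0, len(sorted)+1)
	names = append(names, primary)
	names = append(names, sorted...)

	return names, NewStringSetWithValues(names...)
}

func main() {
	names, set := buildGroupNodesListSketch("srsim-a", []string{"srsim-iom1", "srsim-b"})
	fmt.Println(names)                           // [srsim-a srsim-b srsim-iom1]
	fmt.Println(set.Contains("srsim-iom1"))      // true
	fmt.Println(set.Contains("external-router")) // false
}
```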
LGTM, thanks for all the work @bayars 🔥
Hey!
First of all, I haven't fully tested this yet. I still need to test with an SR-SIM image and configure the nodes, and I will let you know once that is done. I am opening this MR to discuss my implementation approach.
This MR adds support for deploying distributed chassis-based systems (like Nokia SR-SIM SR-7, SR-14s) that require multiple containers sharing a network namespace via Docker's network-mode: container: directive.
In containerlab (Docker environment), distributed chassis systems like Nokia SR-SIM SR-7 require multiple containers (CPM-A, CPM-B, IOM slots) that share the same Linux network namespace.
The current implementation does not work for these systems, because it breaks the distributed SR-SIM structure:
This MR implements automatic node grouping based on network-mode directives:
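As a rough illustration of the grouping idea (not the code in this MR), the sketch below scans node definitions for a `container:<name>` network mode and groups each such node under the node it points at. The `node` type, the `groupByNetworkMode` function, and the decision to ignore references to unknown nodes are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// node is a hypothetical, trimmed-down view of a containerlab node definition.
type node struct {
	NetworkMode string // e.g. "container:srsim-a"; empty for a standalone node
}

const containerModePrefix = "container:"

// groupByNetworkMode (hypothetical) returns two maps:
//   - groups: primary node name -> sorted names of the nodes sharing its namespace
//   - secondaryNodes: secondary node name -> its primary node name
//
// Chains (a secondary pointing at another secondary) are not resolved here.
func groupByNetworkMode(nodes map[string]node) (map[string][]string, map[string]string) {
	groups := make(map[string][]string)
	secondaryNodes := make(map[string]string)

	for name, n := range nodes {
		if !strings.HasPrefix(n.NetworkMode, containerModePrefix) {
			continue // standalone node, or a different network mode
		}

		primary := strings.TrimPrefix(n.NetworkMode, containerModePrefix)
		if _, known := nodes[primary]; !known || primary == name {
			continue // assumption: skip references to unknown nodes or to itself
		}

		groups[primary] = append(groups[primary], name)
		secondaryNodes[name] = primary
	}

	// Map iteration order is random, so sort for deterministic output.
	for primary := range groups {
		sort.Strings(groups[primary])
	}

	return groups, secondaryNodes
}

func main() {
	nodes := map[string]node{
		"srsim-a":         {},
		"srsim-b":         {NetworkMode: "container:srsim-a"},
		"srsim-iom1":      {NetworkMode: "container:srsim-a"},
		"external-router": {},
	}

	groups, secondaryNodes := groupByNetworkMode(nodes)
	fmt.Println(groups)         // map[srsim-a:[srsim-b srsim-iom1]]
	fmt.Println(secondaryNodes) // map[srsim-b:srsim-a srsim-iom1:srsim-a]
}
```

The second map has the same shape as the `secondaryNodes map[string]string` parameter of `processConfigForNodeGroup`, which maps a secondary to its primary.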
VXLAN Service Creation
Clabernetes uses VXLAN tunnels to connect nodes across different Pods. With this change:
Example:
links:
- endpoints: ["srsim-a:1/1/c1/1", "srsim-b:1/1/c1/1"] # Local link (same group)
- endpoints: ["srsim-iom1:1/1/c2/1", "external-router:e1-1"] # VXLAN tunnel
The tunnel from external-router to srsim-iom1 resolves to srsim-a's VXLAN service since srsim-iom1 is grouped with srsim-a.
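The resolution rule in that example can be sketched as below. This is a simplified illustration rather than the MR's actual code: `resolveTunnelTarget` and `isLocalLink` are hypothetical helper names, and only the secondary-to-primary shape of the `secondaryNodes` map is taken from the `processConfigForNodeGroup` signature.

```go
package main

import "fmt"

// resolveTunnelTarget (hypothetical) returns the node whose VXLAN service a
// tunnel toward remoteNode should use. A secondary node has no service of its
// own in this model, so it resolves to its primary.
func resolveTunnelTarget(remoteNode string, secondaryNodes map[string]string) string {
	if primary, ok := secondaryNodes[remoteNode]; ok {
		return primary
	}

	return remoteNode
}

// isLocalLink (hypothetical) reports whether both endpoints resolve to the same
// primary, i.e. they live in the same pod and need no VXLAN tunnel.
func isLocalLink(nodeA, nodeB string, secondaryNodes map[string]string) bool {
	return resolveTunnelTarget(nodeA, secondaryNodes) == resolveTunnelTarget(nodeB, secondaryNodes)
}

func main() {
	// secondary -> primary, matching the grouping in the example above.
	secondaryNodes := map[string]string{
		"srsim-b":    "srsim-a",
		"srsim-iom1": "srsim-a",
	}

	fmt.Println(resolveTunnelTarget("srsim-iom1", secondaryNodes))            // srsim-a
	fmt.Println(resolveTunnelTarget("external-router", secondaryNodes))       // external-router
	fmt.Println(isLocalLink("srsim-a", "srsim-b", secondaryNodes))            // true
	fmt.Println(isLocalLink("srsim-iom1", "external-router", secondaryNodes)) // false
}
```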
Limitations: