Description
If a workload that maintains a connection to Ankaios is running and the agent responsible for this workload is restarted (agent only, the workload continues to run), the SDK does not send a new hello to the agent. Because the restarted agent has no information about the previous connection, it assumes that the hello was never sent.
Current Behavior
After the agent managing a workload is restarted, the workload can no longer communicate with Ankaios through the SDK.
Expected Behavior
The SDK should remain usable after the agent has been restarted and the workload has been resumed.
Steps to Reproduce
- Start a workload that maintains a connection to Ankaios using the Python SDK
- Stop the agent responsible for that workload and start the agent again
- Try to send requests to Ankaios from the workload (the Ankaios fleet management tutorial provides a good example environment for the test)
- Observe that the agent closes the connection as it expects a hello to be sent
Context (Environment)
Ankaios v0.5.0
Python SDK 0.5.0rc4
Logs
Additional Information
Final result
From the PR:
When the agent gets disconnected, the sdk enters a routine where it tries to send the hello message over and over. Succeeding in sending it means the agent is back. The agent will receive a number of hello messages when starting, but this is ok because the first one will be taken into account (and much needed by the whole interaction) and the rest of them will be ignored, thus the communication can continue.
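The retry routine described in the PR can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual implementation: `resend_hello_until_accepted`, `FakeAgent`, and all parameter names are hypothetical, and the real transport between the workload and the agent is replaced by a plain callable that raises `OSError` while the agent is unreachable.

```python
import time
from typing import Callable, Optional


def resend_hello_until_accepted(
    send_hello: Callable[[], None],
    retry_interval_s: float = 1.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Keep sending the hello until one send succeeds.

    Hypothetical helper: a successful send is taken as the sign that the
    agent is back. Returns the number of attempts that were needed.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            send_hello()
            return attempts
        except OSError:
            # Agent still unreachable: wait and try again.
            sleep(retry_interval_s)
    raise ConnectionError(f"agent did not come back after {attempts} attempts")


class FakeAgent:
    """Stand-in for a restarted agent (assumption, for illustration only).

    It is unreachable for the first `fail_first` sends, then accepts the
    first hello and ignores any further ones.
    """

    def __init__(self, fail_first: int) -> None:
        self.remaining_failures = fail_first
        self.hello_count = 0
        self.session_open = False

    def receive_hello(self) -> None:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise BrokenPipeError("agent not reachable")
        self.hello_count += 1
        if not self.session_open:
            # Only the first hello opens the session.
            self.session_open = True
        # Later hellos change nothing, so communication can continue.


if __name__ == "__main__":
    agent = FakeAgent(fail_first=3)
    needed = resend_hello_until_accepted(
        agent.receive_hello, retry_interval_s=0.0, sleep=lambda _s: None
    )
    print(f"hello accepted after {needed} attempts")
```

Injecting `send_hello` and `sleep` as callables keeps the sketch independent of any particular transport and makes the loop easy to exercise without real waiting.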