Agent reconnect does not work #39

@krucod3

Description

If a workload that maintains a connection to Ankaios is running and the agent responsible for this workload is restarted (agent only, the workload continues to run), the SDK does not send a new hello to the agent. Because the restarted agent has no information about the previous connection, it assumes that the hello was never sent.

Current Behavior

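After the agent managing a workload that uses the SDK is restarted, the workload's connection to Ankaios no longer works: the agent closes the connection because it never received a new hello.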

Expected Behavior

The SDK should remain usable after the agent restarts and the workload resumes.

Steps to Reproduce

  1. Start a workload that maintains a connection to Ankaios using the Python SDK
  2. Stop the agent responsible for that workload and start the agent again
  3. Try to send requests to Ankaios from the workload (the Ankaios fleet management tutorial provides a good example environment for the test)
  4. Observe that the agent closes the connection as it expects a hello to be sent

Context (Environment)

Ankaios v0.5.0
Python SDK 0.5.0rc4

Logs

Additional Information

Final result

From the PR:

When the agent gets disconnected, the SDK enters a routine in which it repeatedly tries to send the hello message. A successful send means the agent is back. On startup the agent may receive several hello messages, but this is fine: the first one is processed (the rest of the interaction depends on it) and the others are ignored, so communication can continue.
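
To illustrate the idea, here is a minimal, self-contained sketch of the retry routine described above. It does not use the real SDK or agent code: `ToyAgent`, `stop`, `receive`, and `send_hello_until_accepted` are hypothetical names made up for this example, and the agent is only simulated in memory.

```python
import time


class ToyAgent:
    """Hypothetical in-memory stand-in for an agent (not Ankaios code)."""

    def __init__(self):
        self.hello_received = False
        self.ignored_hellos = 0
        self._failing_sends = 0

    def stop(self, failing_sends):
        # Simulate a restart: the next `failing_sends` sends fail, and the
        # agent forgets everything about the previous connection.
        self._failing_sends = failing_sends
        self.hello_received = False
        self.ignored_hellos = 0

    def receive(self, message):
        if self._failing_sends > 0:
            self._failing_sends -= 1
            raise ConnectionError("agent is down")
        if message == "hello":
            if self.hello_received:
                self.ignored_hellos += 1  # duplicate hello: ignored
            else:
                self.hello_received = True  # first hello: taken into account
            return "ok"
        if not self.hello_received:
            # Mirrors the reported bug: a request without a prior hello
            # makes the agent close the connection.
            raise ConnectionError("no hello received, connection closed")
        return "response to " + message


def send_hello_until_accepted(agent, max_attempts=5, delay=0.0):
    """Send hello repeatedly; a successful send means the agent is back.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            agent.receive("hello")
            return attempt
        except ConnectionError:
            time.sleep(delay)  # wait before the next attempt
    raise ConnectionError("agent did not come back")
```

For example, after `agent.stop(failing_sends=2)`, calling `send_hello_until_accepted(agent)` fails twice and succeeds on the third attempt. Any extra hello sent afterwards only increments `ignored_hellos`, and normal requests work again.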

Metadata

Labels

bug (Something isn't working)
