
small memory leak for new connections/subscriptions #732

@oliver-sanders

Description


A small amount of memory (~1kb) is allocated with each new connection or subscription (haven't ascertained which yet) which doesn't appear to be released when the connection / subscription terminates.

The amount of memory is very small, so it is unlikely to be noticed in real-world situations. However, if for some reason a client were to time out and reconnect with a high frequency, this can lead to a slow memory leak (at one reconnection every few seconds, ~1kb per cycle works out to roughly 1MB per hour).

Important

Recommend investigating with cylc/cylc-flow#6985 merged in.

Steps To Reproduce

Configure a frequent websocket ping with a completely unreasonable timeout:

c.ServerApp.websocket_ping_interval = 3
c.ServerApp.websocket_ping_timeout = 0.000001

You will need to hack jupyter_server to allow float values for the ping timeout. Here's the patch (I will raise it as a PR to jupyter_server in due course):

diff --git a/jupyter_server/serverapp.py b/jupyter_server/serverapp.py
index 1afbef4d0..53ecc04a7 100644
--- a/jupyter_server/serverapp.py
+++ b/jupyter_server/serverapp.py
@@ -1649,7 +1649,7 @@ class ServerApp(JupyterApp):
             return "jupyter_server.gateway.connections.GatewayWebSocketConnection"
         return ZMQChannelsWebsocketConnection
 
-    websocket_ping_interval = Integer(
+    websocket_ping_interval = Float(
         config=True,
         help="""
             Configure the websocket ping interval in seconds.
@@ -1666,12 +1666,16 @@ class ServerApp(JupyterApp):
             the connection will be closed from the server side.
         """,
     )
-    websocket_ping_timeout = Integer(
+    websocket_ping_timeout = Float(
         config=True,
         help="""
             Configure the websocket ping timeout in seconds.
 
-            See ``websocket_ping_interval`` for details.
+            The default is the same as the ping interval. Set to ``0`` to turn
+            the ping timeout off.
+
+            Note that the websocket ping timeout requires the ping interval to
+            be set, see ``websocket_ping_interval`` for details.
         """,
     )

This will cause the client to reconnect every ~3 seconds.

# launch the server
cylc gui --new

# connect the client in a new browser tab

# wait for the client to connect, then attach your profiler of choice
mprof attach <pid>

# wait for a few connect/disconnect cycles to pass, then collect your results
mprof plot <pid>
[Image: mprof memory profile plot]
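
If mprof isn't available, a rough alternative is to poll the server's resident set size yourself. The sketch below is not from the issue; it assumes the third-party psutil package and a hypothetical rss_log.py script name:

# rss_log.py -- crude stand-in for mprof: poll a process's RSS once a second
# so the long-term trend can be eyeballed, or redirected to a file and plotted.
# Usage: python rss_log.py <pid>
import sys
import time

import psutil  # third-party dependency (assumed to be installed)


def main(pid: int, interval: float = 1.0) -> None:
    proc = psutil.Process(pid)
    start = time.monotonic()
    while True:
        rss_mib = proc.memory_info().rss / 1024 ** 2
        print(f'{time.monotonic() - start:8.1f}s  {rss_mib:8.2f} MiB', flush=True)
        time.sleep(interval)


if __name__ == '__main__':
    main(int(sys.argv[1]))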

Profiling

Because the amount of memory leaked is so small, it is quite hard to trace.

The best approach I have come up with so far is to compare memory snapshots, the diff of which should contain the answer.

There are two tools I'm aware of which can do this (a minimal standalone sketch of both follows this list):

  • tracemalloc.take_snapshot().compare_to(previous, 'lineno')
  • pympler.tracker.SummaryTracker().print_diff()
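
For reference, here's a minimal standalone sketch of both (not from the issue; workload is a hypothetical placeholder for whatever you want to measure, and pympler is a third-party install):

import tracemalloc

from pympler import tracker  # third-party (assumed: pip install pympler)


def workload():
    # hypothetical placeholder for the code under investigation,
    # e.g. one connect / subscribe / disconnect cycle
    return [object() for _ in range(1000)]


# tracemalloc: compare two snapshots, grouped by source line
tracemalloc.start()
before = tracemalloc.take_snapshot()
leaked = workload()
after = tracemalloc.take_snapshot()
for stat in after.compare_to(before, 'lineno')[:10]:
    print(stat)

# pympler: print a summary diff of live objects since the tracker was created
tr = tracker.SummaryTracker()
leaked = workload()
tr.print_diff()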

Here's a patch which uses tracemalloc:

diff --git a/cylc/uiserver/graphql/tornado_ws.py b/cylc/uiserver/graphql/tornado_ws.py
index fd58032..de24895 100644
--- a/cylc/uiserver/graphql/tornado_ws.py
+++ b/cylc/uiserver/graphql/tornado_ws.py
@@ -23,6 +23,7 @@
 #
 # It has been evolved to suit and ported to graphql-core v3.
 
+import tracemalloc
 import asyncio
 from asyncio.queues import QueueEmpty
 from contextlib import suppress
@@ -46,6 +47,8 @@ from cylc.flow.network.graphql_subscribe import subscribe
 from cylc.uiserver.authorise import AuthorizationMiddleware
 from cylc.uiserver.schema import SUB_RESOLVER_MAPPING
 
+tracemalloc.start()
+SNAP = tracemalloc.take_snapshot()
 
 NO_MSG_DELAY = 1.0
 
@@ -166,7 +169,7 @@ class TornadoSubscriptionServer:
     ):
         self.schema = schema
         self.loop = loop
-        self.middleware = middleware
+        self.middleware = middleware or []
         self.execution_context_class = execution_context_class
         self.auth = auth
 
@@ -310,6 +313,25 @@ class TornadoSubscriptionServer:
     async def on_start(self, connection_context, op_id, params):
         # Attempt to unsubscribe first in case we already have a subscription
         # with this id.
+        global SNAP
+        snapshot = tracemalloc.take_snapshot()
+        cmp = [
+            s
+            for s in snapshot.compare_to(SNAP, 'lineno')
+            if 'cylc/' in str(s)
+            # or 'tornado/' in str(s)
+            # or 'graphql/' in str(s)
+        ]
+        print()
+        print('+/-', sum(stat.size_diff for stat in cmp))
+        # with open('log', 'a') as log_file:
+        #     log_file.write(str(sum(stat.size_diff for stat in cmp)) + '\n')
+        #     log_file.flush()
+        for stat in sorted(cmp, key=lambda x: x.size_diff, reverse=True):
+            if stat.size_diff != 0:
+                print(f'  {stat}')
+        print()
+        SNAP = snapshot
         await connection_context.unsubscribe(op_id)
 
         params['kwargs']['root_value'] = op_id

This diff filters for memory allocated from within the Cylc codebase.

Note that memory diffs can be a bit jumpy; sometimes memory doesn't get deallocated for a while, but it does eventually.
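
Some of that jumpiness may simply be Python's cyclic garbage collector running at its own pace; forcing a collection before each snapshot (an assumption, not something the patch above does) can steady the numbers:

import gc
import tracemalloc

tracemalloc.start()
# ... run the workload being measured ...
gc.collect()  # collect unreachable reference cycles so they aren't counted as "leaked"
snapshot = tracemalloc.take_snapshot()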

Remove the filter and uncomment the with open('log', 'a') block to get the full memory diffs and log the net change to a file. Plot the result and you'll get something like this:

[Image: plot of the logged net memory changes]

It jumps around, and the leak is only barely visible in the long-term trend.
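
For reference, one way to plot the values written by the uncommented with open('log', 'a') block (matplotlib is an assumption here, not part of the issue):

# plot_log.py -- plot the per-snapshot net memory change recorded in 'log'
# (one byte count per line) plus its running total; a slow leak shows up as
# a gradual upward drift in the running total.
import matplotlib.pyplot as plt  # third-party (assumed to be installed)

with open('log') as log_file:
    diffs = [int(line) for line in log_file if line.strip()]

running = []
total = 0
for diff in diffs:
    total += diff
    running.append(total)

fig, (ax1, ax2) = plt.subplots(2, sharex=True)
ax1.plot(diffs)
ax1.set_ylabel('net change (bytes)')
ax2.plot(running)
ax2.set_ylabel('running total (bytes)')
ax2.set_xlabel('snapshot')
plt.show()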

Leaks

The leak may not be in the Cylc codebase at all.

So far, I have found one minor leak originating from Cylc; however, this doesn't seem to make a big difference: cylc/cylc-flow#6985
