Skip to content

Conversation

fosterseth
Copy link

@fosterseth fosterseth commented Nov 5, 2022

pairs with this PR django/channels_redis#339

to address a memory leak issue in pubsub backend django/channels_redis#276

Here is the race condition in pubsub.py that leads to the memory leak:

  • if we hit an asyncio.CancelledError while awaiting in the receive() method, we get a chance to cleanup internal data related to the channel and all is well
  • if we hit an asyncio.CancelledError outside of the receive() method, we cancel the channels task, but we don't have a way to cleanup internal data related to that channel. That data will live forever until the process is restarted.

Solution is to set a cancel_callback in the dispatch code that will run when hitting an asyncio.CancelledError. By default this callback will be None, but if the channels_layer has a clean_channel method, it will call that instead (with the channel name as the argument, so we know which channel to cleanup)

TODO: maybe tests? I might need some help here

@acu192
Copy link

acu192 commented Nov 11, 2022

Good work on this. These race conditions are certainly hard to think about, but I agree with your analysis.

This solution looks good to me! It is similar what I've wanted to do for a while, but never found the time. Thank you!

I don't have permissions to run workflows in this repo, but I'd certainly recommend it to @carltongibson.

self.channel_receive = functools.partial(
self.channel_layer.receive, self.channel_name
)
if getattr(self.channel_layer, "clean_channel", None) and callable(self.channel_layer.clean_channel):
Copy link

@qeternity qeternity Nov 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future refactoring might be easier if you try/except an AttributeError here (if self.channel_layer.clean_channel doesn't exist, it will raise). This removes the hardcoded string which can make auto refactoring tools, etc fall over.

Copy link
Author

@fosterseth fosterseth Nov 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, you mean something like

cancel_callback = None
try:
  if callable(self.channel_layer.clean_channel):
    cancel_callback = functools.partial(self.channel_layer.clean_channel, self.channel_name)
except AttributeError:
  pass

?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly what I had in mind

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, changed it

@qeternity
Copy link

Looks good to me. Nice job! Left one small comment re: usage of getattr.

@carltongibson
Copy link
Member

Hi @fosterseth Thanks for this. (And thanks @acu192 and @qeternity for checking it out.)

Just to let you know it's on my list for 4.1 updates that I'm looking at over the end of year period.

TODO: maybe tests? I might need some help here

Yes. 🤔 So I guess a mocked channel layer, providing the clean channel method, and verifying that it is scheduled and run.

@fosterseth
Copy link
Author

thanks for looking over this @carltongibson I'll work on getting a test together

@bigfootjon
Copy link
Collaborator

Hey @fosterseth: would you mind rebasing this and cutting out the 3.7 support?

@bigfootjon
Copy link
Collaborator

I've gone ahead and manually rebased this PR. @cacosandon would you mind rerunning your testing on the new version (also note @carltongibson's message in the other thread, you also need to apply django/channels_redis#339)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

Memory leak when using channels redis PubSub layer
5 participants