This repository was archived by the owner on Jun 2, 2025. It is now read-only.

Conversation

@ShahRutav
Contributor

No description provided.

@facebook-github-bot added the CLA Signed label on Jan 11, 2023
@ShahRutav linked an issue on Jan 11, 2023 that may be closed by this pull request
@vmoens changed the title from "Added sac codebase." to "[Algo] Added sac codebase" on Jan 12, 2023
@ShahRutav linked an issue on Jan 13, 2023 that may be closed by this pull request
_has_functorch = False


class SACLoss(LossModule):
Contributor

How does that differ from the TorchRL SAC exactly? If there's an extra feature I'd prefer to add it to torchrl directly, wdyt?

Contributor Author

I used the local sac_loss because TorchRL's sac.py requires you to pass three networks: actor, qvalue, and value. As far as I remember, SAC does not use a separate value function.

Contributor

I implemented that by rigorously following what the paper presented, but if it works better with one net only, we can add that as an option.

https://arxiv.org/abs/1801.01290

image

image

Contributor Author

This is SAC-v1. I think SAC-v2 (https://arxiv.org/abs/1812.05905) is more commonly used. Check out section 4.2 in the paper.

Here's the pseudo code that they have used:
image
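
For readers without the images, here is a minimal sketch of how the two bootstrap targets differ, using plain Python floats instead of batched tensors. The function names, the default `alpha` and `gamma`, and the minimum over two target Q-values (as commonly implemented) are illustrative assumptions; this is not the TorchRL code or a verbatim copy of either paper's pseudo code.

```python
def sac_v1_q_target(reward, done, next_value, gamma=0.99):
    """SAC-v1 style target: bootstrap from a separate target value network V(s')."""
    return reward + gamma * (1.0 - done) * next_value


def sac_v2_q_target(reward, done, next_q1, next_q2, next_log_prob,
                    alpha=0.2, gamma=0.99):
    """SAC-v2 style target: no value network.

    The soft value of s' is estimated from the target Q-networks at an
    action a' sampled from the current policy, minus the entropy term
    alpha * log pi(a'|s').
    """
    soft_value = min(next_q1, next_q2) - alpha * next_log_prob
    return reward + gamma * (1.0 - done) * soft_value
```

In both cases `done` is 1.0 for terminal transitions, so the target reduces to the reward. The only structural difference is where the next-state value comes from, which is why v2 needs just the actor and qvalue networks.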

Contributor Author

In my opinion, it is worth adding the v2 implementation of SACLoss. Thoughts?

Contributor

I agree. My point is mainly that rather than coding up a new SAC, we should simply add the v2 to the SAC loss. As it is now, we're sort of saying "TorchRL has everything you need... but they got SAC wrong so here's a patch"

Contributor

Can you have a look at pytorch/rl#864?

_has_tv = False


class _RRLNet(Transform):
Contributor

I don't really see why we need a new env for this. We could create R3M with download=False and load the state dict from torchvision, no?

Contributor Author

I am not 100% sure whether the architecture of R3M is different from the torchvision ResNet module. Plus, I think this is a cleaner way to do it, but we can switch to loading weights if you prefer.

Contributor

@vmoens Jan 24, 2023

> I am not 100% sure if the architecture of R3M is different from ResNet torchvision module

What would be different? The only thing that pretrained=True does is load a state_dict; the architecture is 100% the same.
Have a look at my PR on torchrl.

Contributor Author

Oh cool. I have never tested the R3M backbone against the ResNet backbone, but they might be exactly the same. Thanks! I will take a look and update the code.


Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Error in creating transformed env with R3M
Error in running set_info_dict_reader

4 participants