Is this a REINFORCE implementation or actually A2C given the second network? #1

@lrfreeman

Hi there,

Thanks for sharing your repo; it's been a great help as I explore the field. I have a question I'm not sure of the answer to. In this implementation, I believe you have implemented V(s) and therefore have a parameterised value function. I have read that a vanilla implementation of the REINFORCE algorithm does not parameterise any value function, and instead has only a single parameterised policy network that maps states to actions. Am I wrong, or is this closer to an implementation of an actor-critic algorithm, given the two networks?

Cheers,

Laurence
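
For reference, here is a minimal sketch of the distinction being asked about. It is not taken from the repo: the function names and the toy numbers are hypothetical, and `values` stands in for whatever a learned V(s) network would output. It shows the per-step weight that multiplies the log-probability gradient in three variants: vanilla REINFORCE (Monte Carlo return only), REINFORCE with a learned baseline (return minus V(s)), and a one-step actor-critic (bootstrapped TD error). By one common convention (e.g. Sutton and Barto), a second network used only as a baseline is still REINFORCE with baseline; it is usually called actor-critic when the value estimate is also used to bootstrap the target.

```python
def discounted_returns(rewards, gamma):
    """Monte Carlo returns G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_weights(rewards, gamma):
    """Vanilla REINFORCE: weight each log-prob gradient by the full return G_t.
    No value function is involved."""
    return discounted_returns(rewards, gamma)


def reinforce_baseline_weights(rewards, values, gamma):
    """REINFORCE with baseline: weight by G_t - V(s_t).
    V(s_t) only reduces variance; the target is still the Monte Carlo return."""
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]


def actor_critic_weights(rewards, values, gamma):
    """One-step actor-critic: weight by the TD error
    r_t + gamma * V(s_{t+1}) - V(s_t), i.e. the target bootstraps from V.
    Assumes the episode terminates, so the value after the last step is 0."""
    next_values = values[1:] + [0.0]
    return [r + gamma * nv - v
            for r, nv, v in zip(rewards, next_values, values)]


# Toy episode (hypothetical numbers, chosen to be exact in floating point).
rewards = [1.0, 1.0, 1.0]
values = [1.0, 1.0, 1.0]   # stand-in for the outputs of a V(s) network
gamma = 0.5

print(reinforce_weights(rewards, gamma))
print(reinforce_baseline_weights(rewards, values, gamma))
print(actor_critic_weights(rewards, values, gamma))
```

So one way to check which algorithm an implementation corresponds to is to look at how the policy-loss weight is built: if V(s) is only subtracted from a full-episode return, it matches the second function; if V(s') appears inside the target, it matches the third.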
