Low Seaquest avg score compared to A3C #3

@beniz

Looking at a handful of A3C implementations and their results on Seaquest, they appear to score around 50K:

PAAC, however, plateaus at around 2K in our tests (similar to the result in your paper). Visual inspection of the policy shows that the submarine does not resurface. While this is a common difficulty of the game, A3C appears able to overcome it (though this could be due to a modification in OpenAI Gym, since their Atari setup differs somewhat from ALE).

We've tried several exploration strategies (ε-greedy, Boltzmann, Bayesian dropout), with no improvement so far.
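For reference, here is a minimal sketch of what is meant by the first two strategies. It is not the PAAC code: the function names, the `temperature` parameter, and the use of plain Python lists for the network's per-action outputs are assumptions made for illustration only.

```python
import math
import random


def epsilon_greedy(action_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    return max(range(len(action_values)), key=lambda a: action_values[a])


def boltzmann_probs(logits, temperature):
    """Softmax over logits / temperature, shifted by the max for stability."""
    scaled = [x / temperature for x in logits]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann_sample(logits, temperature, rng):
    """Sample an action index from the temperature-scaled softmax."""
    probs = boltzmann_probs(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    logits = [0.1, 2.0, -1.0]  # hypothetical per-action outputs
    print(epsilon_greedy(logits, 0.1, rng))
    print(boltzmann_sample(logits, 1.0, rng))
```

A higher `temperature` flattens the Boltzmann distribution towards uniform, while a lower one approaches greedy selection. Bayesian dropout is left out here because it depends on the network itself.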

Do you see any particular reason why PAAC would underperform in this case? An LSTM might help, but from the two OpenAI Gym pointers above, it seems it should not be critical for Seaquest.
