
FeedForwardNetwork is catastrophically broken #255

@ntraft

Description


One-Line Summary

On almost all evolved genomes, FeedForwardNetwork does not execute all nodes; many nodes retain their initial value of 0.0. In practice, the network always returns an output of all zeros.

More Details

Sorry for the melodramatic headline, but I wanted to make it clear that this isn't just any old bug. As far as I can tell, it would affect any and all feed-forward NEAT programs in a dramatic way. I found the issue on the OpenAI Lunar Lander example. It's actually pretty interesting that the XOR and Cart-Pole examples still work well. The issue doesn't present itself until the network has evolved for a while, so it isn't immediately apparent.

I'm using the latest version of the repo (not the pip release). The code I'm looking at has been present since at least 2017, so I'm not exactly sure whether it's expected behavior.

The issue I'm seeing is basically that FeedForwardNetwork.node_evals does not actually contain all necessary nodes. In the net below, only the following nodes are being run: 2176, 1153, 1602, 1311. Notice that these are the only nodes which depend on an input, but do not depend on any other nodes (blue arrows).
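To make the mechanism concrete, here is a simplified, self-contained paraphrase of the layer-building loop in neat.graphs.feed_forward_layers (the function name, argument names, and structure below are my own sketch, not the library's exact code). A candidate node must receive a connection from an already-visited node, so a required node with no incoming connections is never considered, and everything downstream of it is dropped too:

```python
# Simplified paraphrase of the layer-building loop in
# neat.graphs.feed_forward_layers (a sketch, not the library's exact code).

def layers_buggy(inputs, required, connections):
    """Group nodes into layers; candidates must be reachable from prior layers."""
    layers = []
    visited = set(inputs)
    while True:
        # Candidate nodes: receive a connection from an already-visited node.
        candidates = {b for (a, b) in connections if a in visited and b not in visited}
        next_layer = {n for n in candidates
                      if n in required
                      and all(a in visited for (a, b) in connections if b == n)}
        if not next_layer:
            break
        layers.append(next_layer)
        visited |= next_layer
    return layers

# Toy net: input 0 -> node 1; node 2 has NO incoming edges ("dangling");
# output 3 depends on both 1 and 2.
connections = [(0, 1), (1, 3), (2, 3)]
print(layers_buggy(inputs=[0], required={1, 2, 3}, connections=connections))
# -> [{1}]
# Node 2 is never a candidate, so node 3 never has all of its inputs
# visited, and the output node is dropped entirely.
```

Note that the damage is not limited to the dangling node itself: because node 3 waits for node 2 to be visited, the output node is also excluded from node_evals, which is exactly the all-zeros behavior described above.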

(Figure: winner-1-net-pruned — diagram of the pruned winning network)

There are many non-input nodes which have no predecessors (red arrows). These nodes are not being included in the initial set of eligible nodes in neat.graphs.feed_forward_layers(). The fix would be to add these nodes as the first "layer" of the computation. I intend to fix this in my own fork and submit a pull request.
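The fix described above can be sketched as follows (my own illustration, not an actual patch to neat-python): before the main loop, seed the computation with a first "layer" containing every required non-input node that has no incoming connections, so that those nodes and their dependents get evaluated.

```python
# Sketch of the proposed fix: prepend a layer of required nodes with no
# predecessors. Function and argument names are illustrative.

def layers_fixed(inputs, required, connections):
    layers = []
    visited = set(inputs)
    # New first layer: required non-input nodes with no incoming edges.
    receivers = {b for (a, b) in connections}
    dangling = {n for n in required if n not in receivers and n not in visited}
    if dangling:
        layers.append(dangling)
        visited |= dangling
    while True:
        candidates = {b for (a, b) in connections if a in visited and b not in visited}
        next_layer = {n for n in candidates
                      if n in required
                      and all(a in visited for (a, b) in connections if b == n)}
        if not next_layer:
            break
        layers.append(next_layer)
        visited |= next_layer
    return layers

# Same toy net as before: node 2 is dangling, output 3 depends on 1 and 2.
connections = [(0, 1), (1, 3), (2, 3)]
print(layers_fixed(inputs=[0], required={1, 2, 3}, connections=connections))
# -> [{2}, {1}, {3}]
```

With the seed layer in place, the dangling node is evaluated first (contributing only its bias and activation, as a constant), and the output node becomes reachable.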

However, I'm not sure whether this is the correct fix, or whether it's an indicator of a deeper problem. Are these "dangling" nodes expected? This seems related to #250. These nodes are largely unnecessary, yet they are not completely functionless (they essentially act as an additional "bias" parameter). Plus, they could become more integrated by the addition of new edges in later generations. So it seems like they belong there, but it is shocking that such a massive bug has existed for years, and that makes me very suspicious of whether my fix is correct.

To Reproduce

Steps to reproduce the behavior:

  1. cd examples/openai-lander
  2. python evolve.py
  3. See a fitness.svg plot like the one below (I have modified it for more clarity). We can't achieve a positive reward (solving the task would be a reward of +200).

(Figure: fitness.svg — fitness over generations, showing reward never becoming positive)
