I'm currently stuck at 1233 after 250 generations. The last improvement was at gen 183. The walker gets partway down the first downslope, then falls. The current champions are far better than any of their mutations, so a local maximum has been reached. This is the fate of most evolutionary AI.
Once on the downslope, the controller for flat ground won't work, so now it has to jump to something substantially more complex to progress. Making such jumps was an unsolved problem with genetic algorithms 20 years ago, and I commented back then that whoever solved it would win a Nobel Prize. Still waiting.
I suspect that a solution lies in work on neural nets where one net is trained, then a second, smaller, net is trained to match the first. This is a form of "abstraction". Somewhere in that space is something very important to AI.
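For the curious: that "train a smaller net to match a bigger one" idea is roughly what the deep learning crowd calls distillation. A minimal sketch, assuming PyTorch (the layer sizes and the random inputs are placeholders, not anything from my setup):

    import torch
    import torch.nn as nn

    # Hypothetical teacher/student pair: the big teacher is assumed to be
    # already trained on the real task; the smaller student is trained only
    # to reproduce the teacher's outputs.
    teacher = nn.Sequential(nn.Linear(10, 256), nn.ReLU(), nn.Linear(256, 4))
    student = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 4))

    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for step in range(1000):
        x = torch.randn(64, 10)             # stand-in for real inputs
        with torch.no_grad():
            target = teacher(x)             # teacher's predictions act as labels
        opt.zero_grad()
        loss = loss_fn(student(x), target)  # pull the student toward the teacher
        loss.backward()
        opt.step()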
Just to see if any further improvement was possible, I left this running for over 18,000 generations. 98% of the improvement had happened by gen 183. The last minor improvement was at gen 7313. There's no sign of a controller evolving that can handle non-flat terrain.
I used to work on legged locomotion. Once you get off the flat, traction control ("anti-slip") starts to dominate the problem. It looks like this simulation isn't going to evolve anti-slip control.
What you describe is called a local optimum, and it's well studied. There is no magic way around it; you just have to put resources into exploring a larger area. Methods include selecting genuinely diverse solutions rather than just the best one, restarting the simulation with a new random population, or sometimes throwing away the best solution and letting the search wander randomly for longer before converging.
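To make the restart idea concrete, here's a toy sketch (the fitness function, genome encoding and stall threshold are all invented, just to show the control flow):

    import random

    # Toy sketch only: restart the population from scratch whenever the
    # current run's champion stops improving, while archiving the overall best.
    GENOME_LEN, POP_SIZE, STALL_LIMIT = 16, 50, 100

    def fitness(genome):                      # stand-in for the simulation score
        return sum(genome)

    def random_genome():
        return [random.uniform(-1, 1) for _ in range(GENOME_LEN)]

    def mutate(genome):
        return [g + random.gauss(0, 0.1) for g in genome]

    def evolve(max_gens=10_000):
        best, best_fit = None, float("-inf")  # archive of the overall champion
        pop = [random_genome() for _ in range(POP_SIZE)]
        run_best, stall = float("-inf"), 0
        for gen in range(max_gens):
            pop.sort(key=fitness, reverse=True)
            champ_fit = fitness(pop[0])
            if champ_fit > best_fit:
                best, best_fit = pop[0], champ_fit
            if champ_fit > run_best:
                run_best, stall = champ_fit, 0
            else:
                stall += 1
            if stall >= STALL_LIMIT:          # this run is stuck: start over
                pop = [random_genome() for _ in range(POP_SIZE)]
                run_best, stall = float("-inf"), 0
                continue
            survivors = pop[: POP_SIZE // 5]  # keep the fittest fifth
            pop = survivors + [mutate(random.choice(survivors))
                               for _ in range(POP_SIZE - len(survivors))]
        return best, best_fit

The same skeleton covers the diversity-based methods: instead of restarting from scratch, pick survivors for novelty as well as raw fitness.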
>I suspect that a solution lies in work on neural nets where one net is trained, then a second, smaller, net is trained to match the first. This is a form of "abstraction". Somewhere in that space is something very important to AI.
Neural nets have a lot of desirable properties for this kind of stuff, mainly that they can be trained with reinforcement learning. Instead of randomly testing which solutions work better, you have the net try to predict how good each action is.
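The simplest concrete version of "predict how good each action is" is tabular Q-learning. A toy sketch, with arbitrary constants and no particular simulator attached:

    import random
    from collections import defaultdict

    # Toy sketch of value prediction: tabular Q-learning. The action set and
    # the learning constants are arbitrary; the loop that actually steps a
    # simulator and feeds rewards back in is left out.
    ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1
    ACTIONS = list(range(4))

    Q = defaultdict(float)   # Q[(state, action)] = predicted long-term reward

    def choose_action(state):
        if random.random() < EPSILON:                     # occasional exploration
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])  # otherwise trust the prediction

    def update(state, action, reward, next_state):
        # Nudge the prediction toward reward + discounted best future prediction.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])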
In my experience, restarts are as important as mutation and recombination. The most efficient algorithm I've seen so far is a mixture of ES, hillclimbing and restarts. The whole field seems pretty stuck IMO, teaching the same inefficient approaches over and over again.
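As a rough sketch of that combination (nothing authoritative, the objective is a placeholder): a (1+1)-ES hillclimber with step-size adaptation that restarts itself whenever it converges.

    import random

    # Rough sketch: (1+1)-ES style hillclimbing with 1/5-success-rule
    # step-size adaptation, restarted whenever the step size collapses.
    # The objective f() and DIM are placeholders.
    DIM = 16

    def f(x):                                 # stand-in objective (maximize)
        return -sum(v * v for v in x)

    def one_plus_one_es(budget=100_000):
        best_x, best_f = None, float("-inf")
        x = [random.uniform(-1, 1) for _ in range(DIM)]
        sigma = 0.5
        for _ in range(budget):
            if sigma < 1e-8:                  # converged: restart somewhere new
                x = [random.uniform(-1, 1) for _ in range(DIM)]
                sigma = 0.5
            child = [v + random.gauss(0, sigma) for v in x]
            if f(child) >= f(x):              # hillclimbing: keep only non-worse moves
                x = child
                sigma *= 1.5                  # success: widen the search
            else:
                sigma *= 0.9                  # failure: narrow it
            if f(x) > best_f:
                best_x, best_f = list(x), f(x)
        return best_x, best_f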