Novelty Search, Minimal Constraints and Hyperspace Islands

Mike Depies
5 min readMay 25, 2021

--

So we’ve done away with objective, and embraced a novelty search paradigm. How do we get solutions that are actually catered toward the problem we care about? And what consequences do we incur when trying to foist our objective back into the picture? I’m going to discuss minimal constraints, the impact it’s had on MeleeNEAT and a potential mitigation strategy.

Minimal Constraints allow us to apply a pass/fail predicate (test) onto each behavior measurement. In the event this condition was true, the behavior is added to the novelty archive and scored appropriately. If the constraint is not met, the agent is assigned a score of 0 and the behavior is not added. This is a simple, but effective extension of Novelty Search that allows us to constrain the area of search. However, this also introduces a new idea: feasible and infeasible agents.

Diagram from the constrained novelty search paper highlighting their two population concept. (Image from Constrained Novelty Search: A Study on Game Content Generation)

This new dichotomy was investigated by a paper where they managed two separate populations. A feasible population and an infeasible population. When infeasible agents produced behaviors that were feasible they were then transferred over to the feasible population. The authors contend that valuable information was being lost in the discard process that takes place with a minimal criteria. Further, in this paper they observe that the constraints applied create a sort of set of pockets of feasibility. This prevents real exploration, as the population has no tools to get off an island.

F and the gray area representing feasible space. I and the white area representing infeasible space (Image from Constrained Novelty Search: A Study on Game Content Generation).

By maintaining a separate population, you’re able to have a vehicle to move from island to island. This observation and solution leads me to a new question and a new design approach to address the above problem. What other ‘meta-deception’ is introduced by the constraints, and conflation of behavior space?

What is conflation of behavior space? In MeleeNEAT, we capture each action the agent performs into a sequence. If this sequence is repeated, despite the qualitative differences that are not being measured, it will be considered as the same behavior. If we only kept this single vector and not the complimentary vectors for damage and recoveries, we’d see lots of potential conflation and loss of meaning/importance. Take this example:

We have a new population of Pikachus who randomly incorporate their up-B move into their behavior set as well as jumping. As the agents matured, and they reliably started to get knocked off stage — the novelty search might very well have pushed the agent away from behaviors with up-B because of early saturation. This might very well prevent the agent from discovering necessary and novel behavior.

While this is somewhat of a contrived example, I believe this is an important dynamic at play when deciding a behavior metric. In the original novelty search paper, they addressed a cost of computation as the novelty archive grows. They suggest the size can be limited with minimal harm to the search process.

from the original novelty paper

I am going to go further and suggest that not only is it fine to limit the size of the archive, but that flushing the archive can have a boost in exploration. The network that may have created a behavior at generation 0, is a vastly different network at generation 100. And in fact, there are going to be refinements that are interesting if we allowed the search to return to this behavior space.

I think both this conflation of behavior space, and the feasibility dynamic introduced by minimal criteria are two new problems to wrestle with. I’ve found with the MeleeNEAT project that doing a sequence of: no constraints behavior space for X generations. Then apply the min constraint of doing damage for another X generations. Then wipe the novelty archive, and remove the min constraint and begin the process again. This has in practice with MeleeNeat, created a seeming boost in performance at the actual capacity to play the game.

I wanted to add one last bit of intuition or thought on why this might work and why we might not observe backtracking in the traditional sense. In some sense, in order to produce more and more novel behavior, there must be an accompanying development of mastery with the object at hand as well. Without the ability to properly wield an object, the actions that can be performed with said object shrink dramatically. Novelty Search is often described as an information accumulator by it’s authors, and I think that pairs quite well with the idea of mastery. More directly with Smash Brothers, competitive players typically don’t improve their play by simply trying to win. They often explore, and practice routines that cause their play to potentially suffer or be less optimal. It’s possible that Novelty Search will serve as an analog to this behavior.

I’ll finish by offering up the following example to be used as a metaphor for what might be occurring when we clear the novelty archive. In the old adventure style games, such as Metroid, sometimes the player has to return to previous places in order to find things they either missed (due to lack of knowledge) or due to new opportunities unlocked (upgrades that open new pathways). If we consider novelty search as an archive of locations visited, and the process of novelty search being an information accumulator, then it stands that returning to places you’ve already been can be useful after enough development has occurred. So then the question might be, is there an algorithmic or optimal approach to managing the novelty archive?

Thanks for reading, follow for more developments @meleeNeat on Twitter. As always, I welcome questions, criticism and feedback.

--

--