Tuesday, January 17, 2006

Non-Markov Tic-Tac-Toe Experiment



EXPERIMENTAL SETUP

1) The Neural Network (NN) always plays X; O is played by a hand-crafted computer opponent.
2) The computer opponent is of intermediate skill, it is not perfect, but can only be beaten if it is "forked" by X.
3) The NN is recurrent, and has multiple time steps to decide on a given move.
4) The NN controls a "focus of attention" (FOA) that can be moved around the TTT grid by using the following actions: turn right, turn left, move forward.
5) The FOA allows the NN to "see" only a single cell at a time. The input to the NN is 0 for an empty space, 1 for X, and -1 for O.
6) At the beginning of each turn the FOA of the NN is placed in the center of the grid facing a randomly selected direction (north, east, south, or west).
7) The NN is given no hint as to its current location or the direction it is facing. It must build up an internal representation based on what it has already seen through the FOA.
8) If the NN inadvertantly attempts to move the FOA out of the boundaries of the grid, it automatically forfits the game.
9) The NN selects its move by moving the FOA to a given cell and then performing a "halt" action. This "halt" action must be taken within the specified time-limit (50) or else it automatically forfits the game.
10) The move selected by the NN after halting must be legal (i.e. must be on an empty cell). If the cell is already occupied then it automatically forfits the game.
11) Feedback is given after playing a set of games (1000). Fitness is computed as the sum of the number of games won in the trial set. No credit is given for draws, losses, or premature forfits.


As you can see, these are rather Draconian measures. To summarize: in order to be successful, the NN is expected to learn to move around its "focus of attention" that only allows it to see a single location at a time while being careful not to accidentally move outside of boundaries it has not been informed of, create an internal representation of what it has already seen, use that information to navigate successfully to an empty spot, announce that it is halting, and do all of this repeatedly for each and every one of its turns in such a way that it manages to beat a pre-programmed opponent that can never be beaten without being forked! (Remember, it receives absolutely no credit for a draw).

RESULTS

So far, after a few hours of CPU time, it has managed to discover a solution that can win 64% of the games it plays, even under these drastically difficult conditions!!

Tuesday, January 03, 2006

Traversing Neutral Networks

THE IDEA

I was doing some research over my christmas break and stumbled upon a concept called Neutral Networks (which sounds rediculously close to Neural Networks, but oh well).

Basically the idea is that during the course of evolution, many point mutations made to an organism's genome are neither positive nor negative with respect to it's reproductive fitness; they are "neutral". A good example of this is the many-to-one mapping for encoding various amino-acids from codons.

Presumably there are many neutral mutations that are often connected by a single point mutation, thus forming "Neutral Networks" embedded in the fitness landscape. The theory is that this is advantageous because it allows the evolving organism to escape local-optima by horizontally sliding (whatever exactly "horizontal" means in N-dimensions) along flat plateaus of homogeneous fitness until an even higher point is stumbled upon. Much research on this is being done at Sussex University.

IMPLICATIONS

So, if this is true, then one would assume that evolution could occur even with very low rates of mutation... there would be no need to "jump" off of local optima.

I tested this hypothesis by altering the mutation function in my software. Rather than applying Gaussian or Cauchy distributed mutation (as is standardly done in Evolutionary Programming and Fast Evolutionary Programming, respectively), I chose to use a small, fixed-size constant that would either be added to or subtracted from each linkweight of my Neural Network with equal probability. The results have been nice; this new (and considerably simpler) mutation strategy significantly outperforms the previous method I was using.

However, I did find that it was not feasible to use arbitrarily small constants because: 1) evolution would be too slow, and 2) the changes in fitness need to be larger than the "thermal noise" inherent in the evaluation function (note: I am not working with fixed training sets, therefore the feedback function is a bit noisy).

TWO TYPES OF NEUTRAL NETWORK SYMMETRY

This concept was not mentioned in the literature, but I realized that there are actually two distinct forms of Neutral Networks; one nested inside the other. Basically there are two forms of symmetry that can lead to Neutral Networks.

The first form is symmetry of summed fitness; two different behaviors may be equally fit although for differing reasons. The second form is symmetry of representation; two internally different systems (e.g. Neural Networks) can lead to the same external behavior, which then necessarily implies equal fitness.

I believe that biological systems make use of both forms of Neutral-Network symmetry during the evolutionary process.