The replicator equation used in evolutionary game theory (EGT) assumes that strategies reproduce in direct proportion to their payoffs; this is akin to the use of fitness-proportionate selection in an evolutionary algorithm (EA). In this paper, we investigate how various other selection methods comonly used in EAs can affect the discrete-time dynamics of EGT. In particular, we show that the existence of evolutionary stable strategies (ESS) is sensitive to the selection method used. Rather than maintain the dynamics and equilibria of EGT, the selection methods we test impose a fixed-point dynamic virtually unrelated to the payoffs of the game matrix, give limit cycles, or induce chaos. These results are significant to the field of evolutionary computation because EGT can be understood as a coevolutionary algorithm operating under ideal conditions: an infinite population, noiseless payoffs, and complete knowledge of the phenotype space. Thus, certain selection methods, which may operate effectively in simple evolution, are pathological in an ideal-world coevolutionary algorithm, and therefore dubious under real-world conditions.
Genetic programming is a computer learning method that imitates nature's selection process to lead a population of computer programs towards improving levels of performance.
Tron is a dynamic game, difficult for computers to learn. Playing against itself a computer might believe that is doing a good job when it is not, because it lacks a parameter (ie, a really good player) to compare.
In this experiment, we have put a genetic learning algorithm online. A "background" GA generates players by having the computer play itself. A "foreground" GA leads the evolutionary process, evaluating players by their performance against real people.
Following Tesauro's work on TD-Gammon, we used simple hill-climbing in a 4000 parameter feed-forward network to develop a competitive backgammon evolution function. No back-propagation, reinforcement or temporal difference learning methods were employed. Instead we start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights when the challenger wins. The success of so simple a learning method indicates backgammon deserves further study as to why hard learning is so successful in this domain.
In the Golem project (Genetically Organized Lifelike Electro Mechanics) we conducted a set of experiments in which simple electro-mechanical systems evolved from scratch to yield physical locomoting machines. Like biological lifeforms whose structure and function exploit the behaviors afforded by their own chemical and mechanical medium, our evolved creatures take advantage of the nature of their own medium - thermoplastic, motors, and artificial neurons. We thus achieve autonomy of design and construction using evolution in a limited universe physical simulation, coupled to off-the-shelf rapid manufacturing technology. This is the first time robots have been robotically designed and robotically fabricated. |
|
|
One of the central difficulties of coevolutionary methods arises from "intransitive superiority" - in a two-player game, for example, the fact that A beats B, and B beats C, does not exclude the possibility that C beats A. Such cyclic superiority in a coevolutionary substrate is hypothesized to cause cycles in the dynamics of the population such that it "chases its own tail" - traveling through some part of strategy space more than once despite apparent improvement with each step. It is often difficult to know whether an application domain contains such difficulties and to verify this hypothesis in the failure of a given coevolutionary set-up. In this work we elucidate some of the issues and concepts in an abstract domain where the dynamics of coevolution can be studied simply and directly.
We define three simple "number games" that illustrate intransitive superiority and resultant oscillatory dynamics, as well as some other relevant concepts. These include the distinction between a player"s perceived performance and performance with respect to an external metric, and the significance of strategies with a multi-dimensional nature. These features alone can also cause oscillatory behavior and coevolutionary failure.