Prisoners’ Dilemma Part 2– The Simulation

To gain insight into the dynamics of Prisoners’ Dilemma , I wrote a computer program that played the banker.  This main “banker” program accepted input from two independent sub-programs which would act as the two players.  Each player program would signal to the banker a play of either “Cooperate” or “Defect”.  The banker would then return to each player program what the other player program had played as well as the outcome.  The banker would then either signal to the player programs that it was ready again to accept their “Cooperate” or “Defect” symbols, or that the game just played represented the final game in the match.  On that news, the player programs would self-terminate and the banker program would select two new opponents and launch the programs that represented each of them.

In this way I was able to simulate with computer code, all three roles of the game (the two players and the banker) as well as provide the freedom for each player program to adopt any type of strategy I could imagine.  A configuration file read by the banker program allowed me to define the number of games that would be played per match.  In this way I could very quickly explore not only a very large number of strategies, but also how each of these strategies performed against every other strategy.


Choosing Opponents

The last piece I needed to provide a complete computer simulation was a method (and associated computer code) for selecting whom competed against whom.  I considered several different approaches:

A Matrix Method in which every possible strategy in the system played against every other possible strategy exactly X times.

A List Method in which the program would take a list of all the possible strategies in the system , and starting at the top, use the list to decide one of the contestants.  The other would be selected at random.

A Pure Random Method in which both contestants are selected at random.

All of these potential methods suffered from the same flaw:  The relative frequencies of strategies would certainly NOT be equal in nature and on top of that, not be equal about matching against one another, while all the strategies I’d described above would have an equal chance of matching A against B against C.  This could be relatively easily corrected by simply using a weighted method of random selection.  If A is 2x more prevalent than B in a population, then whenever using a random method to select an opponent, I make sure that my method is twice as likely to choose A as B.

But there was no way to know what the initial frequencies should be.  There was no population to go measure and then adjust the frequencies accordingly.  In fact, the strategies I was using, while some were quite complex, none were nearly adequately complex enough to consider the thousands to even millions of factors a biological individual may consider when deciding whether or not to jump up on to your lap.

Worse, these frequencies would certainly change over time.  In fact, in real life, any strategy that was successful would naturally increase in frequency for all kinds of reasons:  Maybe winning strategies are sought out adopted by increasing numbers of players.  Or maybe they’d be passed down generationally from father to son (more successful fathers would have more children).  Or maybe strategies would be passed from one generation to the next at the molecular level via genes.  Whatever the reason, it seems pretty basic to assume that a successful strategy would be employed by larger portion of the players as time goes on.

As an aside, note that this means that a successful strategy, in order to truly be robust, must be unusually good at competing against itself or against other rival successful strategies.  Another way to look at it is that a truly robust strategy will mean that there is no other strategy that a player might switch to which would increase her success.  But how to best model this?


Survival of the Fittest

Biology suggests an elegant answer.  Natural selection, sometimes also referred to as “survival of the fittest” refers to the way that nature shapes species over time.  Any trait (or “strategy”) that improves an individual organism’s success, and thus relative number of healthy offspring, will become more prevalent in the population over time.  Any detrimental “strategy” or trait, by decreasing the success of the organism, will leave it with relatively fewer healthy offspring, and thus the trait becomes less and less prevalent in the population over time.

If I treated each strategy as a living organism and rewarded success and failure, not with money, but with relative reproductive success, I would fundamentally bring natural selection to bear in my model.  Under the pressure of this unnatural natural selection, the weaker strategies would be eliminated while the stronger strategies continue to fight generation after generation for domination.  Essentially, my “soup” of strategies would “evolve” towards the most successful and stable outcome.


Iterations and Generations

I start round one (or the first generation) of the simulation by “combining” equal portions all the strategies being considered.  The computer then matches each strategy against an opponent strategy chosen at random to begin the game.  In this first generation, all strategies have an equal chance of competing against any other strategy, including itself.  The simulated opponents play a series of games after which the computer ends the round and totals the score for each player/strategy.  The number of games played per round should is not important save that it be a relatively large number (to help eliminate spurious results) and that the number of games per round be unknown to the strategies.  I’m not going to discuss the reasons for this here, but would encourage anyone who can see the problems a known number of games creates, to write it up in the comments below.  The reason turns out to be very similar to the reason a single game is not interesting.

At the end of each round (or generation) the computer converts each strategy’s winnings to a relative amount of reproductive success.  In each round past round one, more successful strategies are present in a greater proportion than less successful strategies.  As a result, strategies find themselves with these successful strategies as a rival more often than they would in round one.  By the same reasoning, each strategy will see fewer and fewer relative matches against strategies that have not been so successful.  As the rounds progress, the population is made up of greater percentages of the most successful strategies, with the frequency of less successful strategies thinning, possibly even to the point of extinction.


0 Responses to “Prisoners’ Dilemma Part 2– The Simulation”

  1. Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: