Prisoner's Dilemma - Content from the Guide to Life, the Universe and Everything

Prisoner's Dilemma


Irrationality is the square root of all evil.
- Douglas Hofstadter

Prisoner's Dilemma will be familiar to many people from the extended treatment Richard Dawkins gives it in his classic popular science work The Selfish Gene. It is a fairly simple problem which has, since it first came to prominence in the 1950s, exercised and exasperated the minds of people drawn from such diverse fields as political science, economics, social psychology and philosophy.

What is the Dilemma?

The basic problem at the heart of the dilemma is the question, 'How can co-operation emerge among rational, self-interested individuals without there being any form of central authority imposed on them?'. In other words, it can be seen as an attempt to find a secular, rational alternative to old-fashioned 'top-down' moral codes such as those of religious doctrines.

The term is used to refer to any situation in which there appears to be a conflict between the rational individual's self-interest and the common good. The basic premise underpinning the Prisoner's Dilemma is the Darwinian insight that human beings are essentially selfish creatures, genetically programmed to place their own survival above all other considerations. However, an individual who works against the 'common good' can in fact be undermining the very foundations on which his/her own self-interest thrives. An example is our continued short-sighted waste of the planet's resources as individuals, without taking the wider view that, since everyone else is doing the same, there may soon be little left of the planet for any of us to live on.

Brief Outline of the Prisoner's Dilemma

The Prisoner's Dilemma has conventionally been illustrated by means of an example involving two prisoners trying to decide whether or not to inform on one another. These two prisoners find themselves in jail in separate cells awaiting trial, having been caught and charged with their participation in the same crime. The police go to each of the prisoners in turn and offer them the same deal - if you inform on your friend, we will see to it that you get a shorter sentence. Both men know that precisely the same deal will have been offered to their partner-in-crime; however, neither knows for certain, or has any way of finding out, which decision the other will make. This is the crux of the problem - the outcome of either prisoner's decision depends in part on the decision made by the other, which neither has any way of knowing in advance.

There are four possible outcomes. If they both stick to their story and refuse to talk (ie they 'co-operate' with one another), the law will have a hard time pinning anything on either of them and therefore they will both benefit. They will both end up with, at worst, a minimal sentence. However, each prisoner knows that if he 'co-operates' while the other 'defects' (ie, turns informer), he will end up losing heavily, because he will be doing the sentence for both of them - this outcome is known as the 'sucker's pay-off'. Likewise, the other prisoner is aware that the same thing could happen to his disadvantage, if he keeps quiet while the other prisoner turns informer. The most likely outcome, then, if both men are rational, will be for both of them to inform and therefore both will 'lose', but each loses less than they would have done if they had got the sucker's pay-off by keeping quiet while the other informed.

Another Way of Looking at It

For those who find the prisoner example a little obscure, perhaps it is better explained in the form of a simple game. Two players face one another, each holding two cards: one says 'co-operate', the other 'defect'. Both players must lay one of their two cards at the same moment, so neither has any way of knowing which choice the other will make. The point of the game is not to eliminate the opponent, but simply to accrue as many points as possible for oneself. This is not a 'zero-sum' game such as chess - success does not depend on the failure of one's opponent, but rather on one's ability to adapt appropriately to their behaviour. The game is 'iterated' - that is, there will be a series of rounds rather than a single one-off event. The four possible outcomes for each move are essentially the same as for the situation involving the two prisoners outlined above. These outcomes are shown in the following table. For Player 1 read across the table, for Player 2 read down1:

                          Player 2: Co-operate         Player 2: Defect

Player 1: Co-operate      R=3, R=3                     S=0, T=5
                          Reward for mutual            Sucker's pay-off and
                          co-operation                 temptation to defect

Player 1: Defect          T=5, S=0                     P=1, P=1
                          Temptation to defect         Punishment for mutual
                          and sucker's pay-off         defection

The optimum outcome for both players is mutual co-operation, for which each player gets a reward (R) of three points. However, both players have the temptation (T) of knowing that, if they defect while the other co-operates, they will score five points while the other player gets the sucker's pay-off (S) - no points at all. Therefore, if the game is being played between two rational players, the logical outcome, bearing in mind that neither player knows what the other is going to do, is that both will defect, and therefore they will both end up with P - only one point apiece. This, clearly, is by no means the most favourable outcome for either player. In fact, the most favourable outcome is for the two players to consistently co-operate with one another. However, two rational players will accept the results of mutual defection because of the possibility of an even worse outcome if they do not. Therefore both players, as a logical consequence of the rational pursuit of their own self-interest, end up with less than they would have got if they had been able to trust one another enough to co-operate.
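The logic of the table can be checked mechanically. Below is a minimal Python sketch (illustrative code, not part of the original entry) encoding the payoff values above and confirming both halves of the dilemma: defection is always the better individual choice, yet mutual co-operation beats mutual defection.

```python
# Payoff to 'me' for each (my move, their move) pair, using the values
# from the table: T=5, R=3, P=1, S=0.
PAYOFF = {
    ("C", "C"): 3,  # R: reward for mutual co-operation
    ("C", "D"): 0,  # S: sucker's pay-off
    ("D", "C"): 5,  # T: temptation to defect
    ("D", "D"): 1,  # P: punishment for mutual defection
}

# Whatever the other player does, defecting scores more for me...
for their_move in ("C", "D"):
    assert PAYOFF[("D", their_move)] > PAYOFF[("C", their_move)]

# ...and yet both players do better co-operating than both defecting.
assert PAYOFF[("C", "C")] > PAYOFF[("D", "D")]
```

The two assertions capture exactly why the situation is a dilemma: defection 'dominates' round by round, while mutual defection is collectively worse than mutual co-operation.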

So Are We All Doomed?

Maybe, maybe not. At any rate, it will be a disagreeable conclusion for anyone still prey to the notion that the human being is somehow distinct from other animals by virtue of having a 'soul', or some other kind of 'spirit' or innate 'moral sense'. Even for the open-minded sceptical type, however, it is a problematic conclusion. After all, in the real world, most of us will probably have some idea of what it feels like to get the sucker's pay-off from time to time. Not nice; you do everyone else's work for them, and they end up with most of the rewards. Then you get the blame when everything goes wrong, which usually happens about two minutes after you stormed off in protest at your shabby treatment. So it would seem to be in most of our interests to see if there is some rational way of dealing with the problem.

Co-operation in Nature

The first thing to bear in mind is that nature is not simply governed by a brutal 'survival of the fittest' ethic. In fact, there is a surprisingly large amount of co-operation in nature, both between members of a species and between members of different species. In other words, survival in nature is not all about 'dog eat dog'. Many species have evolved sophisticated co-operative techniques that enable all participating parties to benefit. An example of this is mentioned in Matt Ridley's book The Origins of Virtue. Shoals of fish often stop off at specific points where they know there will be smaller fish waiting to 'clean' them of parasites. The benefit is mutual: the larger fish get cleaned and the smaller fish get something to eat. However, from the point of view of a larger fish, the benefit would appear to be greater still if it simply ate the smaller fish after the latter had done its job. After all, it no longer needs the cleaner once the work is done, so why not give it the sucker's pay-off and take all the reward for oneself - in other words, eat the poor creature?

The answer has to be simply that the larger fish 'knows' that, if it were to 'defect' in this way, it would suffer for it later on because other smaller fish, once they got wind of what had happened to one of their number, would no longer be so inclined to provide their useful service to that particular larger fish. They would, in other words, isolate the defector and make it more difficult for it to get back into the 'game' another time. So here we see the policy of reciprocity (or, 'tit-for-tat') being enacted in real life - these fish have a mutual understanding to act not purely for their own benefit, but also with the interests of the wider fish community in mind.

Axelrod's Tournaments

At any rate, all is not lost. In the late 1970s a political scientist named Robert Axelrod set about trying to find a rational solution to the Prisoner's Dilemma. He invited people within academia to send in computer programs containing a strategy for coping with the dilemma. He then pitted these programs against one another in a series of virtual Prisoner's Dilemma tournaments. Numerous people took part from a wide range of disciplines, including psychology, philosophy, biology and mathematics.

Axelrod found that the most effective strategies were, almost invariably, 'nice' ones. That is to say that they tended towards co-operation rather than defection. Furthermore, and rather pleasingly, the strategy that kept coming out on top was also the most straightforward, involving nothing more complex than a child's game of Tit-for-Tat. You simply do what your opponent did on the previous move. However, for your first move, you always take the risk of co-operating before having any knowledge of what your opponent is going to do. This calculated risk is worth taking because if your opponent also co-operates you will be in a (potentially long-lasting) win-win situation, and everyone involved will go home as happy as can be. If he defects - well, fair enough, you get the sucker's pay-off once, but next time you will be wise to your opponent and will know to defect from then on if necessary.
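Tit-for-Tat is simple enough to express in a few lines of code. The following Python sketch (the names and payoff values follow the table earlier; the code itself is illustrative, not Axelrod's original program) plays an iterated game between two strategies:

```python
def tit_for_tat(opponent_history):
    """Co-operate first; thereafter copy whatever the opponent did last."""
    return "C" if not opponent_history else opponent_history[-1]

def play(strategy_a, strategy_b, rounds):
    """Play an iterated Prisoner's Dilemma; return both players' totals."""
    payoff = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
              ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
    moves_a, moves_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a = strategy_a(moves_b)   # each player sees only the other's history
        b = strategy_b(moves_a)
        pa, pb = payoff[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

# Two Tit-for-Tat players settle into permanent mutual co-operation.
print(play(tit_for_tat, tit_for_tat, 10))  # (30, 30)
```

Note that the whole strategy is a single line: no memory beyond the opponent's last move, no modelling of the opponent's intentions.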

Many of the worst performers were those strategies which attempted to exploit the weak points of others; for example, a strategy which always defected when confronted with a program that always co-operated. Such 'nasty' (ie exploitative) strategies, while they often started out looking successful, would soon begin to reveal themselves as self-undermining because the easy prey that they were feeding off was destined to be knocked out early on in the tournament. Thus, it became increasingly difficult for those 'nasty' strategies to find suckers to exploit, and consequently they themselves also tended to disappear as the tournament progressed to its later stages.
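The ecological point - that exploiters starve once the suckers are gone, while 'nice' strategies prosper in each other's company - can be illustrated with a toy round-robin tournament. This is a drastically simplified sketch: Axelrod's real tournaments involved dozens of far more elaborate programs, and the field of entrants below is invented purely for illustration.

```python
import itertools

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):
    return "C" if not opponent_history else opponent_history[-1]

def always_cooperate(opponent_history):
    return "C"

def always_defect(opponent_history):
    return "D"

def match(p1, p2, rounds=200):
    """Play an iterated game; return the two total scores."""
    h1, h2, s1, s2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = p1(h2), p2(h1)   # each sees only the other's past moves
        a, b = PAYOFF[(m1, m2)]
        s1, s2 = s1 + a, s2 + b
        h1.append(m1)
        h2.append(m2)
    return s1, s2

# A small field in which 'nice' entrants predominate.
entrants = [tit_for_tat] * 3 + [always_cooperate, always_defect]
totals = [0] * len(entrants)
for i, j in itertools.combinations(range(len(entrants)), 2):
    s1, s2 = match(entrants[i], entrants[j])
    totals[i] += s1
    totals[j] += s2

for strategy, total in zip(entrants, totals):
    print(strategy.__name__, total)
# Each tit_for_tat entrant scores 1999; always_cooperate 1800; always_defect 1612.
```

Notice that always_defect actually beats tit_for_tat head-to-head (204 points to 199), yet still loses the tournament: as in Axelrod's results, 'nice' strategies win by doing well together, not by beating anyone in particular.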

Axelrod argued that his experiment demonstrated that co-operation can evolve organically, from within a system of interactions, without having to be imposed by some or other external authority. This would seem to be the case in real life, even in situations where there is no particular trust or spirit of friendship between people. Axelrod cites the example of British and German soldiers during the First World War, many of whom adopted a (highly unofficial) policy of 'live and let live' during quiet moments in trench warfare. This basically involved leaving the other side alone unless provoked into defensive action, and was a direct contravention of orders.

The encouraging thing about tit-for-tat is that it does not require any particular intellect, or even self-awareness, to be able to play it. The requirements are so small that even lowly life forms such as bacteria are capable of playing out a version of this strategy. All that is required is that the entity 'playing' the 'game' be able to interact with at least one other entity, and be capable of responding to the last action made by the other player2. So we clever humans should be able to figure it out too...

Problems with the Tit-for-Tat Approach

However, tit-for-tat is not a panacea for all evils. It is a great theory, but it cannot explain rational co-operation all on its own. One obvious question: if tit-for-tat works so effectively, and evolves so inevitably, why do we not now live in a world full of organisms that have evolved to live in a state of near-complete harmony with one another? Obviously, the world isn't really like that. As Ridley points out, while some animals do use the tit-for-tat strategy, most do not. While reciprocity seems to be prominent among human beings, and some other species, the truth is that in nature it is far from universal.

Another problem with tit-for-tat is that it requires stability over an extended time if it is to work effectively. In other words, it is only any use in an iterated Prisoner's Dilemma-type of game, in a situation where interactions between people are repeated. In a one-off situation, tit-for-tat simply cannot work because it is plain common sense to defect if one knows that one will never come across this particular situation again - simply because it is unlikely that the other player will ever have a chance to return the disfavour. Stability and longevity are features which tend to be in rather short supply in our globalised laissez-faire economy, and it is perhaps accurate to say that, for this reason, tit-for-tat is insufficient for the evolution of co-operative behaviour among people.

Even on the theoretical level, tit-for-tat is no universal solution. Most problematic of all is that, if left to its own devices, it can evolve into other strategies that are less conducive to co-operation. Or it can lead to situations where non-co-operative strategies can once again begin to flourish. For example, as Ridley notes, if two tit-for-tat players come across one another, it only needs a single accidental defection, or a misunderstanding of some sort, for the players to become locked in a perpetual cycle of mutual defection; one defects, the other defects in retaliation and this continues ad infinitum because neither player has the mechanism to break out of the cycle. Worse still, as Ridley also notes, is that in an environment in which everyone is accustomed to co-operating, things can degenerate easily into naive 'always co-operate' strategies, which then leave the territory ripe for exploitation by unscrupulous defectors. So, paradoxically, the 'nicest' strategies, if left to their own devices, can pave the way for the return to prominence of the 'nastiest' ones...
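The 'echo' effect Ridley describes is easy to reproduce. In the sketch below (illustrative code, not taken from any of the books cited), two Tit-for-Tat players co-operate happily until a single accidental defection is injected, after which they take turns punishing each other indefinitely:

```python
def tit_for_tat(opponent_history):
    return "C" if not opponent_history else opponent_history[-1]

moves_a, moves_b = [], []
for rnd in range(8):
    a = tit_for_tat(moves_b)
    b = tit_for_tat(moves_a)
    if rnd == 2:          # a single accidental defection ('noise') by player A
        a = "D"
    moves_a.append(a)
    moves_b.append(b)

print("".join(moves_a))  # CCDCDCDC
print("".join(moves_b))  # CCCDCDCD
```

From the slip onwards, one player is always retaliating for the other's last retaliation, and neither has any mechanism to break the cycle. One well-known remedy, sometimes called 'generous tit-for-tat', is to forgive a defection with some small probability, which breaks the echo at the cost of being slightly more exploitable.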


It would seem, then, that we need to look beyond the merely 'technical' level if we wish to solve the Prisoner's Dilemma. This is not to say that rationality is no use to us, but that our conventional understanding of what it means to be rational is simply too narrow and needs broadening a little. Thus, we come back to the quote at the start of this article. 'Irrationality', says Douglas Hofstadter, is the 'square root of all evil'. Fair enough, but how can we distinguish, once and for all, what is rational from what is irrational? A possible answer lies in Hofstadter's concept of super-rationality. Put simply, this means looking beyond one's own decision, taking into account the decisions of others too, and consequently making the decision that one would hope they would also make. In other words, the 'super-rationalist' thinks 'globally' - in the wider interest - rather than 'locally', simply with his/her own interest in mind.

The simple question to ask, when confronted with a Prisoner's Dilemma-type situation, is 'Which world would I prefer to live in - which is more in my interests? A world in which all rational people recognise that to co-operate is more rational than to defect, or a world in which people get stuck at the point that says defecting is more rational in the short term?' The truth is that the latter world would soon become uninhabitable (and there are arguable comparisons here with our own real world). No one would be able to have any trust in anyone or anything at all - therefore, to choose to defect, even in a one-off Prisoner's Dilemma-type situation, is to the super-rationalist an ultimately irrational choice, because one is thereby undermining the very foundations of reason on which one depends. The rational thing to do is to make the leap to that second, higher level of thought, and assume that one is dealing with other people who are also rational enough to make this leap.

Examples of Super-rationality

Imagine you are on holiday hiking through a lovely part of the world, some unspoilt green area that the masses haven't yet got their hands on. You stop for a picnic, in the process of which, naturally, you generate a certain amount of litter. 'Why bother to clear it up?' is the thought that might flash through the average mind. 'I'll never be coming back here, and it'll probably all be ruined by next summer anyway. Besides, it's only a few bits and pieces.' But, of course, you do take your litter home with you because you know that, if you leave the place in a mess, it will also discourage others from respecting the natural beauty of the area. You also know that if you had come across such a mess yourself, it would have made your holiday a little less enjoyable. This, then, is the super-rational approach - only by living the value of super-rationality can we expect our fellow humans to live it also. The more super-rational we become, the more super-rational we can expect our fellow travellers to be.

Other examples spring to mind. The decision to turn your heating down a notch, putting on an extra pullover instead, is a super-rational decision in the face of energy shortages and global warming. Hearing a rumour that there is going to be a shortage of some commodity, coffee for example, and therefore buying a little bit less than normal, rather than stocking up and actually helping to bring about the rumoured shortage, is a super-rational decision because by taking account of the collective good, your super-rational choice will eventually be reflected back to work for your own good. You hope. The fear is that those people who are addicted to their caffeine in a serious way will panic and hoard supplies, clearing the shelves so that next time around you will go short. And it is that fear which exerts such a strong grip over our minds, making us want to buy in bulk too, in order to guarantee our own supply. Just like the Prisoner's Dilemma game, the overriding thought is that you can only be worse off as a result of co-operating. As we saw previously, the rational case for defection seems to be overwhelming. However, far from being rational, defecting is thoroughly irrational. We can only promote sanity with our own sane behaviour.


Only by promoting super-rationality - by making the choice ourselves - can we make the choice with any confidence, because we thereby make it more likely that other people will co-operate. After all, as Hofstadter points out, in a game played between truly (super-)rational thinkers, choosing to defect 'undermines your reasons for doing so'. If you suspect that all of the others will behave as you behave, then logically you are saying that they are likely to co-operate with you, and therefore reason says that between truly rational (ie, super-rational) people the only rational thing to do is to co-operate3.

No solution is perfect. After all, you cannot necessarily be sure that the people you are dealing with are sufficiently rational to understand the principles involved. However, as Hofstadter says, once the principle of super-rationality has been established in a person's mind, there is no reason to suppose that a rational thinker will deviate from it, just as there is no reason to suppose that a person who has been taught basic mathematics will ever conclude that 2+2=5. It is a simple principle that, in theory, everyone can learn - and the more people who learn it, the better it is for all of us.

The attempts of Axelrod, Hofstadter et al to solve the Prisoner's Dilemma logically may seem a little simplistic to some and possibly rather too optimistic in their apparent faith that logic can indeed solve humanity's problems. At any rate, they stand as commendable attempts to try to think through the problems of living together in a complex world, without retreating into the superstition of a pre-scientific age.

Further Reading

Robert Axelrod The Evolution of Co-operation (1990, Penguin. First published 1984)

Richard Dawkins The Selfish Gene 2nd edition (1989, Oxford University Press. First published 1976)

Douglas R Hofstadter Chapters 29 and 30 from Metamagical Themas: questing for the essence of mind and pattern (1986, Penguin. First published 1985)

Matt Ridley The Origins of Virtue (1997, Penguin. First published 1996)

1 See Axelrod, page 8 for the original version of this table.
2 See Axelrod, Chapter 5; also Hofstadter, page 729.
3 See Hofstadter, pages 746-8.
