Prisoner's dilemma
The Prisoner’s Dilemma is a game-theory concept that is often used in the analysis of bargaining situations. Experiments in which the game is repeated have been used to evaluate alternative bargaining strategies.
The basic game
Two fellow criminals are held in separate cells, and each is told that:
- he will go free if he informs on his mate and his mate keeps silent;
- he will get a ten-year sentence if he stays silent and his mate informs on him;
- he will get a five-year sentence if they each inform on the other;
- he will get a six-month sentence if they both stay silent.
Each prisoner then constructs his personal payoff matrix along the following lines:
- If he doesn’t inform: if I stay silent I get six months, but if I inform I go free.
- If he does inform: if I stay silent I get ten years, but if I inform I get five years.
- So whatever he does, I am better off informing on him.
So both inform, and both get five-year sentences instead of the six-month sentences that they would have got if they had cooperated. That outcome, termed mutual defection, is the Nash equilibrium of the prisoner’s dilemma.
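The dominance reasoning can be checked mechanically. The following is a minimal sketch in Python (sentence lengths in months; the dictionary layout and names are illustrative, not part of the original formulation):

```python
# Sentences in months for each (my_move, his_move) pair;
# "inform" corresponds to defection, "silent" to cooperation.
SENTENCE = {
    ("inform", "silent"): 0,    # I go free
    ("silent", "inform"): 120,  # ten years
    ("inform", "inform"): 60,   # five years
    ("silent", "silent"): 6,    # six months
}

for his_move in ("silent", "inform"):
    # Pick my sentence-minimising move for each of his possible moves.
    best = min(("inform", "silent"), key=lambda my_move: SENTENCE[(my_move, his_move)])
    print(f"If his move is {his_move}, my best move is {best}")
# Both prisoners reason identically, so both inform ("mutual defection"):
# that is the game's Nash equilibrium.
```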
The multiple-trial game
Experiments have shown that mutual defection is not an individual player’s best strategy if similar situations are expected to arise in the future[1]. In a round-robin computer tournament, sixty-three strategies were pitted against one another over sequences of 200 iterations. The total amount won by each strategy over all of its pairings was then calculated, allocating a predetermined sum of money to each of the four possible outcomes of each iteration. The consistent winner was a strategy called tit-for-tat: it started with a cooperative move, and thereafter imitated every move of defection or cooperation made by the other player. By definition, tit-for-tat could not ‘win’ any particular iteration, but it nevertheless accumulated the most points by achieving high shared scores. The most successful of the strategies tested were those meeting four requirements:
- niceness (willingness not to defect except in retaliation);
- retaliation (a policy of matching defection with defection);
- forgiveness (a policy of always rewarding a return to cooperation: letting bygones be bygones);
- non-envy (willingness to accept deals giving preferential benefits to the other party).
The first two requirements are relatively easy to meet, but failure to meet the last two is the most likely cause of mistakes. If a return to cooperation is not immediately rewarded, there is a danger of a protracted sequence of mutual defections. That danger is in any case present, for example from accidental defections, and it appears that there is advantage in reducing it by retaliating only against sequences of two defections (“tit for two tats”) or by breaking a run of defections with the occasional insertion of a cooperative move.

The danger of mistakes arising from envy is harder to guard against. When psychologists set up games of the iterated prisoner’s dilemma between humans instead of computer programmes, nearly all of the players did badly because they succumbed to envy. In many other game-playing experiments it has been shown that people would rather do down the other player than do down the banker. Other experiments have shown subjects to be willing even to pay for permission to burn someone else’s money[2]. But it has also been shown that people often put greater weight on a desire for fairness than on their own interests. The behaviour pattern known to psychologists as "inequity aversion" has been shown to exist even among monkeys: capuchin monkeys rejected their own rewards when they saw others getting better rewards for the same task.
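The tournament mechanics are easy to reproduce in outline. The following is a minimal round-robin sketch in Python, assuming the conventional per-round payoffs (3 points each for mutual cooperation, 5 and 0 for unilateral defection, 1 each for mutual defection); the strategy lineup and function names are illustrative, not Axelrod's original code:

```python
import itertools

# Conventional per-round payoffs (my points, his points); C = cooperate, D = defect.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(other_history):
    # Cooperate first, then copy the opponent's previous move.
    return other_history[-1] if other_history else "C"

def tit_for_two_tats(other_history):
    # More forgiving: retaliate only after two consecutive defections.
    return "D" if other_history[-2:] == ["D", "D"] else "C"

def always_defect(other_history):
    return "D"

def play(strategy_a, strategy_b, rounds=200):
    # Play one 200-iteration pairing and return strategy_a's score.
    history_a, history_b = [], []
    score_a = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)   # each player sees only the other's past moves
        move_b = strategy_b(history_a)
        score_a += PAYOFF[(move_a, move_b)][0]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a

strategies = {"tit-for-tat": tit_for_tat,
              "tit-for-two-tats": tit_for_two_tats,
              "always-defect": always_defect}
totals = {name: 0 for name in strategies}
for name_a, name_b in itertools.product(strategies, repeat=2):
    totals[name_a] += play(strategies[name_a], strategies[name_b])
for name, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(name, total)
```

In this small lineup, tit-for-tat and tit-for-two-tats finish far ahead of always-defect because they achieve high shared scores when paired with each other, even though neither ever "beats" its opponent within a pairing.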
The ultimatum bargaining game
Unlike the monkeys, humans tend to be averse to both guilt and envy. That has been demonstrated by experimental trials of an ultimatum bargaining game in which:
- A is given the opportunity to share £100 with B, after stating how much of it he is prepared to give to B.
- If B accepts the offer, he gets what is offered, and A retains the rest.
- If B rejects the offer, neither gets anything.
If both were to act rationally, A would offer a small sum and B would accept it. But in typical runs of such experiments, subject A seldom offered much less than £50, and subject B seldom accepted much less.
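The game's payoff rule is simple enough to state as code. Here is a minimal sketch in Python, with an illustrative acceptance threshold standing in for B's sense of fairness (the threshold values are assumptions for illustration, not experimental figures):

```python
def ultimatum(offer, acceptance_threshold, pot=100):
    # Returns (A's payout, B's payout) in pounds for A's offer to B.
    # B rejects any offer below his threshold even though rejection
    # costs him money; the threshold models inequity aversion.
    if offer >= acceptance_threshold:
        return pot - offer, offer
    return 0, 0

# A purely rational B would accept any positive offer:
print(ultimatum(offer=1, acceptance_threshold=1))    # -> (99, 1)
# An inequity-averse B rejects offers he regards as unfair:
print(ultimatum(offer=20, acceptance_threshold=40))  # -> (0, 0)
```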
Applications
The tit-for-tat strategy is in many ways similar to what biologists term an evolutionarily stable strategy, because once it becomes a community’s established strategy, attempts to depart from it are unlikely to succeed.
In the multiple-trial experiments it was assumed that the prisoners could not talk to each other. If that assumption is dropped, a cooperative strategy equivalent to tit-for-tat can be adopted, provided the prisoners are willing to trust each other. The prisoner’s dilemma is, in that respect, a useful allegory for a variety of real-life situations in which there is a resolvable conflict of interests. It is particularly relevant to arms races, in which each side acquires more and more arms for the sole reason that the other side is doing the same. The disarmament agreements between Reagan and Gorbachev in the 1980s are an example of what cooperation can achieve. An important feature of those agreements was, however, their provision for verification by on-site inspection, in accordance with Reagan’s stated policy of “trust, but verify”. On the other hand, the breakdown of the Munich Agreement between Chamberlain and Hitler is a reminder that the trustworthiness of both parties is essential to success.

A strong incentive to be trustworthy is provided by people’s willingness to punish cheats, and it has been found that they often do so even when it is against their immediate interests. In commerce, a reputation for fair dealing among businessmen has generally made mutual benefits possible, even from one-off transactions. A culture of trust has been taken to be one of the essential components of social capital, and societies that possess it have benefited in economic terms. It has even been suggested that the origin of trade and the division of labour, and thus of nearly all economic growth, was the first occasion on which members of a family decided to trust a stranger.
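References
1. Robert Axelrod: The Evolution of Cooperation, Basic Books, 1984.
2. Daniel Zizzo: Money-burning and Stealing in the Laboratory, Oxford University Department of Economics Discussion Paper No. 40, October 2000.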