Economics 101: The best solutions to the Prisoner’s Dilemma

As part of our recurring Economics 101 feature, the Bahrain-US economist Omar Al Ubaydli presented a series of four articles on the economic consequences of the Prisoner’s Dilemma. He outlined how the dilemma works and studied two possible solutions to self-interested behaviours in the marketplace. His lucid and insightful series follows below:

1. The case of Ahmad and the TV store

The prisoner’s dilemma is a parable about two individuals who have jointly committed a crime and must each decide how much to confess to detectives.

The lessons one gains from studying this peculiar scenario apply to a huge array of important settings in economics, including the most fundamental transaction that underlies all of our prosperity – commercial exchange.

In the original prisoner’s dilemma, the two prisoners are held in separate cells from which they cannot communicate. They are suspected of committing a major offence but, without a confession, the district attorney can only indict them for a minor offence. If, upon interrogation, each remains quiet, the two will serve one year for the minor infraction. If one remains quiet while the other betrays his friend, the betrayer walks away free, while the betrayed is jailed for 10 years. If they both betray each other, each serves six years. While it is in their collective interest to keep quiet, each has an incentive to betray the other.
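The incentive to betray can be checked mechanically. The sketch below is only an illustration: it uses the jail terms quoted above, and the names `YEARS` and `best_reply` are made up for this example.

```python
# Years in jail for (my_choice, other_choice), taken from the parable above.
YEARS = {
    ("quiet", "quiet"): 1,
    ("quiet", "betray"): 10,
    ("betray", "quiet"): 0,
    ("betray", "betray"): 6,
}


def best_reply(other_choice):
    """Return the choice that minimises my own jail time, given the other's choice."""
    return min(("quiet", "betray"), key=lambda mine: YEARS[(mine, other_choice)])


for other in ("quiet", "betray"):
    print(f"If the other prisoner plays {other!r}, my best reply is {best_reply(other)!r}")
```

Whichever choice the other prisoner makes, the best reply is to betray, even though both staying quiet (one year each) beats both betraying (six years each).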

How does this apply to commercial exchange? Consider a fictitious UAE citizen, Ahmad, who wishes to buy a TV online from a fictitious UAE-based electronics retailer, iElectronics, for Dh2,000. Ahmad is willing to pay up to Dh3,000 for the TV, while it costs iElectronics Dh1,000 to sell it. Therefore, if Ahmad buys the TV, each party will obtain Dh1,000 worth of benefit.

Executing the transaction is not so simple, however. Once Ahmad communicates his desire to purchase the TV to iElectronics, each of the two parties has a choice between two alternatives. Ahmad can send the money or not; and iElectronics can deliver the TV or not. For simplicity, let us set aside the possibility of complaining to the authorities if one side reneges on its commitment. Then this situation shapes up the same way as the original prisoner’s dilemma. So what choice will Ahmad and iElectronics make?

To analyse this situation, economists use “game theory”: the study of decision-making when the players’ choices are interdependent, meaning that what one party wants to do depends on what other parties are doing.

Let us suppose that iElectronics will send the TV; should Ahmad pay? If he does, he benefits Dh3,000 minus Dh2,000 = Dh1,000. If he does not, he benefits Dh3,000. Thus, setting aside morals, it serves his interests to refrain from paying.

What if iElectronics does not send the TV? If Ahmad pays, he loses Dh2,000; and if he does not pay, he neither gains nor loses anything. Thus again, it serves his interests to refrain from paying.

Thus, whatever iElectronics does, Ahmad’s interests are best served by his refraining from paying.

The retailer faces an equivalent dilemma. If Ahmad pays and it sends the TV, it gains Dh2,000 minus Dh1,000 = Dh1,000; and if it does not, it profits Dh2,000. If Ahmad refrains from paying, sending the TV loses iElectronics Dh1,000, while the retailer neither gains nor loses should it keep the TV. Therefore, its interests are best served keeping the TV, whatever Ahmad does.

Thus, game theory predicts that neither Ahmad nor iElectronics will perform their side of the bargain, and the result is both sides gaining nothing, just like the two prisoners betraying each other and suffering a worse outcome. If the two sides could trust each other, then they could each profit by Dh1,000; but the opportunity to exploit the other side’s trust prevents the gains from being realised. The key is that the interests of each individual are incompatible with those of the collective, and individuals are the ones who make the decisions.
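The same reasoning can be written as a short program. This is a minimal sketch using the dirham figures from the example; the function names are hypothetical, and the test applied is the standard one for a Nash equilibrium (neither side gains by changing its own choice alone).

```python
from itertools import product

PRICE, BUYER_VALUE, SELLER_COST = 2000, 3000, 1000  # dirhams, from the example


def payoffs(ahmad_pays, store_ships):
    """Return (Ahmad's gain, iElectronics' gain) in dirhams."""
    ahmad = (BUYER_VALUE if store_ships else 0) - (PRICE if ahmad_pays else 0)
    store = (PRICE if ahmad_pays else 0) - (SELLER_COST if store_ships else 0)
    return ahmad, store


def is_nash(ahmad_pays, store_ships):
    """True if neither side can gain by unilaterally switching its choice."""
    ahmad, store = payoffs(ahmad_pays, store_ships)
    ahmad_alt, _ = payoffs(not ahmad_pays, store_ships)
    _, store_alt = payoffs(ahmad_pays, not store_ships)
    return ahmad >= ahmad_alt and store >= store_alt


# Enumerate all four combinations of (Ahmad pays?, store ships?).
equilibria = [(p, s) for p, s in product([True, False], repeat=2) if is_nash(p, s)]
print(equilibria)
```

The only combination that survives is (False, False): Ahmad does not pay and iElectronics does not ship, leaving each with nothing instead of the Dh1,000 they could each have earned.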

The breakdown of commerce is a serious issue in the GCC. Many GCC citizens are reluctant to buy online precisely because they fear being cheated by sellers, forcing society to dedicate lots of resources to brick-and-mortar stores.

In addition to explaining the weakness of certain types of commerce, the prisoner’s dilemma also explains why groups often fail to organise contributions to a public good (the free-rider problem, a close cousin of the tragedy of the commons), and why countries struggle to open their economies to each other under trade accords. These are common problems in the GCC and beyond.

An especially timely application for the GCC is Opec and its accord with non-Opec oil producers. In principle, if they can all cut their output together, they will all benefit; however, each producer faces a temptation to cheat on its quota – a temptation made more acute by the difficulty of objectively measuring production. In the period 1980 to 2009, according to a research paper published in 2011 by the American political scientist Jeff Colgan, Opec producers cheated 96 per cent of the time. That represents the classic outcome of the prisoner’s dilemma; most analysts expect a similar outcome this time around.

Are decision-makers doomed to succumb to their own selfishness? There are two primary solutions to the prisoner’s dilemma. The first is introducing repetition, so that when Ahmad observes iElectronics cheating today, he can punish it tomorrow. The second is external enforcement, whereby a third party enforces good behaviour by wielding a capacity to punish cheaters. Let us consider these options …

2a. Repetition as a solution to prisoner’s dilemma

Individuals pursuing their own best interests sometimes set up the worst outcomes for a group – known as the prisoner’s dilemma. One manifestation of this problem in the GCC is the limited role for e-commerce, where buyers and sellers do not trust each other enough to conduct an online transaction. How can repetition help solve the prisoner’s dilemma?

Remember, in the prisoner’s dilemma, each person has the choice between behaving opportunistically (defection) and responsibly (cooperation). The best possible outcome is multilateral cooperation but it is difficult to realise because each person benefits unilaterally from defection.

In particular, defection benefits the defector at the expense of other players. This means that defection has two properties – it serves an individual’s immediate interests and it can serve as a way of punishing others.

Punishment is an asynchronous activity – it is a response to someone else’s actions, not something that you carry out simultaneously to your being mistreated. Since the prisoner’s dilemma is generally applied to simultaneous-move environments, if you find yourself playing a one-off prisoner’s dilemma (like the two prisoners in the original parable), then by the time you realise that you have been wronged and want to take retribution, the game is over and you have no recourse.

Repeating a prisoner’s dilemma is a game-changer because it offers players the chance to retaliate. More importantly, the threat of retaliation can sometimes be strong enough to motivate people to cooperate in the first place. When is this threat strong enough? A group of economists proposed a series of conditions, known as the Folk Theorem. There are two conditions necessary for the threat of retaliation to induce cooperative behaviour. First, the punishment has to be painful for its recipient. Second, the recipient has to care enough about the future, also known as exhibiting sufficient farsightedness, because all punishments are necessarily delayed. If, for example, one of the parties is leaving town the next day or is incredibly impulsive, then even the threat of a severe punishment is insufficient to scare them into virtuous behaviour, and defection will prevail.
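One common textbook way to make the two conditions concrete is sketched below. It rests on assumptions that are not in the article: the game is repeated indefinitely, each player weighs next period's payoff by a "discount factor" `delta` between 0 and 1 (a stand-in for farsightedness), and a single defection is punished by mutual defection forever (the "grim trigger" strategy). The function names are hypothetical.

```python
def min_discount_factor(temptation, reward, punishment):
    """Smallest discount factor at which cooperating forever is at least as good
    as defecting once and then being punished forever (grim trigger)."""
    return (temptation - reward) / (temptation - punishment)


def cooperation_is_sustainable(delta, temptation, reward, punishment):
    """Compare the two discounted payoff streams directly (requires 0 <= delta < 1)."""
    cooperate_forever = reward / (1 - delta)
    defect_once = temptation + delta * punishment / (1 - delta)
    return cooperate_forever >= defect_once


# Per-period dirham payoffs from the TV example in section 1:
# Ahmad gets 3,000 by cheating, 1,000 from an honest trade, 0 once trade breaks down.
print(min_discount_factor(3000, 1000, 0))   # Ahmad
print(min_discount_factor(2000, 1000, 0))   # iElectronics
```

Under these assumptions Ahmad must weigh the future at two-thirds or more of the present, and iElectronics at one-half or more. A harsher punishment or a smaller temptation lowers the threshold, which mirrors the two conditions: the punishment must hurt, and the player must care about tomorrow.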

In the context of online sales in the GCC and beyond, when you imagine retaliating against an unscrupulous seller, you probably think of delivering bad online ratings or a scathing review. These are slightly more complex versions of the punishment conceived in the Folk Theorem and we will discuss them more in the next section. In fact, the most elemental counter-attack you have at your disposal is withdrawing your future custom.

In many cases, this is enough to get people to play cooperatively. For example, most people play a repeated prisoner’s dilemma with their electricity provider: you can choose whether or not to pay and the provider can choose whether or not to provide you with electricity. Online bill payments are prevalent in the GCC and elsewhere.

Should the company decide to defect, by cutting your electricity off, you can respond by not paying future bills, hurting its bottom line. Conversely, should you decide not to pay your bill, the company can cut your electricity off. For the most part, people pay their bill and receive electricity, because of the repetition of the transaction and the farsightedness of both parties. Note that both parties have more complicated punishment methods available but, generally, the most basic form suffices.

In the case of Opec members trying to coordinate output cuts, quota violations serve as both the defection and the punishment. The historic 96 per cent defection rate indicates that there might be impediments beyond the punishment being too weak, or oil ministers being too short-sighted. In fact, a key challenge is devising ways of objectively detecting defections, before the repetition-mediated punishment can be determined.

2b. Repetition and the power of (perceived) righteousness

In business and elsewhere, interacting repeatedly with someone helps you to overcome the temptation to behave opportunistically – you fear the consequences of such opportunism, including direct punishment (see section 2a above). Having your reputation tarnished is a more subtle mechanism for motivating good behaviour, and it is one that has a rich history in traditional GCC commerce. How exactly does it work in the prisoner’s dilemma?

Recall that the prisoner’s dilemma is a situation where individuals pursuing their own interests leads to the worst outcome for the group. Repetition allows you to retaliate against selfish behaviour with selfish behaviour; if being selfish is sufficiently damaging to the other party and they care enough about future consequences, then the threat of punishment will induce good behaviour. But sometimes the threat is too weak, as demonstrated by Opec’s historic inability to effect coordinated output cuts.

In the 1980s, economists working in the field of game theory articulated a different way in which repetition induces cooperation. Imagine that the world is populated by the selfish, Hobbesian beings considered in the original prisoner’s dilemma, who struggle – and typically fail – to resist the temptation to behave selfishly. But imagine that there is also a small proportion of righteous people, who refuse to behave selfishly in a prisoner’s dilemma regardless of the material benefits because they care so much about “doing the right thing” (cooperating). Economists showed how the presence of a small number of such saints can be sufficient to convince all the Hobbesian wretches to behave well.

The key condition is that when you enter a prisoner’s dilemma scenario with someone, you cannot be sure if they are righteous or selfish. The uncertainty creates an incentive for people to want to pretend that they are righteous, even if they are actually worthless misanthropes, because everyone wants to do business with the righteous and avoid the selfish.

While your righteousness cannot be ascertained by others based on your physical appearance, your behaviour is a giveaway: since righteous people always cooperate, anyone observed behaving selfishly is guaranteed to be a selfish type, and one slip is all that is necessary for you to be branded. If the benefits of being perceived to be righteous, including repeat business and new opportunities, are sufficiently large, then it becomes optimal for selfish people to cooperate in prisoner’s dilemma settings because they are so keen on having others treat them as potentially righteous.

If this incentive is strong enough, then everyone behaves righteously and everyone continues to be treated as if they might be righteous, but nobody is ever sure who the actual righteous people are.
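The logic of branding can be illustrated with Bayes' rule. This is a simplified sketch, not the economists' full model: it assumes a single observed action, a known share `prior` of righteous people (greater than zero), and a probability `selfish_coop_prob` that a selfish person chooses to cooperate. All names are hypothetical.

```python
def posterior_righteous(prior, selfish_coop_prob, observed_cooperation):
    """Probability that a partner is righteous after one observed action.

    Assumes prior > 0 and that righteous types always cooperate.
    """
    if not observed_cooperation:
        # Righteous types never defect, so one defection reveals a selfish type.
        return 0.0
    # Bayes' rule: P(righteous | cooperated).
    return prior / (prior + (1 - prior) * selfish_coop_prob)


# 10 per cent of people are righteous in each case.
print(posterior_righteous(0.1, 0.0, True))   # selfish types never cooperate
print(posterior_righteous(0.1, 1.0, True))   # selfish types always mimic the righteous
print(posterior_righteous(0.1, 0.5, False))  # any defection
```

When selfish types never cooperate, good behaviour proves righteousness. When they all mimic the righteous, observing cooperation leaves the belief at the original 10 per cent, so nobody learns who the genuine saints are, while a single defection drops the belief to zero.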

We see this principle regularly in our day-to-day lives. Some GCC companies are owned and operated by genuinely righteous people and this is reflected in charitable contributions and good treatment of employees. Such acts confer a potential advantage upon the company as customers and workers like to be associated with ethical businesses. The vast majority of companies, however, prioritise profits, but they also want to secure the benefits associated with projecting an image of ethical operations. As such, even ruthlessly capitalistic firms might run large corporate social responsibility programmes as part of an elaborate masquerade, fearing the adverse commercial consequences of being branded an “unethical” company.

However, sometimes the fear of punishment and of having one’s reputation tarnished are still not enough to ensure cooperation, as reflected in the case of Opec and output cuts. Some members have actually cultivated a reputation as being egregious quota violators, apparently without any adverse consequence. Maintaining our Hobbesian theme, achieving the best outcome in certain prisoner’s dilemma scenarios, especially in GCC commerce, is possible only via some sort of leviathan, which we discuss below.

3. External enforcement as a solution to prisoner’s dilemma

The prisoner’s dilemma is a caricature of the many situations we face in our day-to-day lives where cooperation leads to a better outcome for all, but where the desire to pursue one’s own interests results in a worse outcome for all. Tackling global climate change is arguably the most prominent example today.

In the case of the GCC, commercial exchange, which benefits all parties, is often undermined by selfish behaviour by buyers, such as bouncing cheques or reneging on credit agreements, and by sellers, such as delivering faulty merchandise. This is one possible explanation for why e-commerce has, until recently, been slow to catch on in the GCC.

As we learnt above, repetition can promote cooperation in prisoner’s dilemma situations by allowing people to punish non-cooperators, and by allowing people to reap the benefits of cultivating a reputation for being cooperative. Yet, as we see in the case of Opec or climate change, even that is not enough.

An alternative solution is external enforcement, loosely equivalent to English philosopher Thomas Hobbes’ Leviathan: a third party possesses the power to forcefully punish those engaging in selfish behaviour in prisoner’s dilemma situations, ensuring cooperative behaviour.

In the case of GCC commerce, that third party is usually the government, which enacts and enforces commercial laws that protect both sides of the market. For example, writing cheques that bounce can lead to your imprisonment and, as we saw in Bahrain in mid-January when several retailers were found to be displaying prices inaccurately, commercial malpractice can lead to fines and temporary closures.

Sometimes, a prisoner’s dilemma setting falls under an external enforcer’s purview by default, without the prior approval of the people playing the prisoner’s dilemma. As an illustration, if a pirate attempts to interrupt trade in the Strait of Hormuz, the US Fifth Fleet, a de facto external enforcer, is likely to intervene without the need to refer to a prior agreement between the pirate and the targeted vessel.

Alternatively, in many of these settings, the stakeholders will voluntarily invite an external enforcer and endow them with the tools required to enforce order. For example, in the GCC, generic courts can sometimes be slow to dispense justice in specialised commercial disputes, which encourages opportunistic deviations from contractual agreements.

As an antidote, contracting parties may agree to abide by the ruling of a private, independent arbitrator that is dedicated to the swift resolution of commercial disputes. One way of strengthening the arbitrator’s hand is getting both parties to post a bond at the contract’s outset, which the arbitrator can use to punish the party assessed to have violated the terms of the contract.
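To see how a bond changes incentives, the sketch below reuses the dirham figures from the TV example in section 1. It assumes, purely for illustration, that a party who fails to honour the deal simply forfeits its whole bond and that the arbitrator always identifies the violator correctly; the function names are hypothetical.

```python
PRICE, BUYER_VALUE, SELLER_COST = 2000, 3000, 1000  # dirhams, from the TV example


def payoffs(ahmad_pays, store_ships, bond):
    """Gains in dirhams when a party that reneges forfeits its bond."""
    ahmad = (BUYER_VALUE if store_ships else 0) - (PRICE if ahmad_pays else bond)
    store = (PRICE if ahmad_pays else 0) - (SELLER_COST if store_ships else bond)
    return ahmad, store


def honouring_is_dominant(bond):
    """True if each side strictly prefers to honour the deal whatever the other does."""
    ahmad_ok = all(
        payoffs(True, ships, bond)[0] > payoffs(False, ships, bond)[0]
        for ships in (True, False)
    )
    store_ok = all(
        payoffs(pays, True, bond)[1] > payoffs(pays, False, bond)[1]
        for pays in (True, False)
    )
    return ahmad_ok and store_ok


for bond in (0, 1500, 2500):
    print(bond, honouring_is_dominant(bond))
```

With no bond the original dilemma returns. A Dh1,500 bond is enough to discipline iElectronics, which saves only Dh1,000 by keeping the TV, but not Ahmad, who saves Dh2,000 by withholding payment. Once the bond exceeds Dh2,000, honouring the contract is the better choice for both sides regardless of what the other does.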

On the surface, external enforcers may seem like a definitive solution to prisoner’s dilemma problems, arguably superior to the intermittently successful role of repetition and reputation. However, external enforcement comes with its own drawbacks.

First, third-party enforcement requires resources: courts and arbitrators are expensive inside and outside the GCC, and the US Fifth Fleet costs millions of dollars to operate.

Second, in the case of external enforcers that need to be granted the authority to punish miscreants, attaining the requisite consensus among stakeholders can be impossible. The difficulties that the United Nations faces in holding violators of international accords accountable are well documented, and they stem from the inability of member countries to agree upon a system of enforcement. Opec’s ineffectiveness is also symptomatic of failure to forge a consensus on how to monitor production and punish quota violators.

In most cases, beyond issues relating to operating costs, the reluctance to devolve power to a third party stems from a lack of trust that the third party will behave impartially and serve the general interest. That is why supranational organisations such as the IMF and World Trade Organization function primarily as intermediaries and coordinators rather than as economic overlords.

We welcome economics questions from our readers via email (omar@omar.ec) or tweet (@omareconomics).

Omar Al Ubaydli is the programme director for international and geopolitical studies at the Bahrain Centre for Strategic, International and Energy Studies, and an affiliated associate professor of economics at George Mason University in the US.
