University of Warsaw
e-mail: miekisz@mimuw.edu.pl
Prof. Jacek Miękisz, a mathematician, works in the Section of Biomathematics and Game Theory of the Institute of Applied Mathematics at the University of Warsaw. He is interested in statistical physics, quasicrystals, gene expression and regulation, intracellular signaling pathways, time delays, and evolutionary game theory.
From the perspective of an individual, the best choice is not to cooperate, but simply to take advantage of the goods created by irrational altruists. If everyone cooperates, however, there are more goods to distribute. This paradox is fundamental for understanding situations referred to as social dilemmas.
Game theory is the study of models of interaction between individuals (players) who can choose between different strategies, or behaviors. The outcome of each game is defined in terms of benefits for the players (called payoffs), the value of which depends not only on a specific player’s strategy but also on the strategies chosen by other players. That is a fundamental characteristic of game theory and one that differentiates it from classic optimization, which involves searching for an optimal solution – that is, the best solution from the perspective of an individual – in response to unchanging external conditions. In game theory, each player’s actions have an impact on his surroundings, and vice versa. Individual rationality is fundamentally based on the assumption that every player seeks to maximize his payoff without showing concern for other players. When choosing a strategy, a player must take into account the behavior of his opponents and anticipate what they may expect of him. These considerations prompted John Nash to introduce a certain kind of equilibrium: a set of strategies, one chosen by each player in a game, in which no player stands to benefit from changing his strategy unilaterally, because any such deviation will not increase his payoff and may even diminish it. Thus, an unwritten agreement comes into effect that no one has an incentive to break. Such a “Nash equilibrium” offers a certain security, but does it ensure optimal payoffs for the players?
A classic example used to illustrate the essence of social dilemmas is the game called the Prisoner’s Dilemma, in which two suspects are placed in separate cells and questioned. Each suspect may testify against the other (defection, D) or keep silent (cooperation, C). If neither strikes a bargain with the police and they both choose cooperation as their strategy, their prison sentences will be reduced by three years. If one of them decides to cover for the other (cooperation), but the other makes a deal with the police (defection), the former will get a maximum sentence, whereas the latter will have his sentence reduced by five years. If they both choose to betray each other (each of them chooses to defect), they will each have their sentences reduced, but only by one year. We can abstractly represent the Prisoner’s Dilemma with the following matrix, viewing the prison term reductions as payoffs:
      C   D
C     3   0
D     5   1
Every player has two possible moves: to cooperate (C) or defect (D). Each row corresponds to a player’s own move and each column to the opponent’s move; the cell at their intersection lists the payoff to the player choosing the row. As we can see from this matrix, each player benefits from betraying the other (choosing the bottom row, rather than the top row) whatever the other does, so mutual defection constitutes a Nash equilibrium. However, both players would benefit more, both gaining a lighter sentence, if they both chose to cooperate. This illustrates a social dilemma in its most elementary form. What should we do? How can we find a way out of the Nash equilibrium?
In the 1970s, John Maynard Smith expanded on classic game theory by proposing evolutionary game theory. His formulation was quickly followed by specific dynamic models. Simply put, an evolutionary game is one that involves many players and repeated games. Players usually play games in pairs, for example the Prisoner’s Dilemma. In consecutive rounds, the proportion of players who choose a given strategy changes depending on the payoff it brought in the previous round. Such behavior follows directly from the very essence of Darwin’s theory of evolution: the fittest individuals have more offspring, who inherit their traits (here, their strategies in the game).
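The claim that mutual defection is a Nash equilibrium of the Prisoner’s Dilemma matrix can be checked mechanically. The following is a minimal sketch in Python; the function names are our own illustrative choices, and the only data taken from the text are the four payoffs.

```python
from itertools import product

# Payoffs to the player who makes the first move of each pair, taken from
# the Prisoner's Dilemma matrix in the text (years of sentence reduction).
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}
MOVES = ("C", "D")


def is_nash(a, b, payoff=PAYOFF):
    """True if neither player gains by unilaterally switching moves.

    The game is symmetric, so the second player's payoff for the
    pair (a, b) is payoff[(b, a)].
    """
    first_ok = all(payoff[(a, b)] >= payoff[(alt, b)] for alt in MOVES)
    second_ok = all(payoff[(b, a)] >= payoff[(alt, a)] for alt in MOVES)
    return first_ok and second_ok


def pure_nash_equilibria(payoff=PAYOFF):
    """List every pair of moves that is a Nash equilibrium."""
    return [(a, b) for a, b in product(MOVES, MOVES) if is_nash(a, b, payoff)]


print(pure_nash_equilibria())  # [('D', 'D')]
```

Mutual defection is the only pair that survives the check, even though each player’s payoff there (1) is lower than under mutual cooperation (3), which is exactly the dilemma.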
Robert Axelrod expanded the set of permissible strategies and staged computer tournaments according to the aforementioned rules. The extension of the rules was necessary: if we allow only unconditional defection or unconditional cooperation, the population will soon be dominated by defectors. It turned out that a strategy called tit-for-tat worked quite well: a player cooperates in the first round and in every later round mimics the opponent’s previous move. In other words, he betrays a defector and cooperates with an altruist. Tit-for-tat beats many other strategies. As we can easily observe, however, it still loses out to unconditional defection, but the conclusion is that cooperation should perhaps not be completely written off in the Darwinian world.
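To see how tit-for-tat fares, we can play a short repeated Prisoner’s Dilemma by hand or in code. The sketch below is not a reconstruction of Axelrod’s tournaments; it only pits two strategies against each other, and the number of rounds (10) is an arbitrary assumption.

```python
# Payoffs to the first player, from the Prisoner's Dilemma matrix in the text.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Unconditional defection."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Return the total payoffs (a, b) after a repeated game.

    rounds=10 is an illustrative choice, not a value from the text.
    """
    moves_a, moves_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each strategy sees only the opponent's past moves
        b = strategy_b(moves_a)
        total_a += PAYOFF[(a, b)]
        total_b += PAYOFF[(b, a)]
        moves_a.append(a)
        moves_b.append(b)
    return total_a, total_b


print(play(tit_for_tat, always_defect))  # (9, 14)
print(play(tit_for_tat, tit_for_tat))    # (30, 30)
```

In the direct encounter tit-for-tat is exploited only once, in the first round, and so ends up slightly behind the defector. Two tit-for-tat players, however, each collect far more than either player in that match, which hints at why the strategy can do well overall.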
Since that time, many new mathematical models have emerged that promote cooperation strategies, sometimes even very strongly. These include models based on reputation (cooperating with those who have gained their reputation by helping others), aspirations (when we are unsatisfied with the small payoffs we get by betraying defectors), and finally spatial games with local interactions. In spatial games, an individual who cooperates tends to be surrounded by cooperating neighbors, which fosters cooperation, and a group of defecting neighbors gets lower average payoffs than a group of cooperating altruists.
Such models include additional mechanisms for promoting cooperation that were not present in the classic version of the Prisoner’s Dilemma and may have emerged in the course of evolution. Such mechanisms can be used in studies of the development of certain behaviors at a very general, philosophical level (or, in other words, from the mathematical perspective). In the vast majority of cases, studies of evolutionary models are not linked to any biological or social data or psychological experiments. Payoffs are arbitrary (like the aforementioned reduced sentences in the Prisoner’s Dilemma) and bear no relation to reality. Nevertheless, game theory may serve as a tool in the study of certain aspects of natural and social sciences, if we calibrate payoffs using empirical data.
Another interesting application of evolutionary game theory involves studying the possibility of incorporating mechanisms for promoting cooperation into the process of crafting legislation and the functions of public institutions. In a social dilemma referred to as the Commons Dilemma (or the “Tragedy of the Commons”), cattle herders who share a pasture choose how many of their cows to graze on it. The best strategy for each player, regardless of the strategies adopted by other players, involves keeping as many cows on the pasture as he or she can. Such behavior results in a Nash equilibrium in which the players’ payoffs (daily amount of milk) are far from maximum as a result of overgrazing. However, if we impose an adequately calculated tax for placing the maximum number of cows on the pasture (thereby altering the matrix of payoffs), the Nash equilibrium of the modified game will secure maximum payoffs for all players, even though they all have fewer cows on the pasture. Importantly, no one has to pay any taxes in this new equilibrium. Such a mechanism could be described as a kind of non-authoritarian coercion. As this example illustrates, public institutions should strive to eliminate real-life social dilemmas by introducing legal regulations that lay the groundwork for Nash equilibria involving optimal payoffs.
The notion of introducing such regulations prompts several questions. Can we really be better people than the theory of evolution says we are? Can we guide our behavior at this stage of development to mitigate the effects of evolution and its theoretical predictions? Finally, can we beat evolution or steer it in a desired direction?
In Darwin’s theory of evolution, which describes the origins and demise of species, individuals do not make rational decisions. They simply reproduce, at a faster or slower rate. When studying human behavior, however, we must take into account the additional factor of free will, underpinning the rationality (or irrationality) of decisions. Instead of the thoughtless replications of Darwinian inheritance, therefore, we need to incorporate learning mechanisms into our scientific models. For that reason, the long-term behavior of biological and social evolutionary systems may be diametrically different.
How can mathematicians, by further developing game theory, contribute to these fundamental debates? Evolutionary game theory models are dynamic models in which the population of players changes according to strictly defined rules. In order to describe Axelrod’s tournaments, for example, we could use a system of differential equations referred to as replicator dynamics, which describe the time evolution of the frequencies of individual strategies. The rate at which the frequency of a given strategy changes is proportional to that frequency and to the difference between the average payoff of this strategy and the average payoff of the entire population. And this is where Darwin’s theory again comes into play: if the payoff of a given strategy is greater than the average payoff of the population, its frequency rises. In the Prisoner’s Dilemma, the situation is clear: irrespective of the initial conditions (except for a population that consists exclusively of altruists), the population will be quickly dominated by defectors. In more complicated models, characterized by multiple strategies (such as tit-for-tat) or other dynamic rules (such as ones based on imitating the opponent’s strategy), mathematicians seek to answer specific questions. In what classes of models and for what parameters do the stationary states (ones that do not evolve any further) constitute Nash equilibria? Are they stable, or in other words, will the system return to a stationary state after a small deviation? Are cyclical behaviors possible? Do certain strategies tend to die out? Can multiple strategies coexist?
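The takeover by defectors can be illustrated numerically. The sketch below integrates the two-strategy replicator equation dx/dt = x (f_C - f_avg), where x is the fraction of cooperators, with a simple Euler scheme. The step size, the number of steps and the starting fraction are assumptions chosen for illustration; only the payoff matrix comes from the Prisoner’s Dilemma discussed earlier.

```python
# Payoff of the row strategy against the column strategy (index 0 = C, 1 = D),
# using the Prisoner's Dilemma payoffs from the text.
PRISONERS_DILEMMA = [[3, 0], [5, 1]]


def replicator_step(x, payoff, dt):
    """One Euler step of dx/dt = x * (f_C - f_avg).

    x is the fraction of cooperators in the population.
    """
    f_c = payoff[0][0] * x + payoff[0][1] * (1 - x)  # average payoff of C
    f_d = payoff[1][0] * x + payoff[1][1] * (1 - x)  # average payoff of D
    f_avg = x * f_c + (1 - x) * f_d                  # population average
    return x + dt * x * (f_c - f_avg)


def simulate(x0, payoff, dt=0.01, steps=5000):
    """Fraction of cooperators after `steps` Euler steps (dt, steps: assumed)."""
    x = x0
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


# Start with 99% cooperators: the defectors still take over.
print(simulate(0.99, PRISONERS_DILEMMA))  # a value very close to 0
# A population made up only of cooperators stays as it is.
print(simulate(1.0, PRISONERS_DILEMMA))   # 1.0
```

Even a single percent of defectors is enough: their payoff always exceeds the population average, so their share keeps growing.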
It is usually assumed that any interactions take place instantaneously and their results are immediate. In reality, payoffs in evolutionary processes, which means changes in fitness, are delayed. In social interactions, people make decisions based on their knowledge of past events. Time delays may cause systems to oscillate around equilibrium points. Let us consider replicator dynamics with an asymptotically stable equilibrium that describes the coexistence of cooperation and defection in a game called Snowdrift. Let us imagine two travelers who cannot continue their journey by car, because the road is blocked by a snowdrift. Let us assume the cost of removing the snow is 2, whereas the reward (for managing to reach home) is 4. Each player can either cooperate by helping to remove the snow, or wait for the other traveler to do so. If both shovel, they share the cost, so each receives 4 - 1 = 3; a traveler who shovels alone receives 4 - 2 = 2, while the one who waits receives the full 4; if neither shovels, both receive 0. We thus obtain the following payoff matrix:
      C   D
C     3   2
D     4   0
In this particular game, the best response to cooperation is defection and the best response to defection is cooperation. We have demonstrated that in social models, where players react with a certain delay to information about the previous state of the system, oscillations around the equilibrium point may occur when the delays are sufficiently long. In biological models, in turn, where past events affect changes in present fitness, the coexistence of both strategies is stable for any time delay. Examining the stability of equilibria and the emergence of cycles in mathematical models of social dilemmas is an extremely important direction of research.
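The coexistence equilibrium of the Snowdrift game can be computed directly from its payoff matrix. The sketch below finds the fraction of cooperators at which both strategies earn the same average payoff and checks on which side of it cooperation pays more. It deliberately ignores time delays, which are the subject of the research described above, and the function names are our own.

```python
from fractions import Fraction

# Snowdrift payoffs from the text: rows and columns ordered C, D.
SNOWDRIFT = [[3, 2], [4, 0]]


def payoff_gap(x, m=SNOWDRIFT):
    """Average payoff of C minus average payoff of D when a fraction x cooperates."""
    f_c = m[0][0] * x + m[0][1] * (1 - x)
    f_d = m[1][0] * x + m[1][1] * (1 - x)
    return f_c - f_d


def mixed_equilibrium(m=SNOWDRIFT):
    """Fraction of cooperators at which C and D earn the same payoff.

    Solves a*x + b*(1 - x) = c*x + d*(1 - x) for x.
    """
    (a, b), (c, d) = m
    return Fraction(d - b, a - b - c + d)


x_star = mixed_equilibrium()
print(x_star)                            # 2/3
print(payoff_gap(Fraction(1, 2)) > 0)    # True: too few cooperators, C does better
print(payoff_gap(Fraction(9, 10)) < 0)   # True: too many cooperators, D does better
```

Below two thirds of cooperators, cooperation earns more and spreads; above it, defection does. Without delays, the population is therefore pushed back toward the mixed state from both sides.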
The third inherent component of Darwin’s theory, after the aforementioned selection and inheritance, involves random mutations. In order to deal with them, we need to introduce random (stochastic) elements into our models. We will then be studying stochastic processes – in the simplest case Markov chains, where the probability that the system will adopt a certain state depends on its state in the immediately preceding moment, not on the entire history of the system. In such situations, we want to know how probabilities of specific states change and whether they tend to any specific values. We also want to know the probability of cooperation in the long run. Defection very often turns out to be stochastically stable. Again, we look for additional mechanisms fostering cooperation, this time in an uncertain world that is subject to stochastic fluctuations.
As mentioned earlier, cooperation may be promoted by a certain spatial distribution of players. In spatial games, we define this distribution by positioning players on the vertices of graphs that form so-called social networks. In this case, players play two-player games with their immediate neighbors and their payoffs are the sums of the payoffs from individual games. Recent studies show that Barabási-Albert graphs are especially conducive to cooperation. We form such graphs by adding new vertices to a network one at a time and connecting each new vertex by an edge to an already existing vertex, with a preference for vertices that already have many edges. Such preferential attachment of new connections makes popular vertices even more popular. After receiving payoffs, the players positioned on the vertices of the graph look at their neighbors and mimic the neighbor whose strategy earned the highest payoff in the previous round, while the probability that they will choose a worse strategy is very low. In the Prisoner’s Dilemma played according to these rules, almost all individuals cooperate after a sufficiently high number of rounds. Vertices with a very high number of connections, called hubs, play a major role in the strengthening of cooperation.
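The growth rule for such graphs is easy to sketch in code. The example below follows the verbal description above, attaching each new vertex by a single edge to an existing vertex chosen with probability proportional to its degree. The starting graph (one edge between vertices 0 and 1), the random seed and the function names are assumptions made for illustration, and the sketch covers only the construction of the network, not the game played on it.

```python
import random


def preferential_attachment_graph(n, seed=0):
    """Grow a graph with n >= 2 vertices, one new vertex and one edge at a time.

    Each new vertex is linked to an existing vertex drawn with probability
    proportional to that vertex's current degree.
    """
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Every vertex appears in this list once per incident edge, so a
    # uniform draw from it is a degree-proportional draw.
    endpoints = [0, 1]
    for new in range(2, n):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend((new, target))
    return edges


def degrees(edges):
    """Map each vertex to its number of edges."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


graph = preferential_attachment_graph(200)
deg = degrees(graph)
# The best-connected vertices are the hubs mentioned in the text.
hubs = sorted(deg, key=deg.get, reverse=True)[:3]
print(hubs, [deg[h] for h in hubs])
```

Running the sketch with different seeds typically shows a few early vertices collecting a large share of the edges, while most vertices keep only the single edge they arrived with.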
By introducing a fixed cost for the maintenance of a connection between neighbors into this dynamic, we used computer simulations to demonstrate that if the costs are sufficiently low, then almost everyone cooperates, but that there is a certain critical cost that will dramatically lower cooperation in the population to 20%. A further increase in costs does not result in cooperation being additionally lowered. What is more, for the critical-level cost, the share of cooperators in the population ranges from 20% to 100%. This is a phenomenon familiar from the statistical mechanics of many-body systems: a system at critical temperature may be simultaneously in two phases, a typical example being the coexistence of ice and water at the same temperature. The mathematical analysis of such “phase transitions” in social systems is a subject of very intensive research (also by the present author).
By drawing conclusions from the models they construct, mathematicians show what worlds are possible. As the case of evolutionary game theory illustrates, by engaging in interdisciplinary cooperation with biologists and social researchers, they can help to explore the nature of the reality we live in, the reasons behind the occurrence of altruistic acts, and what needs to be done to encourage people to cooperate for the benefit of society at large.
Further reading:
Maynard Smith J. (1982). Evolution and the Theory of Games. Cambridge University Press.
Malawski M., Wieczorek A., Sosnowska H. (2016, 1st ed. 1997). Konkurencja i kooperacja – teoria gier w ekonomii i naukach społecznych [Competition and Cooperation – Game Theory in Economics and Social Sciences]. PWN.
Miękisz J. (2008). Evolutionary game theory and population dynamics. Lecture Notes in Mathematics 1940: 269–316.
Miękisz J. (2009). Dylemat więźnia a ewolucja [The Prisoner’s Dilemma and Evolution]. Wiedza i Życie, 2: 58–60.
© Academia 1 (49) 2016