Hi. In this lecture we're going to talk about replicator dynamics, and what we
want to do is we want to talk about it in the context of learning. And the idea here
again is pretty straightforward. What we imagine is that there's some set of types
out there and these types are actions or they're strategies. And each type has a
payoff associated with it. That's how well that type is doing. When we look at them,
we think boy, these type 1's are doing really well. These type 7's are not doing
so well. And then there's also a proportion of each type. So maybe ten
percent of people are type one and 30 percent of people are type two, six
percent of people are type three and so on. So when you think about how people
learn, how do they decide what to do, well, we've talked about a couple of things in this class. One is we've talked about how people just copy other people, and the reason you might do that is because you might think that they're doing something worthwhile. So in the standing ovation model, we just had people copying what other people do; with conformity, people just copy what other people do. Well, if you copy, what's going to happen is you're gonna copy in proportion to what other people are doing. Now another thing you might do is you might hill climb; that was one of our characteristics. If you hill climb, what you're going to do is look and see which actions are paying off well. So when we think about this sort of environment, where there's a whole population of people, they exist in different proportions, and they're getting different payoffs, we want some
way of capturing the dynamics of that process. And so what we're going to do is
we're going to introduce this model called replicator dynamics that's one way of
thinking about how that dynamic unfolds. Now let's again suppose you were rational. If you're rational, you look out there and you see a bunch of types. You see that
there's different strategies people are playing: strategy one, strategy two and
strategy three. This one has a payoff of five, this one has a payoff of four, this
one has a payoff of three. You're just going to say, I'm going to choose strategy
one. I look out there and the strategy one people are doing the best. That's a rational model. You could also have a more sociological model, a rule-based model. You say, 'I'm just going to copy the next person I meet, figuring that if they're doing this they must have chosen it for some good reason,' and so I'm going to pick in proportion to what other people are doing. So now if I look at strategy one, strategy
two, and strategy three, I can say twenty percent of people are using strategy one, 70 percent are using strategy two, and ten percent are using strategy three. I'm more
likely to choose strategy two because I'm more likely to bump into someone using
strategy two. So there's two ways you could think about how people might choose
what to do. One would be to really do a detailed analysis of which actions seem to
be paying off the best, be rational, and pick that action. Or another thing you could do is just copy other people, in proportion to how many of them are using each strategy. The question is how do we combine those in a model. The idea is we're gonna wanna put a weight on each possible action, and we want that weight to include both the payoff, which is this thing pi sub i, and the proportion, which we'll call the probability of i. One thing we could do is add those things up. We could say the weight is just the payoff plus the proportion. Another thing we could do is make the weight the product: the payoff times the proportion. We could do either one of these. What we're gonna do is use this one: the weight is gonna be the probability that the strategy's being used times its payoff. Why? Well, here's why. Suppose that you
had something that had a probability equal to zero, so nobody's using this strategy. Well, then there would be no way of seeing it, and so it would be very unlikely that anyone would adopt it. But if you think about the model where the weight on an action is equal to the payoff plus the probability, then if the payoff is really high, even if the probability is zero, people would use it. Well, again, we're thinking of a population of people who are copying rather than inventing things on their own: if no one is using a strategy, you couldn't possibly see it, and you couldn't possibly use it, so there's no room for doing anything new. So if the probability is zero, we're just going to assume that no one will ever think of it. That wipes out the additive model. We're going to assume the weight is the product of the strategy's payoff and its proportion. So what that means is we get a somewhat complicated procedure for figuring out how many people use the strategy in the next period, and this is going to be the replicator
equation. So here's the idea. The probability that you play a strategy in period t plus one is just the ratio of its weight to the total weight of all the strategies. Because remember, the weight is just the probability that somebody plays the strategy times its payoff, and on the bottom we're just summing over the weights of all the different actions or strategies. So the probability that you play something in the next period, the probability that someone is of that particular type, is just gonna be its relative weight.
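In symbols, with P_i(t) for the proportion playing strategy i and pi_i for its payoff, the update just described is:

\[
w_i = P_i(t)\,\pi_i, \qquad
P_i(t+1) \;=\; \frac{w_i}{\sum_j w_j} \;=\; \frac{P_i(t)\,\pi_i}{\sum_j P_j(t)\,\pi_j}
\]

Notice that if P_i(t) is zero, then w_i is zero too, which is exactly the point made above about strategies that nobody is using.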
Okay, so let's do an example. We've got three strategies; they have payoffs of two, four, and five, and they exist in proportions one third, one sixth, and one half. So what we want to do now is figure out the weight of each of
these strategies. So the weight on strategy one is its proportion, which is one-third, times its payoff, which is two, so that's two-thirds. The weight on strategy two is its proportion, which is one-sixth, times its payoff, which is four, which is also two-thirds. And the weight on strategy three is its proportion, which is one-half, times its payoff, which is five, which is gonna be five halves. What we can do is put everything over six if we want: so this is four over six, this is four over six, and this is gonna be fifteen over six. So if we add up the
total weights, what we're going to get is 23 over six. So now we want to figure out what proportion is going to be using strategy one in the next period. In period t plus one, that's just going to be four over six divided by 23 over six, which is four over 23. The probability of strategy two is also four over 23, because it has the same weight. And the probability of strategy three is fifteen over six divided by 23 over six, which is fifteen over 23. You'll notice that if we add those up, four plus four plus fifteen gives us 23 over 23. So what we did is we started out with a population that was a third strategy one, a sixth strategy two, and a half strategy three, and we ended up with four twenty-thirds strategy one, four twenty-thirds strategy two, and fifteen twenty-thirds strategy three. That's how replicator dynamics works: it tells us how this population moves over time as a function of the payoffs and the proportions.
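Just to double-check that arithmetic, here's a minimal sketch in Python of a single replicator-dynamics step; the function name replicator_step is only for illustration, but the numbers it prints are the ones worked out above.

```python
from fractions import Fraction

def replicator_step(proportions, payoffs):
    """One replicator-dynamics step: weight each strategy by
    proportion * payoff, then normalize the weights so they sum to one."""
    weights = [p * f for p, f in zip(proportions, payoffs)]
    total = sum(weights)
    return [w / total for w in weights]

# The example above: payoffs 2, 4, 5 with proportions 1/3, 1/6, 1/2.
proportions = [Fraction(1, 3), Fraction(1, 6), Fraction(1, 2)]
payoffs = [2, 4, 5]
print(replicator_step(proportions, payoffs))
# [Fraction(4, 23), Fraction(4, 23), Fraction(15, 23)]
```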
So here's what we want to do: we want to apply this to games. Here's a very simple game. We
can think of this as the shake-bow game where shaking has a higher payoff and we
can ask, 'How do the dynamics change?' So what we're going to do is we're going to
assume that there's some population of people, some are shakers, some are bowers,
and we'll see how that population learns. Alright let's get started. So let's
suppose we start out with one-half shakers and one-half bowers; that's our original population. Now we want to know what's the payoff. So these are our proportions. Well, the payoff: if you're a shaker, half of the time you're going to meet a shaker and half of the time you're going to meet a bower. If you meet a shaker, you're going to get a payoff of two; if you meet a bower, you're going to get a payoff of zero. So your expected payoff is one. If you're a bower, half the time you're gonna meet a shaker and get a payoff of zero, and half the time you're gonna meet a bower and get a payoff of one. So your payoff is gonna be a half. So now we just have to figure out the weight for each of these strategies. The weight on shaking is just gonna be the proportion, which is one half, times the payoff, which is one, so that weight is gonna be one half. The weight on bowing is the proportion of bowers, which is one half, times the payoff, which is one half, which is gonna be one fourth. So what we get, if we want to figure out how many shakers and bowers there are in the next period, is that the probability of a shaker is gonna equal one half, the weight on shaking, over one half plus one fourth, the weight on shaking plus the weight on bowing. So we're gonna get that that's just two thirds. And the probability that someone's a bower is gonna be one fourth over one half plus one fourth, which is one third. So what we see is we started out with equal numbers of shakers and bowers, and now we're moving toward more shakers. But that makes sense, because
shakers get a higher payoff. Now, if we ran this a whole bunch of times using replicator dynamics, starting out with equal numbers of shakers and bowers, eventually we'd end up with all shakers.
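Here's a rough sketch of that in Python. One caveat: the lecture never writes out the payoff matrix explicitly, so the numbers below (2 for shake meeting shake, 1 for bow meeting bow, 0 for a mismatch) are inferred from the payoffs used above and from the 2-2 outcome discussed next; the names are just illustrative.

```python
from fractions import Fraction

# Shake-bow payoffs inferred from the lecture: shake/shake = 2,
# bow/bow = 1, and a mismatch gives both players 0.
PAYOFF = {("shake", "shake"): 2, ("bow", "bow"): 1,
          ("shake", "bow"): 0, ("bow", "shake"): 0}

def replicator_step(pop):
    """Weight = proportion * expected payoff against the current mix, then normalize."""
    expected = {s: sum(pop[t] * PAYOFF[(s, t)] for t in pop) for s in pop}
    weights = {s: pop[s] * expected[s] for s in pop}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

pop = {"shake": Fraction(1, 2), "bow": Fraction(1, 2)}
print(replicator_step(pop))                  # shake: 2/3, bow: 1/3, as above
pop = {s: float(p) for s, p in pop.items()}  # switch to floats to iterate many times
for _ in range(20):
    pop = replicator_step(pop)
print(pop)                                   # essentially all shakers
```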
So here's an interesting thing we thought about. Well, how do we model people? We said, well, we should model people as
rational. If we thought of rational people, we'd say, well, then rational people in this model would choose 2-2. So what replicator dynamics does is give us another way to think about what you're gonna get in the game. It says, let's assume a big population of people, and let's assume initially that there's equal numbers of each action, so there's equal numbers of shakers and equal numbers of bowers. And in this case, sort of let the population learn according to replicator dynamics and see what happens. And what we see is, in this game, replicator dynamics would lead us to 2-2. So now if you want to say what's our model of people, you say, 'We have two different models. One model is rational actors, and rational actors are going to choose 2-2. Another model is people use this simple rule, this learning rule called replicator dynamics. And if we use this learning rule called replicator dynamics, then unless we start out with a whole bunch of bowers, we're probably also going to end up with everybody shaking.' So that's great; it gives us another motivation for sort of figuring out why we're going to get the outcomes we're going to get. Well, now we can ask the question, though: does replicator dynamics always give us the same thing that we'd get if we thought about sort of super smart people playing the game? Well, let's see.
Here's another game and this is called the SUV/Compact game. So here's [laugh] how it
works. You can either drive an SUV or drive a compact car. If you drive an SUV,
your payoff is just gonna be two. Because you just drive your SUV, listen to the
radio, it doesn't matter. If you drive a compact car, and you run into someone
who's driving an SUV, your payoff is gonna be zero. And I don't mean physically run into them; I mean if you're just driving and you see that somebody else has an SUV, there's two things going on. One is, you probably can't see around the
SUV, so that's bad. And also, you're gonna feel a little bit unsafe, so that's also
bad. But, if you're driving a compact car, and the other person is driving a compact
car, your payoff is gonna be three, 'cause you're both getting better gas mileage,
you can see around the other car, you feel safe, everybody wins. So if you're
thinking about rational people playing this game, you think, okay, what would you
want to do, you might think well look, three three's got the higher payoff, and
it's an equilibrium because if we're both driving compacts, then we have no reason
to switch. You'd think that's what you'd get. Well, let's see what we get from replicator dynamics. So let's start out again with half the people driving SUVs and half the people driving compacts. Let's figure out the weight on SUVs. Well, half the people are driving SUVs, and your payoff if you're driving an SUV, regardless of who you meet, is two. So the weight on SUVs is
just gonna be one. What about the weight on compacts? Half the people drive
compacts. Now, what's their payoff? If you're driving a compact, half the time you meet someone with an SUV, and that gives you a payoff of zero. And half the time you meet someone driving a compact, so half the time you're gonna get a payoff of three. So that means your expected payoff is gonna be three halves, and the weight on compacts is one half times three halves, which is three fourths. So the weight on SUVs is one, and the weight on compacts is three fourths. So now I wanna ask, what's the probability that someone drives an SUV in the next period? That's gonna be one over one plus three fourths, which is going to be four sevenths. And the probability that somebody drives a compact, that's gonna be three fourths over one plus three fourths, which is three sevenths. So what
we see is we're gonna see a drift toward SUVs. And so what we're gonna get in this game is that people drive SUVs. If we put replicator dynamics on this game, we don't get 3,3 as an outcome; we're more likely to get 2,2 as an outcome. And what's gonna happen is evolution leads us to something that's sub-optimal. So now we've learned something interesting: these replicator dynamics, this evolution of strategies, lead us not to the optimal thing, which is 3-3, but to 2-2.
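Here's the same kind of sketch for this game, using the payoffs described above: an SUV driver gets 2 no matter whom they meet, while a compact driver gets 0 against an SUV and 3 against another compact. As before, this is just an illustrative Python sketch, not anything from the lecture itself.

```python
# SUV/compact payoffs from the lecture: SUV always 2; compact gets 0
# against an SUV and 3 against another compact.
PAYOFF = {("suv", "suv"): 2, ("suv", "compact"): 2,
          ("compact", "suv"): 0, ("compact", "compact"): 3}

def replicator_step(pop):
    """Weight = proportion * expected payoff against the current mix, then normalize."""
    expected = {s: sum(pop[t] * PAYOFF[(s, t)] for t in pop) for s in pop}
    weights = {s: pop[s] * expected[s] for s in pop}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

pop = {"suv": 0.5, "compact": 0.5}
print(replicator_step(pop))   # suv: 4/7 ~ 0.571, compact: 3/7 ~ 0.429, as computed above
for _ in range(30):
    pop = replicator_step(pop)
print(pop)                    # essentially all SUVs, even though all-compact pays more
```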
There's actually a book written on this called High and Mighty, by Keith Bradsher, where he talks about why it is that people drive these big SUVs if it doesn't make sense. Bradsher makes the argument that it's really the evolution of choices that has caused us all to be driving SUVs when we'd collectively be better off if we were driving compacts. And it's evidenced by this picture here: you can see that in
the little car, you're sort of frightened by the big car and you can't see around
it, and so as a result the dynamics lead us toward big cars, even though we collectively would be better off if we were all driving smaller cars. Alright,
what have we learned in this lecture? We've learned that one way to think about
what people decide to do is to construct a model based on replicator dynamics. And replicator dynamics capture two fundamental social processes. One is the fact that people are fairly rational: we take optimal actions. Another thing that people tend to do is copy other people. If we write down a model where people do both, where we sort of try to do the thing that's best but also copy other people, and combine those, we copy people who are doing well. What we're saying is that's the way to think about how a population of people might move. Now, we saw in some cases, like the shake-bow game, that's going to lead us to make the optimal choice. But then we saw other games, like the SUV/compact game, where in fact it didn't: it led us to choose something, the SUV, that is not the thing we would choose if we were rational and sat back and said, what's the best choice here. So replicator dynamics is a really interesting way to model learning, and it gives us sort of surprising insights into what is likely to happen in some games, different insights than we'd get if we assume people were quote-unquote rational. Alright, thanks.