0:01

The second viewpoint of DPC is a game. Just like "optimization," "game" can be used as a common English word, but in this case we have an unambiguous, formal definition of a game. Now, power control clearly is a competition among the different transceiver pairs, and we can model competition, as well as cooperation, with games.

So, the formal language of games that we will be looking at extends our intuitive understanding: we have a set of players, transceiver pairs in this particular case, and each has a certain strategy set, a set of possible strategies it can play. And then there is a certain payoff function that says: if you play a certain strategy and the other players play other strategies, this leads to a particular reward, or a cost, coming to you.

So, we define a game formally as a three-tuple, that is, three things. The first is very simple: a set of players, indexed by i = 1, 2, 3, up to N. Then, for each player i, we have a certain strategy set, A sub i. There may be many possible elements living in this set, and each element is a possible strategy. For example, in soccer, if I have a goal here and a goalkeeper thinking "should I go left or should I go right to catch the ball?", that would be a strategy set with two strategies.

And then there is a payoff function. This usually maps the strategies chosen by the players into a particular real number, say, 5 or -3, and so on. If these numbers model something desirable, it is called a payoff function, or more generally a utility function, as we will see in future chapters. If they represent something bad, an undesirable outcome, it is a cost function. So, of course, each player would like to maximize its payoff function or minimize its cost function, varying only the strategy that is in its own control, that is, picking a particular point in its own strategy set. For example, the goalkeeper going left

2:49

and the ball goes the other way: then you lose, and that might be a cost of, say, -5. Whereas, if he catches it, then he gets a reward of, say, 10. So, that is the function: the map from the strategies that are chosen to a number.
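To make the mapping concrete, here is a minimal Python sketch of that penalty-kick payoff function. The payoff numbers (-5 and 10) are the illustrative ones from the lecture; the function and argument names are my own:

```python
# A payoff function maps a strategy profile (one strategy per player)
# to a real number for each player. Here: the goalkeeper picks a dive
# direction, the kicker picks a shot direction.

def goalkeeper_payoff(keeper_dive, shot_direction):
    """Return the goalkeeper's payoff for one strategy profile."""
    if keeper_dive == shot_direction:
        return 10   # he catches the ball: a reward of 10
    return -5       # he dives the wrong way: a cost of 5

print(goalkeeper_payoff("left", "left"))   # 10
print(goalkeeper_payoff("left", "right"))  # -5
```

Note that the goalkeeper's payoff depends on both arguments, which is exactly the coupling between players discussed next.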

Of course, the number is determined in part by your decision, but also in part by whoever is kicking the ball. There comes the coupling of everybody's strategies: your strategy and her strategy together determine the outcome.

Now, we are going to come back later to our power control problem and view it as a game. But first, let's take a look at two simple but very famous examples. Perhaps by far the most famous example in simple game theory is the so-called Prisoner's Dilemma.

It's a two-user game modeling competition, and each user has two possible strategies. So, we can plot this in a two-dimensional space: a table on a piece of paper with two columns and two rows. Let's say the two rows represent the two strategies for player A, and the two columns represent the strategies for player B. Players A and B are suspects of a crime, and the police are going to investigate the crime and examine both of them individually. They will both come to know the entries in this table, but they cannot communicate, because they are being questioned separately.

Let's say they each have two choices to make: one is "do not confess," the other is "confess." Similarly, B can say "I don't confess" or "I confess." So, there are altogether four possible outcomes. For each of these outcomes, jointly determined by A's strategy and B's strategy, there will be a pair of numbers: the first number denotes the payoff to A, and the second, the payoff to B. If they both decide not to confess, then the police can only charge them with a lesser crime, and they will each get, say, one year in prison. If they both confess, then they will be sent to prison for longer terms, say, three years each. Whereas if one confesses but the other does not, then the one who confesses will be rewarded for cooperating with the investigation and walk away free, and the other one will be left with the most severe penalty of, say, five years in prison.

And this is a symmetric game, so this cell is filled out exactly like that one: if B confesses, she walks away with nothing, zero years, and A, who does not confess, gets five years, okay? So, this defines the game, the Prisoner's Dilemma game: there are two players, each with a strategy set consisting of two possible strategies, and this table quantifies the payoff for each player, jointly determined by their respective strategies.

So, let's take a look at this famous Prisoner's Dilemma. Why is it called a dilemma?

Because, imagine you are suspect A, and you say: I don't know what B will do. But suppose she chooses not to confess; then what should I do? Well, if I don't confess I get -1, but if I confess my payoff is 0, so I should pick confess. We call this the best response strategy by player A in response to player B picking no confession. But B may also choose to confess, I don't know. In that case, what should I do? Well, now I'm basically comparing -5 and -3, and confess is still the better choice: relative to -5, -3 is better. Therefore, confess is also the best response strategy if B chooses to confess.

Now, B can only choose between those two choices. So, A would say, hey, I have the same best response strategy no matter what. That is called a dominant strategy: one that is the best response for all possible strategies by the other players; in this case, that is confess. By symmetry, B is thinking in exactly the same way, and B will also pick confess as the dominant strategy. Therefore, the outcome of this game will be that both A and B confess and get three years each.

What's wrong with that? Why is it called a dilemma? Because they could have gotten one year each. If you think about the summation of the payoffs over all players i, and you want to maximize this quantity, then -2 would be much better than -6. Except A and B cannot collaborate or communicate, so in each of their strategization processes, they end up picking a socially suboptimal solution. And that's why it's called a dilemma.
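The reasoning above can be checked mechanically. Here is a small Python sketch (my own encoding of the lecture's payoff table, with prison years written as negative payoffs) that verifies "confess" is a dominant strategy for A:

```python
# Prisoner's Dilemma: payoff[(a_strategy, b_strategy)] = (payoff_to_A, payoff_to_B)
payoff = {
    ("no confess", "no confess"): (-1, -1),
    ("no confess", "confess"):    (-5,  0),
    ("confess",    "no confess"): ( 0, -5),
    ("confess",    "confess"):    (-3, -3),
}
strategies = ["no confess", "confess"]

def best_response_A(b):
    """A's best response to a fixed strategy b of player B."""
    return max(strategies, key=lambda a: payoff[(a, b)][0])

# A's best response is "confess" whatever B plays, so "confess"
# is a dominant strategy for A (and, by symmetry, for B).
responses = {b: best_response_A(b) for b in strategies}
print(responses)  # {'no confess': 'confess', 'confess': 'confess'}
```

By symmetry, swapping the payoff index from 0 to 1 gives the same answer for B.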

8:53

So, in what we just talked about, we have defined the best response strategy in a game and the dominant strategy in a game. But there may not be a dominant strategy, as we'll see right away. This is another famous example, called the coordination game. Suppose there are two users again, you and your friend, and you decide that maybe you should go watch a movie. Let's say there are two kinds of movies, action and romance. So again, each player has two choices in the strategy set, an action movie versus a romance movie: for you, player A, and also for player B, two strategies, action or romance.

Now, suppose you two decide on different kinds of movies. Then there is no agreement, you decide not to watch a movie, and you both get zero payoff, one zero for you and one for your friend in each of those two cells. But suppose you both pick the same kind; then you get some strictly positive return. Except the return's value is different for you and for your friend. Say you prefer action, so if you both agree on the action movie, then you get two units of reward and B only gets one unit. Conversely, let's say your friend B likes romance movies more than action movies, so if the outcome is to go see the romance movie, then B gets two and you get one.

So, obviously, you both have the incentive to try to reach some kind of consensus, okay? But you still would like to compete, because depending on what kind of consensus it is, you'll be getting different payoffs. So again, now I have defined a game, a coordination game. There is the set of players, A and B; for each player, there is a strategy set with two possible strategies; and the payoff function maps (action, action) to two for A and one for B, maps (romance, romance) to one for A and two for B, and maps (romance, action) or (action, romance) to zero for both players. So now we have completely specified this game. Let's understand the behavior here.

Now, clearly there are best response strategies for A. If B chooses action, then A's best response is action, because two is bigger than zero. But if B chooses romance, then A's best response strategy is romance, because one is bigger than zero. So now you see that there is no dominant strategy, because A's best responses to B's different choices are different. Similarly, B does not have a dominant strategy either.

However, we see that their best response strategies actually match. What do I mean by "match"? Well, if A chooses action, then the best response by B with respect to that is action. Similarly, if B chooses action, then the best response with respect to that, for A, is also action. So the (action, action) pair of strategy choices by A and B is one that matches, and the same story holds for (romance, romance). You see that A's two best responses are clicking with the two best responses by B, respectively. And in general, we can define what we call

the Nash equilibrium. Basically, it is an equilibrium that describes strategic players' lack of unilateral incentive to move away from a choice. Notice this word "equilibrium" is not the same as the equilibrium we saw earlier, which was used to describe the convergence of an iterative algorithm to a particular point. In fact, we'll later see yet another notion of equilibrium. So, more precisely, let's say that there are two users, okay? User one has the utility function u1 and user two the utility function u2; user one has a strategy set, script A, and user two a strategy set, script B.

13:49

Now, we say that a particular point, a* in set A and b* in set B, is called a Nash equilibrium if the following conditions are true. First, the first user's utility at this point (a*, b*) is no smaller than the utility where user two sticks to b* and user one is free to pick any a in its set of possible strategies: u1(a*, b*) >= u1(a, b*) for all a in the set A. This simply says, hey, if the other player doesn't move away from b*, I have no incentive to move away from a*. And, that's the key word, "and," at the same time the other user feels the same way: u2 at (a*, b*) is no smaller than the utility for user two if a* stays where it is and b is free to be chosen anywhere in the set, u2(a*, b*) >= u2(a*, b) for all b in the set B. This b can be any point in the set; it doesn't matter. What it says is that for user two, again: I have no unilateral incentive to move away from b*; if the other player sticks to a*, I'm going to stick to b*.

The key thing is that both statements must be true for the single point (a*, b*). That's what we mean by matching best responses. And if so, we call this point (a*, b*) a Nash equilibrium. It says there is no unilateral incentive to change my strategy away from (a*, b*).

So, in this case, there are two Nash equilibria. (Action, action) is a Nash equilibrium, because neither A nor B has a unilateral incentive to move away from this choice. And there is another Nash equilibrium, (romance, romance). Back in the Prisoner's Dilemma, (confess, confess) is the unique Nash equilibrium. Even though both realize it would be better to be at (no confess, no confess), there is no unilateral, that's the key word, incentive to move away from (confess, confess). Basically, A says, if you don't move, I don't move; B says, well, if you don't move, I don't move. And they are both stuck in this socially suboptimal, unique Nash equilibrium.

So, we see that Nash equilibria may not exist, although in both examples we just saw they do exist. A Nash equilibrium may not be unique if it exists; we saw an example with two Nash equilibria. And none of the Nash equilibria might be socially optimal, maximizing the summation of payoffs, or even be Pareto optimal. However, in very fundamental research in his PhD dissertation at Princeton, John Nash showed that as long as you allow each player to flip a coin and probabilistically decide which strategy to choose, then in that so-called mixed strategy scenario, there will always exist at least one Nash equilibrium.
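The definition translates directly into a check over strategy profiles. Here is a sketch (pure strategies only; the mixed-strategy existence result needs more machinery) applied to the two games from this lecture, with payoffs encoded as in the tables above:

```python
def pure_nash_equilibria(strats_A, strats_B, payoff):
    """All (a*, b*) with no profitable unilateral deviation; payoff[(a, b)] = (u1, u2)."""
    result = []
    for a_star in strats_A:
        for b_star in strats_B:
            u1, u2 = payoff[(a_star, b_star)]
            # u1(a*, b*) >= u1(a, b*) for all a in A, AND
            # u2(a*, b*) >= u2(a*, b) for all b in B
            if (all(u1 >= payoff[(a, b_star)][0] for a in strats_A)
                    and all(u2 >= payoff[(a_star, b)][1] for b in strats_B)):
                result.append((a_star, b_star))
    return result

coordination = {
    ("action", "action"):   (2, 1),
    ("action", "romance"):  (0, 0),
    ("romance", "action"):  (0, 0),
    ("romance", "romance"): (1, 2),
}
prisoners = {
    ("no confess", "no confess"): (-1, -1),
    ("no confess", "confess"):    (-5,  0),
    ("confess",    "no confess"): ( 0, -5),
    ("confess",    "confess"):    (-3, -3),
}
print(pure_nash_equilibria(["action", "romance"], ["action", "romance"], coordination))
# [('action', 'action'), ('romance', 'romance')]
print(pure_nash_equilibria(["no confess", "confess"], ["no confess", "confess"], prisoners))
# [('confess', 'confess')]
```

The two conditions in the `if` are exactly the two inequalities of the definition, checked for one candidate point (a*, b*) at a time.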

17:46

Now, back to our power control problem. Distributed power control is competition, and competition can be modeled as a game, so what kind of game are we talking about here? Well, let's identify the set of players first: they are the logical links, or the transceiver pairs at the ends of those links. Next, let's identify the strategy space. Here it becomes a little more complicated. In the two examples we just saw, the strategy space was pretty simple, just two choices. But here it is a continuum: the power levels for each user i such that the target SIR, gamma_i, is achieved.

Now, as you can see, the strategy space so defined, a set of powers P, is determined in part by what the other users' powers might be, because those determine how easy or hard it is for you to achieve your target SIR. So it is through the coupling of the strategy spaces, instead of the coupling of the payoff functions, that the interference phenomenon is reflected in this power control game, because the payoff function is very simple: you basically look at a cost function, which is P_i, and you want to make it small. So, that is the definition of the game.

And here are two simple but important facts. Fact number one, which you can easily verify for yourself, concerns the DPC algorithm. What is the algorithm again? It says that the transmit power for each transmitter i at time t + 1 is the current transmit power times the ratio between the target SIR and the current SIR at time t, that is, p_i(t+1) = p_i(t) * gamma_i / SIR_i(t), and this goes on for all the users. This actually is the best response strategy in response to all the other players' strategies. Whatever they might be, they are reflected through the SIR value.
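That best-response reading can be sketched in a few lines: each link's update needs only its own measured SIR, and the other players' powers enter only through that measurement. The channel gains, noise levels, and target SIRs below are made-up illustrative numbers, not values from the lecture:

```python
# DPC iteration: each transmitter scales its power by (target SIR / current SIR).
# G[i][j] = channel gain from transmitter j to receiver i (made-up values).
G = [[1.0, 0.1],
     [0.2, 1.0]]
noise = [0.1, 0.1]
gamma = [1.0, 1.5]   # target SIRs (assumed feasible for these gains)
p = [1.0, 1.0]       # initial transmit powers

def sir(i, p):
    """Signal-to-interference ratio seen by link i under power vector p."""
    interference = sum(G[i][j] * p[j] for j in range(len(p)) if j != i)
    return G[i][i] * p[i] / (interference + noise[i])

for _ in range(50):
    # Best response: the smallest power meeting your own target SIR,
    # given everyone else's current powers.
    p = [p[i] * gamma[i] / sir(i, p) for i in range(len(p))]

print(all(abs(sir(i, p) - gamma[i]) < 1e-6 for i in range(len(p))))  # True
```

With feasible targets, the iteration converges to the power vector where every link meets its target SIR exactly, which is each player's best response to all the others.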