Hi. In this set of lectures we're talking about prediction. And here's the idea: we
want to think of individual people as making predictions based on models. Those models
can be based on categories or linear models or Markov models, any of the models
you've learned in this class you could use to make some sort of prediction. And what
we want to talk about is where collective wisdom can come from. So if we have a
whole bunch of people using a whole bunch of different models, how does that enable
the crowd of models to do better at making sense of the world, making an accurate
prediction? Now, the essence of our argument is going to be something called
The Diversity Prediction Theorem. And, the Diversity Prediction Theorem is gonna
relate the wisdom of the crowd to the wisdom of the individual. So, in other
words the accuracy of the crowd, in relationship to the accuracy of the
individuals. Now one logic that should come to you right away is if I had more
accurate individuals, I should also get a more accurate crowd. But a logic that
might not come to you right away is that if I had a more diverse crowd, I should
also get more accuracy. So we can think of it like this: the crowd's accuracy is
going to depend on individual accuracy plus diversity. Now the question is
how much these things matter. How much does it depend on individual accuracy, and
how much does it depend on diversity? That's what we want to use a model to figure
out. So let's first do an example just to get our bearings in
terms of what it means for a crowd to make a mistake versus an individual to make a
mistake and what diversity is. So we'll do an extremely simple example. We have three
people: Amy, Belle, and Carlos. And let's suppose that they're predicting the number of
people that come to our diner on a particular day for lunch. Amy
predicts it's gonna be ten, Belle predicts it's gonna be sixteen, and Carlos predicts
it's gonna be 25. Now if I add these up, I get 51, and if I divide by three, I
get an average value of seventeen. So the crowd predicts seventeen. Now let's
suppose the actual value is eighteen. Now, I've set this up so that the crowd is
pretty accurate. That's not going to matter; it's just for the purpose of the
example. I've made it so the crowd actually does pretty well because I just want to
work through the logic. So the first thing I want to do is figure out how
accurate these people are. Well, what I can do is compute the error of each
person. So remember, the true value was eighteen; that's the number that
showed up. And I can ask, what's the error of each individual? And remember, we
compute errors the way we looked at variation: with squared error. So Amy predicted ten, the
truth was eighteen, so her squared error is 64. Belle predicted sixteen, the true
value is eighteen, so her squared error is four. Carlos predicted 25, the true value
is eighteen, so his squared error is 49. And if we add all those up, I get 117, and if
I divide that by three, I get the average error, which is 39. So on average, these
people have a squared error of 39. Some people, like Belle, are really accurate: her error is only
four. Other people, like Amy, are off by quite a bit: her error is 64. But
the average is 39. So this gives us a sense of how accurate the individuals
are: their squared errors are 64, four, and 49, for an average
of 39.

Now we can ask, how accurate was the crowd? Remember, the crowd predicted
seventeen, because that was the average prediction of the three people. The
true value is eighteen, so the crowd was only off by one, and its squared error is one. So the crowd
here is, if you notice, better than anybody in it. So we get "the wisdom of
crowds." Well, let's try to think about why that makes sense, and to do that we're
going to look at diversity. So diversity is the variation in the
predictions. So how do we compute the variation in the predictions? We look at each
person's prediction and its distance from the mean prediction: not from the true
value, but from the mean prediction. The mean prediction was seventeen, so Amy's
contribution to this total variation of predictions is ten minus
seventeen, squared, which is 49. Belle's is sixteen minus
seventeen, squared, which is one, and Carlos's is 25 minus seventeen,
squared, which is 64. Now if I add all these up, I get 114, and if I divide
by three, I get 38. So the diversity of these predictions is 38.
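
The arithmetic of the diner example can be reproduced with a short Python sketch. The numbers are the ones from the lecture; the variable names are only illustrative choices, and the crowd's prediction is taken to be the plain average of the three guesses, as in the example.

```python
# Diner example from the lecture: three predictions and the realized value.
predictions = {"Amy": 10, "Belle": 16, "Carlos": 25}
truth = 18

n = len(predictions)
crowd = sum(predictions.values()) / n  # crowd prediction = mean prediction

# Squared error of each individual, measured against the true value.
individual_errors = {name: (p - truth) ** 2 for name, p in predictions.items()}
average_error = sum(individual_errors.values()) / n

# Squared error of the crowd's (mean) prediction.
crowd_error = (crowd - truth) ** 2

# Diversity: average squared distance from the crowd prediction, not the truth.
diversity = sum((p - crowd) ** 2 for p in predictions.values()) / n

print("crowd prediction:", crowd)               # 17.0
print("individual errors:", individual_errors)  # Amy 64, Belle 4, Carlos 49
print("average error:", average_error)          # 39.0
print("crowd error:", crowd_error)              # 1.0
print("diversity:", diversity)                  # 38.0
```

Note that every quantity here is a squared error, so the values are in "people squared" rather than in people.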
The crowd's error was one, the average error was 39, and the diversity was 38. So
if I look at that, I get one equals 39 minus 38. The crowd's error in this case equals
the average error minus the diversity. Now, you might think I just set this up, but it turns out that's
always true. This is what the diversity prediction theorem says: that the crowd's
error equals the average error minus the diversity. Now, this isn't some, you know,
feel-good saying. This is a mathematical fact. This is an identity. So no
assumptions have to be made here. There's no opposite [inaudible]. This is
just true. If I have a set of predictions, it will always be the case that the
crowd's squared error, that is, the squared error of the average prediction,
is going to equal the average squared error of the people in
that crowd minus the diversity of their predictions. Now the way to write that
formally is like this:

$$(c - \theta)^2 = \frac{1}{n}\sum_{i=1}^{n}(s_i - \theta)^2 - \frac{1}{n}\sum_{i=1}^{n}(s_i - c)^2$$

Now, this looks pretty scary, but let's just walk through
it. Let's let c be the crowd's prediction and theta be the truth, so theta is
equal to the true value. So this first term is the crowd's
squared error: it's the squared distance from the crowd to the truth. Let's let s_i
equal individual i's prediction. So in the next term we
get individual i's prediction minus the truth, squared. Then we sum that
up over all the individuals and divide by the number of individuals, n. So
that's just the average error. So crowd error equals average error
minus... now we take each person's prediction minus the crowd's prediction,
which is c, squared. Remember, c is the crowd. So this tells us how far people are
from the crowd on average. We sum those up and again divide by n. So we get: the crowd's
error equals the average error minus diversity. Now if you take this equation
and expand all the terms and cancel everything out, you'll see that it's an
identity. It's a mathematical identity, so it's always true: the crowd's error equals
average error minus diversity.
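
The expand-and-cancel step can be sketched as follows. This is one possible route, not necessarily the one on the lecture slide. It assumes the crowd's prediction c is the plain average of the n individual predictions s_i, as in the diner example, and it uses the trick of adding and subtracting c inside each square.

```latex
% Assumption: c = (1/n) * sum_i s_i, i.e. the crowd predicts the simple average.
\begin{align*}
\frac{1}{n}\sum_{i=1}^{n}(s_i-\theta)^2
 &= \frac{1}{n}\sum_{i=1}^{n}\bigl[(s_i-c)+(c-\theta)\bigr]^2 \\
 &= \frac{1}{n}\sum_{i=1}^{n}(s_i-c)^2
    + 2\,(c-\theta)\,\frac{1}{n}\sum_{i=1}^{n}(s_i-c)
    + (c-\theta)^2 \\
 &= \frac{1}{n}\sum_{i=1}^{n}(s_i-c)^2 + (c-\theta)^2 ,
    \qquad \text{because } \sum_{i=1}^{n}(s_i-c)=0 .
\end{align*}
```

Moving the diversity term to the other side gives crowd error equals average error minus diversity. Nothing about how the predictions were generated is used, only that c is their average.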
So in a book called The Wisdom of Crowds, Jim Surowiecki talks about a famous case: the 1906
West of England Fat Stock and Poultry Exhibition. At this exhibition, 787
people guessed the weight of a steer. Their average guess was, I think, 1,197 pounds; the
actual weight of the steer was 1,198 pounds. So they were only off by a pound. So you
look at that and say, oh my gosh, that's amazing, that's the wisdom of crowds. But
let's think about what's going on. We've got a bunch of predictions, there's
a true value, there's an average value, so our theorem, this diversity prediction
theorem, must hold. And in fact, if you take Galton's data and plug it all
in, here's what you get. The crowd's error is actually a little bit less than a
pound: it's 0.6. The average error is 2,956. Now wait, that seems crazy, because
remember, the steer only weighs about 1,200 pounds. So if this thing weighs about 1,200
pounds, how could they be off by 2,956? Well, remember, these are squared errors. If I
square 50 I get 2,500, and if I square 60 I get 3,600. So this is probably about 54 or 55
squared, something like that. Well, that makes sense, because people could probably
guess the weight of a steer within about 55 pounds. Well, why is that? Well,
think about it. A steer's five times the size of a person. If you can guess the
weight of a person within about ten pounds, you can probably guess the weight
of a steer to within about 50 pounds. So what you've got is
people who are reasonably good at guessing the weight of steers. They're not
geniuses, but they're also not crazy. They're not guessing 15,000 pounds. So
these are reasonably knowledgeable people who, for whatever reason, are
making errors of about 55 pounds. What's interesting is that their
diversity is 2,955. So what you get is this: the crowd is wise because they're
moderately accurate, off by about 55 pounds, and they're also diverse, and
it's that accuracy plus diversity that makes the crowd do so well.

Now if you think about this book, The Wisdom of Crowds, Surowiecki gives a bunch of
examples where that's the case. Let's think about it in the context of our theorem,
so we've got crowd error equals average error minus diversity. Now in this book,
Surowiecki says, here's what matters: diversity matters a lot. Well, why does
diversity matter a lot if we're looking at the wisdom of crowds? The
math will tell us why. If an example makes it into the book The Wisdom of
Crowds, what has to be true? The collective error has to be
small. If the collective error isn't small, it doesn't make the book; it's not
the wisdom of crowds, it's the madness of crowds. So for the wisdom of crowds to
exist, the collective error has to be small. Let's think about what
else has to be true. The average error has to be fairly large. Why does it have to be
fairly large? If the average error is small, that means it was an easy thing to predict;
everybody can pretty much get it right. So if it's interesting enough to make a book
called The Wisdom of Crowds, where the crowd is smart and the people aren't,
the average error has to be large. Well, if you've
got something small equals something large minus something else, this other thing has
to be large, which means diversity has to be large. So when Surowiecki walks through
all these examples and he looks at what's going on, he says, look there's a lot of
diversity. And diversity seems to be a key component in the wisdom of crowds. You
want to encourage people to think about the world in different ways if you wanna
get the wisdom of crowds. And our model explains why that's the case. It's
collective error equals average error minus diversity. If people aren't that
smart, average error is gonna be big. If you want the crowd to be smart, the only
way to get it is by having that crowd be diverse. So if we look at the steer-weighing
example, where we got 0.6, 2,956, and 2,955, we see that's exactly the
case: small crowd error, fairly large average error because it's not an
easy thing to do, and then high diversity. And if you take examples of the wisdom of
crowds from all over the place, you'll see they all look exactly
like this: small crowd error, large individual error, large diversity.
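
The pattern of small crowd error, large average error, and large diversity can be illustrated with a small Python sketch. The guesses below are synthetic and are not Galton's data: they assume 787 guessers whose errors are roughly 55 pounds in size and fall on both sides of an assumed true weight of 1,198 pounds. The helper name `crowd_stats` is hypothetical.

```python
import math
import random


def crowd_stats(predictions, truth):
    """Return (crowd_error, average_error, diversity), all as squared errors."""
    n = len(predictions)
    crowd = sum(predictions) / n  # crowd prediction = mean prediction
    crowd_error = (crowd - truth) ** 2
    average_error = sum((p - truth) ** 2 for p in predictions) / n
    diversity = sum((p - crowd) ** 2 for p in predictions) / n
    return crowd_error, average_error, diversity


# Synthetic guesses (an assumption for illustration, NOT the historical data).
rng = random.Random(0)  # fixed seed so the run is repeatable
truth = 1198
guesses = [rng.gauss(truth, 55) for _ in range(787)]

crowd_error, average_error, diversity = crowd_stats(guesses, truth)
print(f"crowd error   = {crowd_error:.1f}")
print(f"average error = {average_error:.1f}")
print(f"diversity     = {diversity:.1f}")

# The identity holds for any list of predictions, up to floating-point rounding.
identity_holds = math.isclose(
    crowd_error, average_error - diversity, rel_tol=1e-9, abs_tol=1e-6
)
print("crowd error == average error - diversity:", identity_holds)
```

Because the simulated errors scatter on both sides of the truth, they largely cancel in the average, so the crowd error tends to come out far smaller than the average error. Shifting every guess in the same direction would leave diversity unchanged and push the crowd error up.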
The question is, how do you get that diversity? Well, you get that
diversity by people using different categorizations, different linear
models, maybe people using entirely different models. Maybe one person's
using a Markov model and one person's using a diffusion model. Maybe one
person's using a linear model. Maybe one person's got a nonlinear term in their model.
There are a lot of different variables. So what you get is this heterogeneity in how
we see the world, in the boxes we use, the variables we use, and the models that
we construct, and that gives diversity to these collective predictions. And that
diversity in those collective predictions then leads to accurate
crowds, provided you've got reasonably accurate people who are reasonably
diverse. And what we've learned, by constructing a very simple model of that
predictive task, is where the wisdom of crowds comes from. We've learned that
individual ability and collective diversity matter equally; they're equal
partners. So if someone were to say to you, where does the wisdom of crowds come
from? You could say, well, it comes from reasonably smart people who are
diverse. And you could also ask, where does the madness of crowds come from? How
could it be that a crowd could get something totally wrong? Well, that's not
hard either, because crowd error equals average error minus diversity. If I
want this to be large, if I want large collective error, then I need large average
error, because I need people to, on average, be getting things wrong, and I need diversity
to be small. So the madness of crowds comes from like-minded people who are all
wrong, and again, the equation gives us that result. Alright, thank you.