Hi folks, it's time to talk a little bit now about some observations of actual

diameters in the world. And if you recall, we talked about

diameters of random graphs of a particular form, and we were finding that

for large enough graphs, and degrees that weren't too large or too small,

log n over log d was an approximation of the average path length and diameter.

And let's do a rough back-of-the-envelope calculation, so you can pull out your

calculators. Say that the world population these days

is somewhere between 6 and 7 billion, let's take 6.7 billion as an estimate.

And let's suppose that you just count, you know, friends that you talk to on a

reasonably regular basis, so friends, relatives. Let's take 50 for an average

number of people that somebody might talk to on a regular basis.

Now do log of 6.7 billion over log of 50.

What do you end up with? About 6, so this is the six degrees of

separation that is often talked about: the idea that, you know, to get from any

individual to any other individual in the world,

you actually don't need a lot of hops. You can get there fairly efficiently.

So let's take a look at some data and see if those kinds of numbers actually
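
The back-of-the-envelope calculation above can be sketched in a few lines of Python. This is only an illustration of the log n over log d approximation using the two figures quoted in the lecture (6.7 billion people, 50 regular contacts); the function name `predicted_path_length` is a made-up helper for this sketch, not something from the lecture.

```python
import math


def predicted_path_length(n: float, d: float) -> float:
    """Rough average path length, log(n) / log(d).

    Assumes n > 1 and an average degree d > 1; the ratio is not
    meaningful otherwise, because log(d) would be zero or negative.
    """
    if n <= 1 or d <= 1:
        raise ValueError("need n > 1 and d > 1")
    return math.log(n) / math.log(d)


world_population = 6.7e9   # estimate used in the lecture
average_contacts = 50      # assumed number of regular contacts per person

estimate = predicted_path_length(world_population, average_contacts)
print(f"log n / log d = {estimate:.2f}")  # a bit under 6
```

The base of the logarithm does not matter here, since it cancels in the ratio.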

are observed. And so what I want to look at is what's

known as the Add Health data; this is the adolescent health data set.

It was collected in the 1990s through interviews at a bunch of high schools in the United

States. And there's network data for a lot of

these high schools, so people were asked to name their friends, and the study kept

track of their friends. And, you know, the schools actually

varied quite a bit in racial composition, in the size

of the school, how many students are in it, and a bunch of other things.

So the networks have some variation and we can see whether the diameters in these

networks look like the log n over log d that we found in the estimate.

And so let's have a quick peek at some data.

So this is the average shortest path, and it's plotted for the giant component

versus log n over log d, and this is from 84 high schools for which there's

fairly complete network data. And this is from work I did with Ben

Golub. And when you look at this graph, so what do we

have on the x axis? This is log n, so take the log of

the number of people in the high school, divided by log of the average number of

friends that they had in the network. And then here is the actual average

shortest path. Right, and if the theorem is true, then

all of these points should lie on the 45 degree line, and

they're actually remarkably close in terms of looking at real data.

The spread here that we get in terms of log n over log d and average shortest

path fits fairly well. And, you know, indeed for the smaller

schools you have shorter average path lengths, and for the larger schools

you have longer ones. But they're matching up very much with

log n over log d, which seems to be fairly accurate.
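
The comparison in the plot can be imitated on synthetic data. The sketch below is not the Add Health analysis: it builds a hypothetical random network as a stand-in for one school (300 students, target average degree 6.5, both chosen only for illustration), then computes the average shortest path within the giant component and log n over log d so the two numbers can be compared. All function names are made up for this example.

```python
import math
import random
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def giant_component(adj):
    """Largest connected component, returned as a set of nodes."""
    seen, largest = set(), set()
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            if len(component) > len(largest):
                largest = component
    return largest


def average_shortest_path(adj):
    """Mean hop count over all pairs of distinct nodes in the giant component."""
    nodes = giant_component(adj)
    total = pairs = 0
    for source in nodes:
        dist = bfs_distances(adj, source)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def log_n_over_log_d(adj):
    """log(n) / log(d); assumes the average degree d is greater than 1."""
    n = len(adj)
    d = sum(len(neighbors) for neighbors in adj.values()) / n
    return math.log(n) / math.log(d)


def random_network(n, avg_degree, seed):
    """Hypothetical stand-in for a school: every pair is linked with equal probability."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


school = random_network(n=300, avg_degree=6.5, seed=0)
print("log n / log d:        ", round(log_n_over_log_d(school), 2))
print("average shortest path:", round(average_shortest_path(school), 2))
```

Repeating this for networks of different sizes and degrees, and plotting one number against the other, gives a picture of the same kind as the one described above, with points near the 45 degree line when the approximation holds.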

Now some other curious numbers that are out there in the world.

Erdos had a large number of co-authors: 509 co-authors, and he wrote more than

1400 papers in his life. And so mathematicians like to

count their Erdos number. So you count how many co-authors it

takes you to reach, how many links it takes you to reach, Erdos.

So Erdos had a co-author, they co-authored with somebody else, and so

forth, and you can find what your own Erdos number is.
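
Counting links to Erdos is a shortest-path computation on the co-authorship network, and a breadth-first search is one standard way to do it. The sketch below uses a tiny made-up co-authorship graph: the authors "A", "B", and "C" are placeholders, and the chain is arranged only so that Tozier ends up at distance 4, matching the number quoted in the lecture. It is not real collaboration data.

```python
from collections import deque


def erdos_numbers(coauthors, source="Erdos"):
    """Breadth-first search: number of co-authorship links from source to each author.

    Authors with no chain of co-authorships to the source are left out.
    """
    numbers = {source: 0}
    queue = deque([source])
    while queue:
        author = queue.popleft()
        for other in coauthors[author]:
            if other not in numbers:
                numbers[other] = numbers[author] + 1
                queue.append(other)
    return numbers


# Hypothetical toy network: each author maps to the set of their co-authors.
coauthors = {
    "Erdos": {"A"},
    "A": {"Erdos", "B"},
    "B": {"A", "C"},
    "C": {"B", "Tozier"},
    "Tozier": {"C", "Auction winner"},
    "Auction winner": {"Tozier"},
    "Unconnected author": set(),
}

numbers = erdos_numbers(coauthors)
for author, number in sorted(numbers.items(), key=lambda item: item[1]):
    print(author, number)
```

Because breadth-first search visits authors in order of distance, the first time an author is reached is along a shortest chain, so the recorded number is the smallest possible one.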

Interestingly enough there was an auction in 2004 of a co-authorship with a person

named William Tozier. This was on eBay; his Erdos number was 4,

and so if you won the auction, then he would put your name on a paper with him,

so that would make your Erdos number 5. The winner paid more than a thousand

dollars, actually, to have their name on a

paper with Tozier and end up with an Erdos number of 5, so that's just sort

of an interesting curiosity. When we look at average degree, one thing

that's going to be important is that this says that as the density of the network

changes, we're going to end up changing average path length.

And interestingly enough, networks do come in very different varieties of

sizes. So, for instance, these high school

friendship networks have on average 6.5 connections per individual, in terms of degree.

There's a paper by Bearman, Moody, and Stovel looking at romantic relationships

in some of these high schools. There, people had, on average during a

time period, about 0.8 of a relationship. You can look at data from work I

did with Abhijit Banerjee, Arun Chandrasekhar, and Esther Duflo on

borrowing money, borrowing kerosene and rice from other individuals in small

rural villages in India: the average number of other households that a

given household borrows from is 3.2. In various co-authorship studies, depending

on what you're looking at (economics, biology, math, physics), you see different

numbers of co-authors that people typically have, say, in a decade or some

period, varying from, you know, just under 2 to

over 15.5, if people work in larger teams.

So you see different numbers of co-authors.

People always ask about Facebook. The Facebook number is about 120.

So you see different connectivities in these graphs, and that's going to lead

to different properties. So some of them are going to have

shorter, you know, average path lengths.

Other ones are going to have longer ones. And so whenever we're looking at a given

problem or given context, it's important to define the network

carefully, because these are going to have different properties, depending on

whether we're looking at a borrowing network, a collaboration network,

something like Facebook where, you know, a friendship just means you

have a link to somebody else's page, and various other kinds of things,

or, you know, friendships, romances. There's a whole series of different kinds

of ties we might define. And they're going to have different

network properties.
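
To see how much the average degree matters, here is a small sketch that plugs the degrees quoted above into log n over log d. The network size of 10,000 is a hypothetical value held fixed purely for comparison, and `predicted_path_length` is a made-up helper. The approximation was stated for degrees that are not too small, so the sketch returns `None` when the average degree is 1 or less, as with the 0.8 figure for romantic relationships.

```python
import math


def predicted_path_length(n, d):
    """log(n) / log(d), or None when d <= 1 and the ratio is not meaningful."""
    if d <= 1:
        return None
    return math.log(n) / math.log(d)


n = 10_000  # hypothetical network size, the same for every row

# Average degrees quoted in the lecture.
average_degrees = {
    "high school friendships": 6.5,
    "romantic relationships": 0.8,
    "village borrowing": 3.2,
    "Facebook": 120,
}

predictions = {
    name: predicted_path_length(n, d) for name, d in average_degrees.items()
}

for name, value in predictions.items():
    shown = "not applicable" if value is None else f"{value:.1f}"
    print(f"{name:25s} d = {average_degrees[name]:<6} log n / log d: {shown}")
```

With the size held fixed, the denser networks come out with shorter predicted path lengths, which is the sense in which the definition of a tie changes the network's properties.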