Hi, this is Mat again. And now we're ready to start talking
about how we can represent networks and talk about their properties.
And in terms of simplifying the complexity that we're facing in
describing networks we're going to think about different kinds of patterns that
might be. So one is going to be that there's global
patterns of networks. So overall big picture items.
So, how are the different constructiveness of individuals
distributed through the society? Are there some people who are really well
connected and, and serve as hubs and other people who are not so well
connected, or is everything very evenly distributed, so that's something.
Path links. How far does it take to get on an average
from one node to another. There's going to be segregation patterns,
so if we begin to think of nodes as having characteristics, say if people's
ages, their genders and so forth, do we see separations between nodes of
different types. Or are things fairly well integrated.
local patterns. do we see tight clusters of nodes that
are all tightly connected to each other, cliques.
Do we see that if I'm connected to somebody else and they're connected to a
third person that I also tend to be connected to that third person.
Transitivity. do relationships have friends in common?
Are they supported? so we'll look at local patterns, little,
you know, zooming in on particular parts. Well also talk about nodes positions in
networks so how central is somebody, how influential are they, what does their
neighborhood structure look like? So there will be different ways that
we'll be talking about networks overall and we'll also be looking at nodes in
positions in networks and other kinds of things so there's going to be a series of
definitions that we have to go through to keep track of these things.
So the basics in terms of representing networks in, there's going to be some set
of nodes, vertices, players, depending on what literature you're looking at these
will have different names. I might go back and forth between the
names and the course. Sometimes I'll call them nodes, sometimes
I'll call them agents, sometimes I'll call them players, vertices, etcetera.
But there's, tend to be some finite number, little and will be our typical
characterization of how many there are. And one basic way of representing a
network is just by what's called the adjacency matrix.
So it's going to be a matrix of zeroes and ones so an n by n matrix, and what it
indicates is 2 nodes connected to each others.
So gij equals 1 indicates a link or a tie or an edge between 2 nodes i and j.
Now generally in less otherwise stated in the course we'll be dealing with ones
which are not directed, so if i is connected to j, than j is connected to i.
So if we're friends with each other, we're both friends it's a mutual
relationship. Or if we're allies with each other, we're
both allied to each other. It can't be that one country is ally to
another, and not vice versa. So we'll tend to look at it that, that
way and we'll tend to, to work generically with zeroes and ones as, as
the structure of it. So there's either a relationship or not.
But we could allow for directed networks. We could allow for weighted networks.
so when we looked at the financial relationships, so the amount of debt that
was held inside one country of the sovereign debt of another country.
That's going to tend to be a directed network and it's going to be a weighted
network. So it's going to have different
intensities on things. when we think about an alternative
notation it's going to be very useful for representing networks.
Sometimes we'll just use instead of thinking of an adjacency matrix we'll
think of, of representing a graph or network.
By just listing all the relationships that are present.
So a notation will, will have, is it will say that i and j is in g, if that means
that there's a link between nodes i and j.
So very simple, succinct notation will just keep track of the sets of links that
are present and depending on whether these are ordered or not ordered they'll
be directed or not directed. so a network is going to be a pair of a
set of nodes. And either an adjacency matrix, or a list
of all the links that are present depending on the particular context.
Which was it's easiest to represent the network.
Okay. so, basic definitions.
1 thing we're going to want to keep track of is, sort of.
ways of navigating through a network, other known as walks or paths.
a walk in a network from, say, node i1 to node, ik, is going to be a sequence of
nodes. i2, i2, blah, blah, blah, through ik, and
the sequence of links, i1 is connected to i2, i2 to i3, and so forth, ending up at
k, such that each one of those links is in the network g, okay?
So a walk in a network from one node to another is a sequence of links that take
you from that first node to the last node.
so [INAUDIBLE] often in, in this kind of setting it's just going to be convenient
to represent it just as the corresponding sequence of nodes such that we know that
each subsequence pair are connected in the network g.
So in terms of a picture well before, before I go to a picture, let's also talk
about paths, cycles and, and geodesics. So a walk is going to be some set of, of
links, each connecting to another node but they can possibly cycle back on each
other. A path is going to be a walk where each
node is distinct. So we start at i1.
We go to a new node, i2. Then we go to some node, i3, which is not
in the previous sequence and so forth. Until, eventually, we reach ik.
A cycle is going to be a walk where we end up back at the node that we started
at. So the n node is the same as the starting
node. one other term that's going to be very
useful, geodesic. A geodesic is a shortest path between two
nodes. And shortest means we just count how many
links there are in that path and we want to find what's the fewest number of
links we need to go from. Node say 1 to node 7.
How many links do we need to get from one to another?
So in terms of pictures here's a set of networks on 7 nodes and the first one
represents a path and a walk. From node 1 to node 7, right?
So we go from 1 to 2, 2 to 3, right? So we start 1 to 2, 2 to 3, 3 to 4, 4 to
5, 5 to 6, 6 to 7. Okay, so that's that's going to be a path
which is also in this case, a walk. if we look at a walk that's not a path we
could have from 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 back to 3, so now we've hit 3
twice and then gone to 7. So that's a walk that's not a path.
And if we wanted the geodesic in this case from 1 to 7.
Then the shortest path would actually have gone just 1, 3 to 7, right, so this
is a path which is longer than a geodesic.
When we look at cycles, 1 to 2, 2 to 3, 3 to 1, that's a cycle, it's also known as
a simple cycle because it just, the only node that repeats itself is the first and
last node. And a cycle and a walk, which is a more
complicated one, is we could go from 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 back to 3,
back to 1. We cycled, but we actually cycled through
3 and 1 in that case. So it's a more complicated cycle.
So these are different terms, we'll talk about cycles as we go through.
They're going to be important. Paths are going to be important.
Walks are going to be important,. So these are distinct items in that works
and it's going to be important to keep these definitions and this terminology in
our heads as we go through. So if we want to count how many walks
there are in a network. Let's look at a simple example here on 4
nodes. So this is the network we're looking at.
So 1 is connected to to 2 and 4, 3 is connected just to 4, 4 is connected to
everybody. So that is represented here, in this
particular adjacency matrix. And generally, we're not going to have
nodes connected to each other. So when we look at the diagonal, we'll
have diagonal being 0's. there'll be some applications where it's
going to be useful to have nodes connected to each other, or to
themselves. For instance, if.
If we're doing learning, I might be able to learn from myself that will be one
application where we'll have non zero entries on the diagonal.
But generally we're going to have situations where this will be the kind
of, matrix we're looking at. Now if we square this matrix, so we raise
it to the second power, so what we're looking at here is, so there's a
relationship between 1 and 2 captured by this entry right here of a 1.
And so that indicates that 1 is connected to 2.
And so if we ask how many ways there are, how many walks of length 2 there are from
i to j. Then, the, when we square the matrix,
that actually gives us the answer of exactly how many walks there are of
length 2 from node, whatever to whatever. So this is the number of walks of length
1. This is the number of walk, walks of
length 2. If you take it to the 3rd power, you'll
get the number of, of walks of length 3. And here it says that there's two
different walks of going from node 1 to node 1.
I could go up to 2 and then back. I could go to 4 and back, and you get
that by looking, so why is the 2 coming out in this part of the matrix?
When you multiply the matrix it says I could go from 1 to 2 and then 2 back to
1. So that gives us one entry.
If I could go from 1 to 4, and then 4 back to 1, and when you multiply the
matrix, it'll pick up the 1 times the 1 plus the 1 times the 1.
It gives us a 2 in this particular entry. So when you multiply the matrix, it gives
you how many walks of different path times.
So now we've got you know path of length 2 from each node to each other node.
so it's impossible to get from node 3 to node 4 in a walk of length 2.
All the other ones you can get by actually except, and also from 4 to 3.
But all the other ones are possible. And if we keep raising these to powers.
Then we end up with you know, how many walks of length 3 from node i to node j?
And so forth. And so this is a very useful thing to
keep track of because it's going to help is in defining centrality, it's going to
help us in keeping track of diffusion processes, and other kinds of things
where we look at information moving subsequently from node to node in a
network. Okay.
So another thing that's going to be very important to keep track of in a, in a
network is its component structure. What are its components, these are the
connected sub graphs that make up a network.
So we'll say that a network N, g is connected if there's a path between every
two nodes. And a component of a network's going to
be a maximally connected subgraph. What do we mean by a, a maximally
connected subgraph? We're going to mean that the we look at a
subgraph that's a subset of nodes and a subset of the, of the links in the, the
network, so that we have those being, of course, bonding subsets of the original
nodes and original set of links. We want this to be path connected.
So from every node i in N prime, and every node g in g in N prime there exists
a path in G prime connecting those 2. So this is a connected subgraph.
And if we look at somebody, some node who's in N prime and any link in the
original network then any link that we find in the original network if there's
some node j at the end of it, then ij has to be in g prime as well.
Okay. And so what does that, what does that
mean? in terms of.
A picture when we look at components then this part of the network is a component.
But this part of the network is not a component.
Why is it not a component? It doesn't satisfy that last part of a
definition. Because we've got 5 in there, 5 is
connected to 1 and yet we didn't include 1 in our, our little picture here.
So we have to include that and, and so it's a maximally connected subgraph that
would you have include 1, 4, 3 and 5. And it would have to include all the
links there. So component is going to be a maximally
connected subgraph. Another component here it would be 2,
another component 6 and 10 together with air link And then, 7, 8 and 9 together
with air links. So this is a network with four different
components and we can keep track of those.
So a simple concept that's going to be useful in understand how, again,
diffusion works, learning, etcetera, in understanding how separate the network is
in terms of different component structures.
[BLANK_AUDIO]
Okay, so that, that does it for this lecture.
our next lecture is going to start looking at path lengths and diameters.