Finally, uh, the gossip protocols

that we've discussed so far are not topology aware,

so, uh, when it comes to a hierarchical topology,

whether it's, uh, on the internet using subnets

or whether it's on-in a data center using, uh, racks,

uh, the core switches and routers

might get overloaded quite a bit.

So, for instance, consider a scenario

where I have two subnets, perhaps two racks;

uh, the top subnet and the bottom subnet

each with n/2 nodes, uh, from the group.

Since each node selects a gossip target uniformly at random,

about half the gossips are gonna go across from one subnet

to the other, okay, and also from the bottom to the top.

This means that the router sees, uh, n/2,

actually b*(n/2) gossips go out per, uh, gossip period

and so the load on the router

is going to be O(n), which is very, very high.

So, the fix to this is to do the following.

You have gossip that, uh, prefers nodes in your own subnet

with a high probability

and nodes outside your subnet with a lower probability.

If your subnet contains ni or n_i, uh, nodes, you, uh,

gossip, uh, um, uh, outside your subnet with probablity 1/(n_i).

With probability 1-(1/(n_i)),

you gossip within your own subnet.

So, this means that within your own subnet,

because the probability of gossiping

is still very close to one,

ah, it's gonna spread in O(log(n)) time,

and this is true for any one of these subnets.

And then after a subnet has been completely infected,

uh, since everyone gossips outside with probability 1/(n_i)

and there is n_i such nodes gossiping outside,

it takes O(1), uh, rounds on expectation

for the gossip to go across the router to the other, uh, subnet

and for someone in the other subnet to get infected.

After that, again, it's gonna take O(log(n)) rounds

to spread within that subnet itself.

So, overall it's still O(log(n)) +, uh, O(1) + O(log(n)), uh,

for it to get across

and so it's still O(log(n)) rounds dissemination time,

and instead what we have done

is that we have re-reduced the load on the router to be O(1).

Why is it O(1)?

Well, uh, because essentially you have n_i nodes in the-

in one subnet and since each of them

sends a gossip out with probability 1/(n_i),

the, uh, expected number of gossips going through the router

is (n_i*1)/(n_i), which is O(1).