. So here's a backup envelope calculation first . So suppose we look at two possibilities of multicasting of files for fixed size, okay? The fixed size is f, bits, and then we look at the first possibility, which is going from the server to all the clients. So these are the clients, one, two, three, four. Each client has a certain download speed D1, D2, D3, D4. And then the server need to upload this and has a total of use of S as the upload speed. Okay? So now we have to look at the time that it takes to finish multi casting this file of fixed size f abits. First of all, I have to make sure that even the slowest to down loaders among declines, can finish receiving. So let's denote the slowest down loaders, download speed by D sub mean. The minimum one, and he has to finish down loading a file size of F. So there's the amount of time it takes, and second the server need to multicast this. Therefore you need to do it N times, in this case N equals four, but in general N is the number of clients. So you got to look at the time it takes to send out N times F bits using the speed of U sub S, the upload speed of the server, so U sub S, for server. And these two numbers. One of them maybe smaller then the other, so that will be the bottleneck. And therefore, we take the minimum of these two numbers. And that would be the answer to the question, how long does it take for client server architecture to finish sending this multi-casting. This file. Now in contrast suppose we do it in P to P fashion. Then, the server only need to send this out in the ideal case only to one of these peers. So that only take F divided by use of S time of course, this file needs to be downloaded by even the slowest downloader. So F divided by B mean term still remains there. And now, have to make sure that. All together eventually NF number of bits needs to be sent around one way or another. And the ideal case is that everybody's upload speed are used, including the server's upload speed, u sub s, and the summation of all the upload speed from all the piers in next to, by I. From one up to N. Is there n such peers? If, you can indeed achieve this bound, this is the most ideal situation where everybody's upload speed can be utilized, but then, you can say. Well, one of these three numbers will be the smallest of the three, and that will become the bottleneck. Whether it's the service ending one piece of the, one copy of the file out or the slowest downloader or the fact that some of these slow speeds might be slow. But since I'm using all the piers as servers, I can leverage all of them. Assuming this can be achieved, then we have this formula. The mean of these three numbers, okay, whichever turns out to be the bottleneck is the answer to the question, how long does it take now? Now if I compare this formula with the formula on the last slide, which is this one, they look kind of alike. But there's a crucial difference. Over here, as N becomes big, okay, N goes to infinity for example, soon enough this term will dominate that term. So, we're going to have to, have, to, I guess I had a verbal type of, this should be the max of course, we are talking about the time it takes, so whichever is the longer time. So, this should be the max, and this should be the max. Okay, I'm not talking about the speed but the time it takes, alright. So, as n becomes very large, this would dominate that term in picking which one is the larger number, which one is the longer, delay component, then this term will dominate. And, how does this term scale with size of network, that's the scalability question. Well, linearly in n If there are more clients, you spend more time, and that is intuitively trivial to see. If you keep adding more clients, eventually the server will be loaded. In contrast, in this formula, we see three terms. These two terms do not scale with N, so as N becomes big, they will be dominated. It will be this term that matters. And as N becomes big, what would happen to this term? Well, the numerator will go up in any hand. But assuming the upload capacities are somewhat reasonably distributed numbers, then as N becomes big, the denominator also grows like N. And, therefore, that cancels each other out, and we have a self-scaling property . Again, intuitively, that's crystal clear because the more receivers you add, the more destinations you add in a multicast, you also add more sources. Okay. So that's the back of envelope. But this assumes the ideal situation, where you can indeed utilize everyone's upload capacity. That doesn't have to be the case, in fact, that, generally speaking, is not the case. For example, in graph A we have a particular tree. This is the server, okay, th-, the blue rectangle, and then we got four pairs, a, b, c, d. And this is a tree, because all the chunks will traverse like this. Okay? This is a tree. The parent node goes down to this, go down to this, go down to the this, has another child node down there, okay? So you can join like this. That's the tree, Okay? There is only one way to go from the root node to any of the leaf nodes, and there's no cycle, where's for. In fact just look at this graph you see that hey since node a is the parent of nodes b and d so the upload capacity of node a is utilized. Seems node b has a child. It's upload capacity will also be utilized. But what about nodes c and d? They are the leaf nodes and therefore are only on the receiving end and their upload capacities are entirely wasted. So, C, D upload capacity are wasted. And, clearly, we wouldn't be able to achieve this bound. Well, what if we use another tree? In this case, the root will go to c first, and c go to d and then d go to a and b. This is another tree. In this case AB's upload capacities are not used. Their upload capacities are wasted. That's not good either. So whichever tree you pick, you're going to waste somebody's upload capacity. But wait a minute. What if I use both? In fact, what if I use many trees? Each tree is a multicast tree, and there will be some waste of upload capacity, but if I use all of them, maybe that will be helpful. For example, I can use another one where the roots go down to A first, then A goes to C, C goes to D and B. This is the server, these are the four peers, Okay? There'll be yet another tree. It can be many of these, it just show these three trees here. Now of course you have to make sure that the upload capacity across all the trees for each pair is not exceeding this total upload capability. For example, node A is used here and here. So you better make sure that the rate you assign on this tree okay, say, you know, ten megabits per second. Now this tree you say twenty megabits per second. A has to be able to support both trees, so you have to make sure. A has at least in the 30 mega per second in the kind of uploading capacity. Okay? So you have to carefully manage where you place each pair in each tree. But, you see the problem. The problem is embedding multiple trees to support the same multicast session in a general graph. Possibly with constraints on, say, the number of children nodes you can have degree bounds. And this is a very difficult combinatorial optimization problem. Okay. It is not a continuous problem, like what we've seen before. Like in last lecture and TCP congest control. This is a graph theoretic optimization problem. In the advanced material we will look at approximation methods, again using [inaudible] problem to solve this in approximated fashion. But we can look at the exact answer . Through a interesting, development in the coming five to ten minutes for a special ideal case. But the intuition should be clear. That those peers with a large upload capacity should be placed closer to the root of the tree and support more children. And possibly show up in more of these trees too. Okay? That is the intuition. And here is the example. The example we want to look at is that, well, I have a source, okay the root or the server, okay? Indexed by, denoted by s. So I have this server sitting there with a certain upload capacity. Okay? That's the original sender. And I have n peers. They each have a certain upload, capacity ui. And this one got a u sub s. So you cannot violate a ui and u sub s constraint. Now, the questions is, how can you construct, multiple trees so that, you can achieve this bound. Okay. So let's first write down about it. We'll come back to this picture in a minute. So here is the ballot you would like to show you can achieve. Right. You want to say that the time that is take to finish multi-casting this is the mean, is the max of, three turns. Let's say, Ignore the download speed. Okay. Let's say the download speed is fast enough even for the lowest down-loader. Let's just look at the bottleneck from upload capabilities. So you're looking at the max of these two numbers. Oops running out of space, Okay? And that's the denominator here. So you want to show this is true, okay, where T is the total download time. And this is equivalent to saying that the maximum rate that you can achieve across all these multicast trees is. The mean of the reciprocal of these. Okay? Us over F , and US plus the summation UI, over NF . Okay? The smaller of these two numbers, because the largest rate is simple a proportion to one over the time it takes. Since F appears on both, let's just assume it is one unit. Whether it's one megabyte, or one gigabyte, it doesn't matter, okay? That means we want to show our max equals the mean of. Two terms, US and US plus summation UI over N. Okay? Fine, this captures the essence of the self scaling property. So we basically say that if the US is actually bigger than. Us plus sum of UI over N. Which is roughly speaking the average upload capacity, counting both server and all the peers. If that's the case, you've got a pretty powerful server. That's very good. That's one case. Okay. Then the answer is that R max should equal to the smaller of the two terms, that is this term. The other scenario is that you got a pretty. Obsolete server. Okay? Uploading capacity is very small for whatever reason. So US is actually smaller than the average of these servers and all peers uploading capacity. In that case, our Mac should be the smallest of the two. That is, it should be equal to this. So let's look at these two cases. So case one. Let's say your, server is not a very powerful one. So US is less than US + summation UI over N. Okay? This is the, condition here. And now, what kind of tree shall we construct? Well, basically, going to construct the following families of trees. There'll be n trees, one for each of these n cases. Each of these cases, you have a tool hop. Tree, okay? The depth is two. It goes to one of the n peers and then that peer will become the parent node for all of other n-1. Peers. And you do the same thing, except you swap the position of this, parent peer. Okay? Now it becomes peer #two. And then it goes down to the other n-1, one three up to n, peers. And you repeat this. Now you've got n trees. And they say that I'm going to do multicast simultaneously using all these n trees. Don't worry about this tree yet. We'll come to that in the second case. And I need to show you that I can assign the right rate, the multi cast rate. For tree one. Rate on tree two up to rate on tree n. So that the summation of these rates will achieve my, target, what I want to prove. Which is, it should be equal to the smallest of these two terms. That is, u sub s. That's what I need to prove. And also I need to prove that it's also feasible allocation of rates on these trees. Because I need to show it doesn't violate the us or the ui's. Alright so first let's show feasibility and then show achieve, achievability there. So, the feasibility is easy to see. Okay here's my solution. The solution says that I will let the rate on each tree I there in such trees to be simply. A fraction of the upload capacity of the server. What fraction? Well, the more capable you are, the more share, larger share you're going to shoulder. It will be proportional to the, to the UI, relative to the normalization, okay, of all the up, all the upload capacities, J from one, up to N. Okay? This is your UR pier number I upload capacity. And then you look at a sum of all upload capacities. Okay, so if the sum is ten and you are two, it's like you are twenty% of the capabilities of the peers. And therefore you can take twenty% of the job. You can take twenty percent of the upload capacity for the tree where you are the parent pier serving the other n-1 piers. And I conjecture this answer is eight feasible. And B, optimal, let's look at feasible part first. Is that feasible for the server? Well, sure of course, by construction it is trivially true, add up all the rice on these trees. N of them which equals the summation of UI over the summation of UJ, US. While this two will cancel each other you get US, very good. Not only it's feasible, you can smell something good, because it's fully utilizing all the upload capacities. Okay. Is, this one's upload using all the server's upload capacity, that's a good design. What about the upload capacity for each pair? Well, each pair need to say that, I am the parent node in one tree and one tree only okay, and that tree has a rate RI. And I need to support N - one children. What is this ? Well, this is simply by the definition of my . Expression for the rate on each treat, I equals this. So, I need to say it. Is this expression indeed a less than or equal to ui? That's a question mark. Okay. I need this to be smaller than or equal to UI. Is that true or not? Well, that's equivalent to ask, we can cancel UI on both sides, whether N minus one times US divided by sum of UJs is less than or equal to one. It's equivalent to ask if N minus one times US is less than or equal to summation of UJ. Still a question mark, I don't know if that's true or not. That's equivalent to asking whether us, is less than equal to n times us, less than equal to us plus summation uj. And that's equivalent to asking if US is less than or equal to US plus sum of UJ over N. Is that true? Okay, I'm carrying this question mark over. Well, let's look at the condition here. The very premise behind the case one is that, US is smaller than the US + sum UI over N. I'm just changing this dummy index variable from I to J here. Otherwise, it's identical expression. And therefore, this is indeed true by the very first assumption that defines what's case one. Case one is your server capacity upload. Capacity is not big enough. All right, so, this is true, therefore this is true, therefore this is true, and therefore it is indeed feasible. Very good. Now, is it also achieving? What we want to show. Okay? What do we want to show again? We want to show that the h-. Total rate. Under this allocation of per tree rate = the mean of these two numbers. And in this case, that means = u sub s. So that's what we need to show. Well, what is the total rate achieved we claim to be the maximum rate? It is the sum of all the rates that each peer gets. Well, first of all peer I is getting a certain rate directly from the source. From the server, remember. If you are, say, peer one, one in this tree where you support the N-1, peers. You are getting a rate, of R1 direct from the server. And then for N-1 other trees, you are getting the rate that's rec-, giving to you. You now become, a leaf node from the other N-1 peers. So on your own tree you're getting RI and on the other trees you're getting RJ for J not equal to I. But hold on a second, this is the same as summation of RI over all the I's which is the same as summation, write down the expression, for RI. Ui. Over sum of UJ times US . Us doesn't even depend on INJ. These two terms cancel each other, that equals US. That we have proof what we wanted to show, okay? Are max in this case one is indeed achieving the formula use of S. Very good. So in the case where u sub s. Server isn't powerful enough, we found a way. Okay? Simply by assigning, in proportional terms, the upload capacity to different autocast trees rates, we have shown it is feasible and achieving the answer. So what about the other case. Second to last case? Now in this case, US server capacity is pretty powerful. Okay? Us is. Bigger than US plus the summation of UI over I divided by N. So in this case we want to show that R max equals the small of the two terms that is equals to this term. Can we do that? Well, it turns out in this case we're going to need not just N trees that are two half trees. But also another tree, one more tree. And that's one half tree. Intuitively what happens is that since your service is so powerful, even after finishing talking to N trees. And let each of them talk to the other N minus trees. I still have some leftovers. The left over capacity for this server upload capacity is going to be utilized. You can't just let it go waste. If you do that you won't be able to achieve this nice result. So you have to. Do the following, okay. You can just give the left over whatever that is, directly in a one hop tree, exactly just a star topology, it's a client server topology to all the other N nodes okay? All the peers, and this we'll label as tree number N + one. So all we need to do now is to assign the right rate for each of these entries and assign the rate for the M plus 1th tree. And then show it's feasible in achieving the right answer. So I'm going to claim that, now I can assign the rate on each tree. The first entry is to be simply. The corresponding pair, that's the parent pair in that tree, okay? That pair's upload capacity divided by N minus one. Now this makes sense because then I know immediately that it satisfy all the upload capacities because it is the not a leave note only in one, one tree and in that tree it is the parent no serving N minus one children, so N minus one time, times RI is nothing but UI. So you are clearly fully utilizing each individual users upload capacity. Very good. And what about the n plus 1th trees rate? Well, that's actually easy to calculate, you just look at the server's upload capacity. Okay? Minus the summation of all the. Upload capacity . Okay. Each divided by N minus one. Okay? So that is RI. The summation of RI. This is RI term, across all the trees. Alright? This is the leftover capacity. And, as we just talked about, you just spread it evenly among these end pairs. So the right is going to be this . And this immediately implies that the upload capacity of the server is fully utilized and not violated because, what is the server capacity's, what is the server's total upload loading? Well, you need to upload to all these entries that is summation of UI. Over n-1 summing over i. Okay? Then you need to support. Talking directly to n peers with the rate . Of this n + one tree. Okay? And what is that? Well if you just look at the rate there, you multiply N over there, is nothing but US minus sum. And therefore you add the sum, that gives you nothing but use of S. So again, you're fully utilized and do not violate the upload capacity of the server. Just like what you did with each of the N clients, N peers. Alright, so by our design we have demonstrated that it is a feasible. This way of constructing trees and assigning the rates on them is indeed feasible. It doesn't violate upload capacity. The only thing left is to show that it's achieving. Our target result. Alright, let's look what is the rate that any peer I receives. Any peer I receive first the rate directly from the server in the I tree, where is the parent peer. And then, there are. And minus one trees , where it's receiving this from the other pairs, okay? So if you are peer one, you got some right direct from the server in this tree. And then, for the other N-1 trees, you got, the rate as a leaf node from the other N-1 peers. Very good. And then there's the N+1th tree. The last tree where the server shares the leftover upload capacities. That is, the rate upon that tree. Now you can plug in the definition of these Ri, Rj's. And that, star topology tree and cancellation of the terms lead you to nothing but u sub i Summation over I / by n. Exactly what we wanted to achieve. Okay? Our or max should be the smallest of the two. In case two, this is the smaller of the two. And that's exactly what you get. In other words, now we have proofs combining both cases one and two that our max is indeed the min of us. Or, and US plus summation UI over N. Now that implies therefore, the time it takes to download is the max of one over US and N over. The sum of US and all the UIs. That's exactly what we wanted to show. So in this ideal case with no topology constraints and so we have actually shown the back of the envelope calculation. This calculation is indeed achievable. So advanced material we'll look at more general case, in a much tougher cases for approximation results. But let's summarize at this point. We have seen the power of building overlay networks, whether Skype or Bit Torrent. Immensely popular internet services. And this P to P log quantifies the positive network effect in application layer multicast context. And we've just demonstrated through the back of the calculation example that you do not need to have a degree that grows as a linear function number of nodes. A smart pairing can do a magic. And ideas such as tit-for-tat also solves free-rider problems and enable the BitTorrent to scale up. And that is the scaling up property for P to P In next lecture, we look at a scaling of property of a different kind of content distribution system, the Cloud.