In this video, I'll explain the relation between conditional probability, decision trees, and an equation that relates different conditional probabilities: Bayes' law. Let's look at conditional probability as a way in which a sample space is reduced. Once you know that event B has occurred, the sample space is effectively reduced to a new, smaller sample space. When calculating conditional probabilities, you consider the probabilities for event A within this reduced sample space rather than the complete sample space. The important point is that all the properties that hold for events in the complete sample space apply to this reduced sample space as well. For example, if A in the reduced sample space, that is, A given B, can take three values, the sum of these three conditional probabilities should be one.

Let's apply this to an example. You've counted the different activities of the people on your beach, distinguished by gender, and turned the counts into this table of probabilities. The conditional probabilities for the three activities, given that a person is male, are obtained by dividing the respective joint probabilities by 0.45, the marginal probability of being male. For females, the conditional probabilities for the activities are obtained by dividing by 0.55. The resulting conditional probabilities for the activities given gender are shown here, and the same can be done for the conditional probabilities of gender given activity. You can see that the rule stating that the sum of all conditional probabilities should be one does apply.

Sometimes you deal with conditional probabilities without really noticing. You may have done so already while making calculations with a tree diagram. Let's turn the table with joint and marginal probabilities into a tree diagram. You can imagine that at the beach you would first look at all the people resting, and subsequently count the number of male and female persons in that group.
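As a small sketch of this calculation, here is the beach table in Python. The marginals P(male) = 0.45 and P(female) = 0.55 come from the example; the joint probabilities and the activity names ("resting", "swimming", "walking") are hypothetical fill-ins, since the video's exact table is not reproduced here.

```python
# Hypothetical joint probability table P(activity, gender).
# Only the marginals 0.45 (male) and 0.55 (female) are from the example;
# the individual joint values and activity names are assumed.
joint = {
    ("resting",  "male"): 0.20, ("resting",  "female"): 0.30,
    ("swimming", "male"): 0.10, ("swimming", "female"): 0.15,
    ("walking",  "male"): 0.15, ("walking",  "female"): 0.10,
}

# Marginal probability of each gender: sum the joints over activities.
p_gender = {}
for (activity, gender), p in joint.items():
    p_gender[gender] = p_gender.get(gender, 0.0) + p

# Conditional P(activity | gender) = P(activity and gender) / P(gender).
cond = {(a, g): p / p_gender[g] for (a, g), p in joint.items()}

# The three conditional probabilities per gender sum to one,
# just like probabilities in the complete sample space.
for g in ("male", "female"):
    total = sum(cond[(a, g)] for a in ("resting", "swimming", "walking"))
    print(g, round(total, 10))
```

Dividing each joint probability by the gender's marginal is exactly the "reduced sample space" idea: within the group of males, the three activity probabilities again add up to one.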
And the same for the other two activities. So your tree would first split over activity and then split over gender within each activity branch. Now you'd find these probabilities along each branch. The marginal probabilities for the activities are placed at the first node. In the second step, each activity is split by gender. Here you deal with the conditional probabilities of gender given that the activity is known. You calculate these probabilities by dividing the count of males and females per activity by the total number of people in that activity, not the total on the entire beach. The joint probabilities at the end result from multiplying the probability per activity with the conditional probabilities of gender given activity.

So, in this tree diagram, the probabilities that you encounter at the first node are marginal probabilities, for the activities in this case. The probabilities at the second node are conditional probabilities, for gender given a certain activity. And the probabilities at the end of the tree are the joint probabilities. To get the marginal probabilities for gender, you add the respective joint probabilities.

If you interchange gender and activity in the tree diagram, the marginal and conditional probabilities change, but the joint probabilities remain the same. This leads to an interesting insight. The joint probability of A and B equals the conditional probability of A given B times the probability of B, but it also equals the conditional probability of B given A times the probability of A. Therefore, we can express the conditional probability of A given B in terms of B given A: P(A|B) = P(B|A) P(A) / P(B). This equation is known as Bayes' law.
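The insight that both tree orderings produce the same joint probabilities can be checked numerically. The numbers below are hypothetical (only P(female) = 0.55 comes from the example): one cell of the table, swimming and female, is factorized both ways.

```python
# Assumed numbers: P(female) = 0.55 is from the example; the marginal
# P(swimming) and the joint P(swimming and female) are hypothetical.
p_swim = 0.25            # P(A): marginal probability of swimming
p_female = 0.55          # P(B): marginal probability of female
p_joint = 0.15           # P(A and B): swimming and female

p_swim_given_female = p_joint / p_female   # P(A|B)
p_female_given_swim = p_joint / p_swim     # P(B|A)

# Both tree orderings recover the same joint probability:
# P(A and B) = P(A|B) * P(B) = P(B|A) * P(A).
assert abs(p_swim_given_female * p_female - p_joint) < 1e-12
assert abs(p_female_given_swim * p_swim - p_joint) < 1e-12

# Equating the two products gives Bayes' law:
# P(A|B) = P(B|A) * P(A) / P(B).
bayes = p_female_given_swim * p_swim / p_female
assert abs(bayes - p_swim_given_female) < 1e-12
```

Whichever variable you split on first, the end-of-branch products agree, and dividing one factorization by the other is all Bayes' law amounts to.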
It is, in fact, the shortened version, because often the marginal probability of B is not known directly but has to be computed as the sum, over the outcomes of A, of the conditional probability of B given that outcome times the probability of that outcome. If we return to our example once more, Bayes' law allows us to calculate the probability that, for example, a person is swimming given that the person is female, while we only have the reversed conditional probability at our disposal: that a person is female given that this person is swimming. The conditional probability that someone is female given that the person is swimming is multiplied by the probability of swimming, and this is divided by the sum, over the three activities, of the probability of each activity times the conditional probability of being female given that activity.

In simple, discrete probability calculations, Bayes' law is a convenient rule which follows from the axioms of probability calculus. However, it is often interpreted in a more abstract way, where the left-hand side of the equation is understood as the degree of belief in hypothesis A after observing B, and the right-hand side gives the belief in A prior to knowing B times the support that B provides for A. In this context, the probability of A is called the prior probability, because it is your knowledge about A prior to observing B, and the conditional probability of A given B is called the posterior probability. In this wider context there are no simple frequency counts to inform you about the belief in A, so it needs to be assessed qualitatively. Considering probability as a belief, rather than a frequency-based quantity, is quite a different viewpoint, and has led to a lot of philosophical debate among the proponents of the different interpretations. But in practical situations, the perspectives appear not to be too far apart. In any realistic analysis there is prior knowledge involved, as well as assumptions and subjective choices, while wherever possible one uses observations to estimate probabilities.
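The full form of Bayes' law described above can be sketched as follows. All numbers and activity names here are hypothetical stand-ins for the video's table; what matters is the shape of the calculation: the denominator is built from the same reversed conditionals as the numerator.

```python
# Hypothetical inputs: activity marginals and the reversed
# conditionals P(female | activity). The actual values in the
# video's table may differ.
p_activity = {"resting": 0.50, "swimming": 0.25, "walking": 0.25}
p_female_given = {"resting": 0.60, "swimming": 0.60, "walking": 0.40}

# Denominator: total probability of being female,
# P(female) = sum over activities of P(female|activity) * P(activity).
p_female = sum(p_female_given[a] * p_activity[a] for a in p_activity)

# Full Bayes' law:
# P(swimming | female) = P(female | swimming) * P(swimming) / P(female).
p_swim_given_female = (
    p_female_given["swimming"] * p_activity["swimming"] / p_female
)
print(round(p_swim_given_female, 4))
```

Note that the marginal P(female) never had to be supplied directly: it is reconstructed from the three activity branches, which is exactly why this expanded denominator appears in the "long" version of Bayes' law.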
Let me summarize what I have explained in this video. The probability of A conditional on B can be considered as the probability of A in the reduced sample space where B has occurred. A tree diagram contains different kinds of probabilities: at the first node it has marginal probabilities, and at any node further on it has conditional probabilities. The joint probabilities at the end result from multiplying the marginal and conditional probabilities along a branch. The fact that the joint probability can be calculated from the conditional probability of A given B, as well as from that of B given A, leads to Bayes' law, which relates these two conditional probabilities. Bayes' law is also used to express how a prior belief in hypothesis A can be updated by new evidence B. In that context, the probability of A is called the prior probability, and the conditional probability of A given B is called the posterior probability. Considering probability as a belief rather than a frequency-based quantity is a different theoretical starting point. However, in practice, elements of the two are often combined.