# Bayesian Statistics: Techniques and Models, 1.13 (V) Monte Carlo Integration

[MUSIC] Before we learn how to simulate from complicated posterior distributions. Let's review some of the basics of Monte Carlo estimation. Monte Carlo estimation refers to simulating hypothetical draws from a probability distribution. In order to calculate important quantities of that distribution. Some of these quantities might include the mean, the variance, the probability of some event, or the quantiles of the distribution. All of these calculations involve integration. Which, except in the simplest cases, can be very difficult or even impossible to compute analytically. Let's suppose we have a random variable theta. This random variable theta is going to follow a gamma distribution with shape a and rate b. And let's suppose a = 2. And b equals 1/3. To calculate the mean of this distribution, we would need to compute the following integral. The expected value of theta is equal to the integral over the whole space for theta. Which in the case of a gamma is from 0 to infinity theta times the density for theta. Now in the case of a gamma, this equals the integral 0 to infinity theta times b to the power of a. Over a gamma function evaluated at a, times theta to the a-1, and e to the -b theta. This is the integral we will need to compute. And in this particular case, it is possible to compute this integral. The answer is a over b, or with these particular values the answer is 6. However, we can verify this answer using Monte Carlo estimation. To do so, we will simulate a large number of draws. Let's call them theta star i, for draw i in one up to some large number m. We would simulate this from this gamma distribution. And to estimate the sample, or to estimate the actual theoretical mean of this distribution. We would take the sample mean or the average of these draws. Why can we do this? Recall from the previous course that if we have a random sample from a distribution. The average of those samples converges in probability to the true mean of that distribution by the Law of Large Numbers. Furthermore, by the central limit theorem, this sample mean or the sample mean of these pieces right here. We're going to call that theta bar star, which is 1/m times the sum of our simulated values. Theta star i. This sample mean right here approximately follows a normal distribution. Where the mean of that normal distribution is the true expected value. And the variance of that is the true variance of the distribution divided by m, the sample size. The theoretical variance of theta is the following integral. The variance of theta just like the expected value, we did up here. We're calculating the expected value of theta minus its expected value squared. And of course, multiplying by the density of theta. Just like we did with the mean, we could approximate this variance with the sample variance. This method of Monte Carlo estimation can be use to calculate many different integrals. Let's say, for example, that h of theta is function of theta, and we want to calculate the following integral. h of theta times the density of theta, d theta. This integral is precisely what is meant by the expectation of a h of theta. So, conveniently, we can approximate it by taking a sample mean of h evaluated for each of these samples. That is if we were to calculate the sample mean. Where we evaluate the h function on each of our simulated samples of theta. This quantity would approximate this expected value which is this integral. So we would apply this function h to every simulated sample and then take the average of the results. One extremely useful example of such an h function is the indicator function, written like this. It's a function of theta and a here would be some sort of logical condition about the value of theta. To demonstrate this let's suppose that in an example, let's say that h of theta is the indicator that theta is less than 5. This function here, this indicator will give us a 1 if theta is indeed less than 5, or it will return a 0 otherwise. So what is the expected value, Of h of theta here? Well this is the intergral from zero to infinity, once again, assuming that theta is coming from a gamma distribution. The expected value of h of theta will be h of theta this indicator function, times the density of theta. So if we evaluate this, this function will come out to be a 1. If theta is between 0 and 5, so 0 to 5. 1 times the density of theta plus the rest of the values 5 up to infinity of 0 times the density. This piece right here is going to disappear. And this right here, if we integrate the density between these two values. That's simply calculating the probability that 0 is less than theta is less than 5. The probability that theta is between these two numbers. So what does this mean? It means that we can approximate this probability right here by drawing many samples, theta star I. And approximating this integral with the following. We would take the sample mean, Of these indicator functions where theta star is less than 5. And apply it to our simulated values. What this function does is simply counts how many of our simulated values meet this criteria. And then divides by the total number of samples taken. So this approximates the probability that theta is less than five, that's pretty convenient. Likewise, we can approximate quantiles of a distribution. If we're looking for a value z, that makes it so that the probability of being less than z is 0.9 for example. We would simply arrange the samples theta star I in ascending order. And then we would find the smallest value of the theta stars that's greater than 90% of the others. In other words we would take the 90th percentile of the theta stars to approximate the 0.9 quantile of the distribution. [MUSIC]

[MUSIC] Before we learn how to simulate from complicated posterior distributions. Let's review some of the basics of Monte Carlo estimation. Monte Carlo estimation refers to simulating hypothetical draws from a probability distribution. In order to calculate important quantities of that distribution. Some of these quantities might include the mean, the variance, the probability of some event, or the quantiles of the distribution. All of these calculations involve integration. Which, except in the simplest cases, can be very difficult or even impossible to compute analytically. Let's suppose we have a random variable theta. This random variable theta is going to follow a gamma distribution with shape a and rate b. And let's suppose a = 2. And b equals 1/3. To calculate the mean of this distribution, we would need to compute the following integral. The expected value of theta is equal to the integral over the whole space for theta. Which in the case of a gamma is from 0 to infinity theta times the density for theta. Now in the case of a gamma, this equals the integral 0 to infinity theta times b to the power of a. Over a gamma function evaluated at a, times theta to the a-1, and e to the -b theta. This is the integral we will need to compute. And in this particular case, it is possible to compute this integral. The answer is a over b, or with these particular values the answer is 6. However, we can verify this answer using Monte Carlo estimation. To do so, we will simulate a large number of draws. Let's call them theta star i, for draw i in one up to some large number m. We would simulate this from this gamma distribution. And to estimate the sample, or to estimate the actual theoretical mean of this distribution. We would take the sample mean or the average of these draws. Why can we do this? Recall from the previous course that if we have a random sample from a distribution. The average of those samples converges in probability to the true mean of that distribution by the Law of Large Numbers. Furthermore, by the central limit theorem, this sample mean or the sample mean of these pieces right here. We're going to call that theta bar star, which is 1/m times the sum of our simulated values. Theta star i. This sample mean right here approximately follows a normal distribution. Where the mean of that normal distribution is the true expected value. And the variance of that is the true variance of the distribution divided by m, the sample size. The theoretical variance of theta is the following integral. The variance of theta just like the expected value, we did up here. We're calculating the expected value of theta minus its expected value squared. And of course, multiplying by the density of theta. Just like we did with the mean, we could approximate this variance with the sample variance. This method of Monte Carlo estimation can be use to calculate many different integrals. Let's say, for example, that h of theta is function of theta, and we want to calculate the following integral. h of theta times the density of theta, d theta. This integral is precisely what is meant by the expectation of a h of theta. So, conveniently, we can approximate it by taking a sample mean of h evaluated for each of these samples. That is if we were to calculate the sample mean. Where we evaluate the h function on each of our simulated samples of theta. This quantity would approximate this expected value which is this integral. So we would apply this function h to every simulated sample and then take the average of the results. One extremely useful example of such an h function is the indicator function, written like this. It's a function of theta and a here would be some sort of logical condition about the value of theta. To demonstrate this let's suppose that in an example, let's say that h of theta is the indicator that theta is less than 5. This function here, this indicator will give us a 1 if theta is indeed less than 5, or it will return a 0 otherwise. So what is the expected value, Of h of theta here? Well this is the intergral from zero to infinity, once again, assuming that theta is coming from a gamma distribution. The expected value of h of theta will be h of theta this indicator function, times the density of theta. So if we evaluate this, this function will come out to be a 1. If theta is between 0 and 5, so 0 to 5. 1 times the density of theta plus the rest of the values 5 up to infinity of 0 times the density. This piece right here is going to disappear. And this right here, if we integrate the density between these two values. That's simply calculating the probability that 0 is less than theta is less than 5. The probability that theta is between these two numbers. So what does this mean? It means that we can approximate this probability right here by drawing many samples, theta star I. And approximating this integral with the following. We would take the sample mean, Of these indicator functions where theta star is less than 5. And apply it to our simulated values. What this function does is simply counts how many of our simulated values meet this criteria. And then divides by the total number of samples taken. So this approximates the probability that theta is less than five, that's pretty convenient. Likewise, we can approximate quantiles of a distribution. If we're looking for a value z, that makes it so that the probability of being less than z is 0.9 for example. We would simply arrange the samples theta star I in ascending order. And then we would find the smallest value of the theta stars that's greater than 90% of the others. In other words we would take the 90th percentile of the theta stars to approximate the 0.9 quantile of the distribution. [MUSIC]