Subject 6. Expected Value, Variance, and Standard Deviation of a Random Variable


The expected value of a random variable is the probability-weighted average of its possible outcomes. Each outcome is weighted by its probability of occurring, so the more probable outcomes carry a greater weight in the overall calculation.

For a random variable X, the expected value of X is denoted E(X).

E(X) = P(x₁)x₁ + P(x₂)x₂ + ... + P(xₙ)xₙ

In investment analysis, forecasts are frequently made using expected value, for example, the expected value of earnings per share, dividend per share, rate of return, etc. It represents the central value of all possible outcomes.

Example

The organizers of an outdoor event know that the success of the event depends on the weather. It costs $50,000 to stage the event. If the weather is favorable, the organizers will take in $200,000. If the weather is moderate, the organizers will take in $80,000. If the weather is unfavorable, the organizers will be forced to abandon the event, and thus take in $0. The weather bureau forecasts that the chances of favorable, moderate and unfavorable weather are 20%, 30% and 50% respectively. Should the organizers go ahead and stage the event?

We can use expected value to work out what revenue the organizers can expect to generate. Once we have this number, we can compare it with the cost of the event, $50,000, to assess whether the venture is likely to be profitable.

Using the expected value formula, we will multiply each amount by its probability, and add the answers. E(X) = 200,000 x 0.2 + 80,000 x 0.3 + 0 x 0.5 = 40,000 + 24,000 + 0 = $64,000

Thus, the organizers can expect to take in $64,000. Since it costs $50,000 to stage the event, this translates to an expected profit of $14,000, so on an expected-value basis they should go ahead with the venture.
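The calculation above can be sketched in a few lines of Python. This is only an illustration: the names outcomes, COST and expected_value are chosen here for the example and do not come from the reading.

```python
# Revenue and forecast probability for each weather scenario in the example.
outcomes = {
    "favorable": (200_000, 0.2),
    "moderate": (80_000, 0.3),
    "unfavorable": (0, 0.5),
}
COST = 50_000  # cost of staging the event


def expected_value(pairs):
    """Return the probability-weighted average of (value, probability) pairs."""
    pairs = list(pairs)
    total_probability = sum(p for _, p in pairs)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in pairs)


expected_revenue = expected_value(outcomes.values())
expected_profit = expected_revenue - COST

print(f"Expected revenue: ${expected_revenue:,.0f}")
print(f"Expected profit:  ${expected_profit:,.0f}")
```

The check that the probabilities sum to 1 is an added safeguard, not part of the formula itself; it simply catches scenario lists that are not exhaustive.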

It's important to realize that none of the outcomes actually produces an amount of $64,000. This is simply the weighted average of all possible outcomes. Although there is a 50% chance of a loss, the profits made the remaining 50% of the time ($150,000 in favorable weather and $30,000 in moderate weather) more than offset this and create an overall expected profit.

However, a one-off event carries a major risk, particularly if the weather turns out to be unfavorable. An easier way to interpret expected value is as follows: if a large number of such events were held, the organizers could expect to achieve an average profit of $14,000 per event. So expected values actually make more sense when viewed over the long run.
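The long-run interpretation can be illustrated with a small simulation. The sketch below assumes the event could be repeated many times under the same forecast probabilities, which is a hypothetical setup; the seed and the number of repetitions are arbitrary choices made for this example.

```python
import random

values = [200_000, 80_000, 0]    # revenue in favorable, moderate, unfavorable weather
probabilities = [0.2, 0.3, 0.5]  # forecast probabilities from the example
COST = 50_000

rng = random.Random(42)  # fixed, arbitrary seed so the run is repeatable
n_events = 100_000       # hypothetical number of repeated events

# Draw one revenue figure per simulated event.
revenues = rng.choices(values, weights=probabilities, k=n_events)

average_profit = sum(revenues) / n_events - COST
share_with_loss = sum(r < COST for r in revenues) / n_events

print(f"Average profit per event: ${average_profit:,.0f}")
print(f"Share of events that lose money: {share_with_loss:.1%}")
```

With this many repetitions the average profit should settle close to $14,000, even though about half of the individual events lose money.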

The variance of a random variable is the expected value (the probability-weighted average) of squared deviations from the random variable's expected value.

σ²(X) = E{[X - E(X)]²}

Variance is a number greater than or equal to 0.

  • If it is 0, there is no dispersion or risk. The outcome is certain.
  • Variance greater than 0 indicates dispersion of outcomes.
  • Increasing variance indicates increasing dispersion, if all other factors are equal.
  • Variance of X is a quantity in the squared units of X, which makes it difficult to interpret directly.

The standard deviation is the positive square root of variance. It is expressed in the same units as X.

Variance and standard deviation measure the dispersion of possible outcomes around the expected value of the random variable. If all other factors are equal, increasing variance or standard deviation indicates increasing dispersion of the possible outcomes.

In the example above, we calculated the expected value of revenue to be $64,000. This was before we subtracted the costs. To calculate the variance of the organizers' revenue, we simply take each value, subtract 64,000, square the answer, multiply by the relevant probability in each case, and add.

Var(X) = [200,000 - 64,000]² x 0.2 + [80,000 - 64,000]² x 0.3 + [0 - 64,000]² x 0.5 = 5,824,000,000

The standard deviation is the square root of this number. So, SD(X) = √5,824,000,000 ≈ $76,315.
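The variance and standard deviation of the revenue can be reproduced with the short sketch below. The variable names are illustrative. The second calculation uses the standard identity Var(X) = E(X²) - [E(X)]², which is not stated in the text above but gives the same result and serves as a cross-check.

```python
import math

values = [200_000, 80_000, 0]    # revenue in each weather scenario
probabilities = [0.2, 0.3, 0.5]  # forecast probabilities

# Expected value: probability-weighted average of the outcomes.
mean = sum(v * p for v, p in zip(values, probabilities))

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum(p * (v - mean) ** 2 for v, p in zip(values, probabilities))
std_dev = math.sqrt(variance)

# Cross-check with the identity Var(X) = E(X^2) - [E(X)]^2.
variance_shortcut = sum(p * v**2 for v, p in zip(values, probabilities)) - mean**2

print(f"E(X)   = {mean:,.0f}")
print(f"Var(X) = {variance:,.0f}")
print(f"SD(X)  = {std_dev:,.2f}")
```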

These numbers are often large, particularly if your original data comprises large numbers, as is the case here. The size of the variance on its own says little, because it is expressed in squared dollars. The standard deviation, however, is in the same units as revenue: at about $76,315 it exceeds the expected revenue of $64,000, which indicates that the possible outcomes are scattered far from the expected value.

Parallel to the total probability rule, which states unconditional probabilities in terms of conditional probabilities, the total probability rule for expected value states (unconditional) expected values in terms of conditional expected values.

  • E(X) = E(X|S)P(S) + E(X|Sᶜ)P(Sᶜ)
    (where Sᶜ is the complement of S, that is, the event that S does not occur.)
  • E(X) = E(X|S₁)P(S₁) + E(X|S₂)P(S₂) + ... + E(X|Sₙ)P(Sₙ)
    (where S₁, S₂, ..., Sₙ are mutually exclusive and exhaustive scenarios or events.)

The general case, the second equation, states that the expected value of X equals the expected value of X given Scenario 1, E(X|S₁), times the probability of Scenario 1, P(S₁), plus the expected value of X given Scenario 2, E(X|S₂), times the probability of Scenario 2, P(S₂), and so on.

In investments, we make use of any relevant information available in making our forecast. When we refine our expectations or forecasts, we are typically making adjustments based on new information or events; in these cases we are using conditional expected values. The expected value of a random variable X given an event or scenario S is denoted E(X|S).

Relating the formula to the example above and using the following notation:
X = revenue, F = favorable weather, M = moderate weather, U = unfavorable weather, the formula becomes:
E(X) = E(X|F) x P(F) + E(X|M) x P(M) + E(X|U) x P(U)

Note that the right-hand side has three terms because there are three possible weather scenarios.

The E terms on the right are calculated as follows:
E(X|F) = Expected value (Revenue | Favorable weather) = 200,000, because if the weather is favorable, the revenue will be $200,000.
Similarly, E(X|M) = 80,000 and E(X|U) = 0.

So, E(X) = 200,000 x 0.2 + 80,000 x 0.3 + 0 x 0.5 = 40,000 + 24,000 + 0 = 64,000.

This is the same answer that we calculated before; the formula above is just another way of carrying out the same calculation.

Note that had there been ten different weather scenarios, the right-hand side would contain ten different terms. The key information is that the different weather scenarios are both mutually exclusive and exhaustive.
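The total probability rule for expected value can be sketched as follows. The helper total_expectation and the variable names are illustrative. The second half groups favorable and moderate weather into a single scenario S ("the event takes place"), a grouping assumed here only to show the two-scenario form of the rule; the conditional expectation E(X|S) is derived from the example's figures rather than given in the text.

```python
def total_expectation(conditionals):
    """Sum P(S_i) * E(X|S_i) over mutually exclusive, exhaustive scenarios."""
    conditionals = list(conditionals)
    if abs(sum(p for p, _ in conditionals) - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(p * e for p, e in conditionals)


# Three scenarios: (P(S_i), E(X|S_i)) for favorable, moderate, unfavorable weather.
three_scenarios = [(0.2, 200_000), (0.3, 80_000), (0.5, 0)]
e_three = total_expectation(three_scenarios)

# Two scenarios: S = favorable or moderate weather, S^C = unfavorable weather.
p_s = 0.2 + 0.3
e_given_s = (0.2 * 200_000 + 0.3 * 80_000) / p_s  # revenue expected given S
p_sc = 0.5
e_given_sc = 0.0                                  # event abandoned, no revenue
e_two = total_expectation([(p_s, e_given_s), (p_sc, e_given_sc)])

print(f"E(X) from three scenarios: {e_three:,.0f}")
print(f"E(X|S) = {e_given_s:,.0f}, E(X) from two scenarios: {e_two:,.0f}")
```

Both groupings give the same unconditional expected revenue, which is the point of the rule: any partition of the outcomes into mutually exclusive and exhaustive scenarios leads to the same E(X).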


