Random Variables and Expected Values

Suppose an experiment has several outcomes.  Up to now, we've only concerned ourselves with the probabilities of the outcomes.  But suppose that there is a loss or gain associated with each outcome.  For example, in a gambling game, there might be some outcomes that have a payoff, but other outcomes which result in losing your stake.

A random variable is a variable (function) that associates a number with each outcome.  [Technically: a random variable is a real-valued function whose domain is the sample space.]

Example:  The game Chuck-a-Luck.  You pick a number from 1 to 6.  You roll three dice.  If your number doesn't appear on any of the dice, you lose $1.  If your number appears on exactly one die, you win $1.  If your number appears on exactly two dice, you win $2.  If your number appears on all three dice, you win $3.  So every outcome has one of the values -1, 1, 2 or 3 associated with it.  The random variable is: how much you win or lose on the game.
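The idea that a random variable is a function on the sample space can be sketched in Python.  The name `winnings` and the choice to represent an outcome as a tuple of three die faces are assumptions made for this illustration; they are not part of the game's description.

```python
from itertools import product

# Sample space: every ordered roll of three dice.
SAMPLE_SPACE = list(product(range(1, 7), repeat=3))


def winnings(outcome, pick):
    """Random variable: dollars won (negative means lost) on one roll.

    `outcome` is a tuple of three die faces; `pick` is the number bet on.
    """
    matches = outcome.count(pick)
    # No match loses the $1 stake; otherwise you win $1 per matching die.
    return -1 if matches == 0 else matches


print(len(SAMPLE_SPACE))        # size of the sample space
print(winnings((2, 5, 6), 4))   # the pick 4 does not appear
print(winnings((4, 4, 1), 4))   # the pick 4 appears on two dice
```

Every one of the outcomes in `SAMPLE_SPACE` is mapped to exactly one number, which is all that the definition of a random variable asks for.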

Example:  The spreadsheet AutoInsur.xls gives a simple simulation of an insurance policy.  In the sheet, several different outcomes (possible accidents) are given, with their probabilities; each outcome has a numerical loss associated with it.  So the loss due to possible accidents is a random variable.

Frequently we are interested in the average outcome of an experiment.  For example, we might ask how much on average we lose per game if we play a certain gambling game many times, or what the average loss per vehicle is if an insurance company insures many cars of a certain type.  This long-run average is known as the expected value of a random variable.

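Example:  Chuck-a-Luck.  There are 216 possible outcomes when three dice are rolled (6 times 6 times 6).  For instance, your number fails to appear in 125 of them (5 times 5 times 5), since each die can show any of the 5 other numbers.  The number of outcomes and the probability of each event are as follows: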
 
event                              number of outcomes in that event    probability of that event
your number doesn't appear         125                                 125/216 = 0.5787
your number appears once           75                                  75/216 = 0.3472
your number appears twice          15                                  15/216 = 0.0694
your number appears three times    1                                   1/216 = 0.0046

So on average, in every 216 plays of the game, your number doesn't appear 125 times, it appears once 75 times, it appears twice 15 times, and it appears three times just once.  That is, if you play the game 216 times, on average you lose $1 in 125 games, you win $1 in 75 games, you win $2 in 15 games, and you win $3 in just 1 game.  Your net earnings would therefore be 125(-$1) + 75($1) + 15($2) + 1($3) = -$17, which is in fact a loss.  The average earnings per game would therefore be -$17/216, or about -$0.08: a loss of about 8 cents per game.  This is the expected value of the game.
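The counts and the expected value above can be checked by brute force.  The following is a minimal sketch, assuming the pick is 1 (by symmetry any pick gives the same counts); the names `PICK`, `PAYOFF` and `match_counts` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

PICK = 1  # assumed pick; any number from 1 to 6 gives the same counts

# Count how many of the rolls show the pick on 0, 1, 2, or 3 dice.
match_counts = Counter(
    roll.count(PICK) for roll in product(range(1, 7), repeat=3)
)

# Payoff in dollars for each number of matching dice.
PAYOFF = {0: -1, 1: 1, 2: 2, 3: 3}

total_rolls = sum(match_counts.values())
net_over_all_rolls = sum(PAYOFF[m] * n for m, n in match_counts.items())

# Exact expected value as a fraction, to avoid rounding error.
expected_value = Fraction(net_over_all_rolls, total_rolls)

print(dict(match_counts))
print(net_over_all_rolls, expected_value, round(float(expected_value), 4))
```

Using `Fraction` keeps the result exact; converting to a float at the end is only for display.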

Example:  Auto Insurance.  A certain make of automobile worth $15,000 is insured by a certain company.  In the company's experience, the following losses occur with the following probabilities:
 
loss       probability
$0         0.80
$1,000     0.10
$5,000     0.05
$10,000    0.03
$15,000    0.02

If 1000 cars of this make are insured, we are likely to see approximately the following results:
 
loss       number of cars experiencing that loss    total loss
$0         800                                      $0
$1,000     100                                      $100,000
$5,000     50                                       $250,000
$10,000    30                                       $300,000
$15,000    20                                       $300,000
total                                               $950,000

So the average loss per car is $950,000/1000 = $950.  This is the expected value of the loss on a car.  Consequently, the insurance company should set the premium for this policy at no less than $950 if it is to avoid losing money insuring this make of car.

We can look at this calculation another way:
$950 = $0(800/1000) + $1,000(100/1000) + $5,000(50/1000) + $10,000(30/1000) + $15,000(20/1000)
     = $0(0.80) + $1,000(0.10) + $5,000(0.05) + $10,000(0.03) + $15,000(0.02).
This is the sum of each value of the random variable, multiplied by the probability of that value.
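The two ways of looking at the insurance calculation can be compared in a short Python sketch.  The variable names are illustrative assumptions, and the probabilities are entered as exact fractions (via `Fraction`) only to sidestep floating-point rounding.

```python
from fractions import Fraction

# (loss in dollars, probability) pairs, taken from the table in the example.
loss_distribution = [
    (0, Fraction("0.80")),
    (1_000, Fraction("0.10")),
    (5_000, Fraction("0.05")),
    (10_000, Fraction("0.03")),
    (15_000, Fraction("0.02")),
]

cars = 1_000

# Way 1: total loss over the expected number of cars at each loss level,
# divided by the number of cars.
total_loss = sum(loss * int(p * cars) for loss, p in loss_distribution)
average_from_counts = Fraction(total_loss, cars)

# Way 2: sum of each value multiplied by its probability.
expected_loss = sum(loss * p for loss, p in loss_distribution)

print(total_loss, average_from_counts, expected_loss)
```

Both routes give the same figure because dividing each count by the number of cars simply recovers the probability of that loss.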

In general:  Suppose a random variable assumes value x1 with probability p1, value x2 with probability p2, value x3 with probability p3, and so on.  Then the expected value of the random variable is x1p1 + x2p2 + x3p3 + ...  By the way, the set of probabilities associated with the different values of the random variable is known as the probability distribution of the random variable; these probabilities always add up to 1.
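The general formula translates almost word for word into code.  This is a minimal sketch, assuming the distribution is supplied as a list of (value, probability) pairs with exact `Fraction` probabilities; the function name `expected_value` and the check that the probabilities add up to 1 are choices made for this illustration.

```python
from fractions import Fraction


def expected_value(distribution):
    """Return x1*p1 + x2*p2 + ... for a list of (value, probability) pairs."""
    total_probability = sum(p for _, p in distribution)
    # Exact comparison is safe here only because Fractions are assumed;
    # with floats a small tolerance would be needed instead.
    if total_probability != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(x * p for x, p in distribution)


# The Chuck-a-Luck distribution from the earlier example.
chuck_a_luck = [
    (-1, Fraction(125, 216)),
    (1, Fraction(75, 216)),
    (2, Fraction(15, 216)),
    (3, Fraction(1, 216)),
]

print(expected_value(chuck_a_luck))
```

Applied to the Chuck-a-Luck distribution, the function reproduces the -$17/216 found earlier by counting games.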