It can be described as follows: Draw a number at random from an exponential distribution. That will be the arrival time of the first customer. Draw another random number from the exponential distribution. Add that to the first arrival time to get the second arrival time. Draw a third random number from the exponential distribution. Add that to the second arrival time to get the third arrival time. Keep going for as long as you wish (say, until the time the store is supposed to close). The idea is: the times between arrivals of customers are exponentially distributed.
We can simulate the Poisson process using a spreadsheet without too much trouble. Choose a number Mean that gives the average time between arrivals (in hours, or minutes, or whatever your time units are). Put the formula =-1.0*Mean*LN(RAND()) into a cell, say cell A4. (Recall that this generates a random number from the exponential distribution with mean Mean.) Then in cell A5 put the formula =A4-Mean*LN(RAND()). This adds a new random number from the exponential distribution to A4. Copy and paste that formula into cells A6, A7, etc. This will generate arrival times according to the Poisson process.
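The spreadsheet recipe above can also be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `simulate_arrivals` and the parameters `mean_gap` (playing the role of Mean) and `closing_time` are names chosen here for the example.

```python
import math
import random


def simulate_arrivals(mean_gap, closing_time, rng=None):
    """Return arrival times of a Poisson process up to closing_time.

    mean_gap is the average time between arrivals (the spreadsheet's Mean).
    """
    rng = rng or random.Random()
    arrivals = []
    t = 0.0
    while True:
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        gap = -mean_gap * math.log(1.0 - rng.random())
        t += gap  # same step as =A4-Mean*LN(RAND()) in the spreadsheet
        if t > closing_time:
            return arrivals
        arrivals.append(t)


# Assumed numbers: one arrival every 5 minutes on average, a 60-minute day.
times = simulate_arrivals(mean_gap=5.0, closing_time=60.0)
print([round(t, 2) for t in times])
```

Each run prints a different list of arrival times. Passing a seeded generator, such as `random.Random(1)`, as `rng` makes a run repeatable.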
The Poisson process is a realistic model for customer arrival times and other similar phenomena because of a special property of the exponential distribution: it is "memory-less". This means the following: Suppose that the probability that a customer arrives in any particular 5-minute interval of time is 0.33. Suppose 5 minutes have gone by without a customer appearing. The probability that a customer appears in the next 5 minutes is still 0.33. (A customer may or may not show up in that interval of time, but whether or not a customer shows up then has nothing to do with whether a customer appeared in the previous interval of time.)
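The memory-less property can be checked numerically. The sketch below assumes exponentially distributed gaps and picks the mean gap so that the chance of an arrival within 5 minutes is 0.33, as in the paragraph above. The function names are illustrative only.

```python
import math


def prob_no_arrival(t, mean_gap):
    """P(no arrival within time t) for exponential gaps with the given mean."""
    return math.exp(-t / mean_gap)


def prob_arrival_after_waiting(waited, t, mean_gap):
    """P(arrival within the next t, given no arrival during the first `waited`)."""
    still_waiting = prob_no_arrival(waited, mean_gap)
    still_waiting_later = prob_no_arrival(waited + t, mean_gap)
    return 1.0 - still_waiting_later / still_waiting


# Choose mean_gap so that P(arrival within 5 minutes) = 0.33.
mean_gap = -5.0 / math.log(1.0 - 0.33)
fresh = 1.0 - prob_no_arrival(5.0, mean_gap)
after_wait = prob_arrival_after_waiting(5.0, 5.0, mean_gap)
print(round(fresh, 4), round(after_wait, 4))
```

Both printed probabilities agree: having already waited 5 minutes does not change the chance of an arrival in the next 5.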
The Poisson distribution is related to the Poisson process. A random variable that takes the values 0, 1, 2, 3, etc. is said to have a Poisson distribution with rate $\lambda$ if the probability that it takes value $k$ is given by the formula $e^{-\lambda}\lambda^k/k!$. (Here $k!$, read $k$ factorial, is the product of the numbers from 1 up to $k$. For example, $5! = 1(2)(3)(4)(5) = 120$. Also $e = 2.71828\ldots$ is the base of the natural logarithms.) The arrival times generated by the Poisson process have the property that the number of arrivals in a given time interval of length $t$ is a Poisson distributed random variable. (To be precise, the probability of $k$ arrivals in an interval of length $t$ is $e^{-\lambda t}(\lambda t)^k/k!$. Here, $\lambda$ is the reciprocal of the number Mean.) We aren't interested here in these precise formulas, but we note that this gives us the ability to compute the exact probability of a given number of arrivals in a given interval of time.
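As an illustration of computing these probabilities, here is a short Python sketch of the formula for k arrivals in an interval of length t. The function name `poisson_pmf` and the sample numbers are assumptions made for this example.

```python
import math


def poisson_pmf(k, rate, t=1.0):
    """P(exactly k arrivals in an interval of length t) at the given rate."""
    expected = rate * t  # this is lambda * t in the formula
    return math.exp(-expected) * expected ** k / math.factorial(k)


# Assumed numbers: Mean = 5 minutes between arrivals, so rate = 1/5 per
# minute, and we look at a 10-minute window.
rate = 1.0 / 5.0
for k in range(5):
    print(k, round(poisson_pmf(k, rate, t=10.0), 4))
```

Summing the probabilities over all k gives 1, as it should for a probability distribution.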
The Poisson distribution is a slightly different animal than the other
distributions we've met. It isn't a continuous distribution since
only the outcomes 0, 1, 2, 3, etc. are considered. But it isn't a
finite distribution: there are infinitely many outcomes. We may describe
this distribution as being "discrete".
Suppose we have some data, a list of numbers. We might be interested in the average of these numbers. For example, we could have data on the prices of homes sold in New Albany in 1997, and we might be interested in finding the average price of a home. In statistics, the mean of a list of numbers is simply their average. That is, if you wish to take the mean of a list of n numbers, add the numbers and divide by n.
But the mean might not be the best statistic for our purposes. What if lots of inexpensive homes were sold last year, along with one multi-million dollar mansion? The mean might be rather high even though most homes were inexpensive. A better measure here might be the median: the price level such that half of all the homes sold were lower in price, and half the homes sold were higher in price.
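A small Python sketch shows how far apart the two statistics can be. The prices below are made up for illustration: nine modest homes and one mansion.

```python
from statistics import mean, median

# Hypothetical sale prices: nine modest homes and one mansion.
prices = [80_000, 85_000, 90_000, 92_000, 95_000,
          98_000, 100_000, 105_000, 110_000, 2_500_000]

print("mean:  ", mean(prices))    # pulled upward by the single mansion
print("median:", median(prices))  # stays near the typical home price
```

With these numbers the mean is 335,500 while the median is 96,500, so the median is much closer to what a typical home sold for.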
Another concern might be how much of a spread there is in the price of homes. Imagine a housing market where there are lots of homes close in price to $100,000, but not too many homes much less in price or much higher in price. Say most of the neighborhoods are standard tract homes of about the same age and size. Contrast that with another market where there are older neighborhoods with smaller homes and "fixer-uppers", and also with newer neighborhoods with big houses. That market might have a much greater range of prices listed.
To measure the spread in the prices of the homes, we could proceed as follows: find the mean (average) price, then take the difference between each price and the mean. That is, if the average price were $100,000 and a certain house were $115,000 in price, the difference would be $15,000. We could perhaps compute the average difference. But consider a house that was $85,000. The difference there would be -$15,000. If we average all of the differences, they cancel out to zero regardless of how much spread there is in the prices. We could remedy this by taking absolute values before averaging the differences. But it turns out to be more convenient mathematically, and more natural, to take the average of the squares of the differences instead.
We offer the following definition: Let $\mu$ be the mean of a list of $n$ numbers $x$, so $\mu = (\sum x)/n$. (Here $\sum$, the capital Greek letter sigma, stands for sum.) Then the variance of the list of numbers is $\sigma^2 = (\sum (x-\mu)^2)/n$. The standard deviation is given by $\sigma = \sqrt{(\sum (x-\mu)^2)/n}$. (Here, $n$ is how many numbers are in your list.) In statistics, these formulas are sometimes given with $n-1$ instead of $n$; they are then referred to as the "sample" variance and standard deviation, in symbols $s^2$ and $s$.
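The definition translates almost directly into code. The following Python sketch is one possible rendering. The function names and the `sample` flag (which switches the divisor from n to n - 1) are choices made for this example.

```python
import math


def mean(xs):
    """Add the numbers and divide by how many there are."""
    return sum(xs) / len(xs)


def variance(xs, sample=False):
    """Average squared difference from the mean.

    With sample=True the sum is divided by n - 1 instead of n.
    """
    m = mean(xs)
    total = sum((x - m) ** 2 for x in xs)
    return total / (len(xs) - 1 if sample else len(xs))


def standard_deviation(xs, sample=False):
    """Square root of the variance."""
    return math.sqrt(variance(xs, sample))


# Hypothetical prices spread evenly around 100,000.
prices = [85_000, 100_000, 115_000]
print(variance(prices), standard_deviation(prices))
print(variance(prices, sample=True), standard_deviation(prices, sample=True))
```

Dividing by n - 1 always gives a slightly larger result than dividing by n, and the gap shrinks as the list gets longer.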
Example: We can compute the mean, variance and standard
deviation of a list of 5 numbers. The table uses the "sample"
formulas, dividing by n - 1 = 4.
| numbers | differences | differences squared |
| 6 | 6 - 6.4 = -0.4 | 0.16 |
| 8 | 8 - 6.4 = 1.6 | 2.56 |
| 3 | 3 - 6.4 = -3.4 | 11.56 |
| 11 | 11 - 6.4 = 4.6 | 21.16 |
| 4 | 4 - 6.4 = -2.4 | 5.76 |
| total 32 | | total 41.2 |
| mean 32 / 5 = 6.4 | | variance 41.2 / 4 = 10.3 |
| | | standard deviation 3.209 |
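The table can be checked against Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n - 1 while `pvariance` divides by n. This is offered only as a cross-check of the arithmetic above.

```python
from statistics import mean, pvariance, stdev, variance

numbers = [6, 8, 3, 11, 4]

print("mean:", mean(numbers))
print("sample variance:", variance(numbers))        # divides by n - 1 = 4
print("sample std dev:", round(stdev(numbers), 3))
print("population variance:", pvariance(numbers))   # divides by n = 5
```

The sample values match the table (10.3 and about 3.209). Dividing by n instead gives the smaller variance 41.2 / 5 = 8.24.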
Example: The following is a list of 100 prices for homes
sold in a certain community.
| 99000 | 90000 | 110500 | 106500 | 113000 |
| 112500 | 123500 | 130000 | 82000 | 106500 |
| 88500 | 93500 | 105500 | 109000 | 91000 |
| 70000 | 116500 | 122500 | 103500 | 83000 |
| 121500 | 103000 | 104000 | 106500 | 117500 |
| 89000 | 103000 | 99000 | 121500 | 112000 |
| 100500 | 104500 | 103500 | 87000 | 106000 |
| 106500 | 90000 | 97500 | 69500 | 92000 |
| 122000 | 99500 | 110500 | 109500 | 126500 |
| 81500 | 108000 | 106000 | 88000 | 94000 |
| 99500 | 83000 | 89500 | 100500 | 121000 |
| 99500 | 101500 | 107500 | 102000 | 95500 |
| 64500 | 116500 | 83000 | 79000 | 83500 |
| 122000 | 130000 | 110500 | 76500 | 117500 |
| 92500 | 123000 | 115500 | 92000 | 77500 |
| 114500 | 107000 | 91000 | 103500 | 85500 |
| 100000 | 112000 | 90000 | 69500 | 81000 |
| 104500 | 109500 | 117000 | 100500 | 98500 |
| 91000 | 113000 | 84000 | 98000 | 117500 |
| 82500 | 85500 | 111000 | 98500 | 101500 |
Example: Here is a list of house prices from a different
community. The average sales price of 105,980 is similar to that of the
previous community, but this time the standard deviation is 37104.1. See
the histogram for this community below; notice how much more spread out the
prices are (note the prices on the horizontal axis).
| 50000 | 109000 | 46500 | 82500 | 105500 |
| 107000 | 68500 | 96500 | 156000 | 146000 |
| 106500 | 76500 | 118000 | 91000 | 44000 |
| 107500 | 53000 | 140000 | 94000 | 120000 |
| 101500 | 17500 | 174500 | 182500 | 133000 |
| 115000 | 105500 | 113500 | 131000 | 145500 |
| 49500 | 142500 | 79500 | 105000 | 108000 |
| 62000 | 132000 | 124000 | 78000 | 119500 |
| 67000 | 151500 | 73500 | 89000 | 94500 |
| 99000 | 55500 | 59500 | 108500 | 118000 |
| 93500 | 173000 | 95000 | 142000 | 147000 |
| 138500 | 108000 | 157500 | 93500 | 56000 |
| 78500 | 81500 | 194500 | 140000 | 108000 |
| 129000 | 83500 | 128000 | 70000 | 101500 |
| 110000 | 195000 | 117000 | 90500 | 159500 |
| 121000 | 95500 | 45000 | 79000 | 177000 |
| 102500 | 101500 | 60500 | 80000 | 165500 |
| 166500 | 73500 | 104500 | 94500 | 103500 |
| 126000 | 150500 | 112000 | 89000 | 94000 |
| 98000 | 145500 | 92500 | 48500 | 28000 |
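To tabulate a histogram like the one described, the prices can be grouped into bins of equal width and counted. The Python sketch below is one way to do it. The bin width of 25,000 and the function names are assumptions, and only the first two rows of the table above are used, to keep the example short.

```python
from collections import Counter


def bin_counts(prices, bin_width):
    """Count how many prices fall in each bin [k*w, (k+1)*w)."""
    counts = Counter((p // bin_width) * bin_width for p in prices)
    return dict(sorted(counts.items()))


def text_histogram(prices, bin_width):
    """Render one line of asterisks per non-empty bin."""
    lines = []
    for low, count in bin_counts(prices, bin_width).items():
        lines.append(f"{low:>7}-{low + bin_width - 1:>7} {'*' * count}")
    return "\n".join(lines)


# First two rows of the table above, used only as a small sample.
sample = [50000, 109000, 46500, 82500, 105500,
          107000, 68500, 96500, 156000, 146000]
print(text_histogram(sample, bin_width=25000))
```

Running the same functions on the full list of 100 prices for each community would show the difference in spread directly: the second community fills many more bins.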