Binomial Distribution is used to work out the probability of something happening.
Suppose you are an employer. You need to recruit 3 people. You have a pool of 6 candidates. The only requirement for the job is that they know Excel. Except, you forgot to specify that in your job ad. Also, you can't be bothered to interview anyone. So you decide to simply pick names of a hat. (You haven't even looked at the CVs.) Now, here's the question. Suppose 70% of working people are familiar with Excel and the other 30% know nothing about it. What are the chances that the first three names you pick out of your hat are familiar with Excel and therefore good candidates? Well, this where Binomial Distribution can help you out.
Binomial refers to two options. For example, tossing a coin gives you only two options, heads or tails. If you can ask a question and get a YES/NO reply, it's binomial. For each of our ten job candidates, we can ask, Does this person have Excel skills? IF Yes, it's a success; if no, it's a failure.
But you need the probability of each outcome. With the toss of a fair coin, each outcome, heads or tails, is as likely as the other. The odds are 50/50. Or 1/1. Or 0.5/0.5. Or ½ heads, ½ tails. There are quite a lot of ways to say the same thing.
For our scenario, we've said the chances of finding someone with Excel skills in the general population is 70% and someone without Excel skills as 30%. Or 0.7 and 0.3. So, we have enough to start our spreadsheet.
In Cell A1, then, enter Number of Trials. In B1, enter: 6.
In cell A2, enter Sample Size. In cell B2, enter 3.
In cell A3 enter Probability. In cell B3 enter 0.7.
Your spreadsheet will then look like this:
Excel already has Binomial formula built in. Let's try it out.
In cell A4 enter Excel Binomial. In cell B4, enter this:
=BINOM.DIST(B2, B1, B3, FALSE )
Press the enter key on your keyboard and you should get a figure of 0.18522. On the Numbers panel of the Home ribbon at the top of Excel, change the cell formatting from General to Percentage. You'll then get a figure of 18.52%. Your spreadsheet will then look like this:
So, if we had 6 candidates and randomly chose 3, there's an 18.5% change that all 3 will have the Excel skills we need. Not a great success. Better start the interviews!
But what is Excel doing with its Binom.Dist formula? How does it work it out?
Well, what it's doing is calculating this rather intimidating statistical formula:
Although that looks quite scary, it's quite easy, once you break it down. (Honest!) The formula can be broken down into three parts:
Binomial Distribution Formula Part One
You'll see this written as n choose k. What it means is that out of all your trials, how many total outcomes are there? For example, suppose we started with a choice of 4 items. These ones:
Now, suppose you could only pick three of these items. How many combinations are there? Well, this is one combination:
Here's a second combination:
And one final combination:
So, if you start with 4 items and can only choose 3 then there are four different combinations you can have.
If you had 5 items to choose from and could only choose 3 then there would be 10 different combinations. For 6 items, the total combinations is 20.
So, the answer to 6 choose 3 is 20.
Excel has a formula for this, so you don't have to work it out for yourself. In cell A5 of your spreadsheet, enter Total Combinations. In cell B5, enter this formula:
B1 is where we have our number of choices. B2 is how many we want to pick from those choices. The answer, when you use the COMBIN formula, is the total number of combinations.
Press enter to get a figure of 20. Your spreadsheet will look like this:
But this just works out the part in round brackets of part one, the n choose k:
P(n choose k)
The P is the probability, as a percentage. This is the 18.25% figure we got from Excel's own Binomial function. We'll get that by working out parts two and three.
Binomial Distribution Formula Part Two - Factorials
Part two is this:
The exclamation marks tell you that you're dealing with factorials. With factorials, you take a number and keep going down to 1, multiplying all the while. For example, 4 factorial is this:
4 x 3 x 2 x 1
The answer is 24.
The n for us is how many candidates we have. We have 6 candidates so we need to calculate 6 factorial.
n! = 6 x 5 x 4 x 3 x 2 x 1
And the answer to that is 720.
Fortunately, Excel has a factorial function built in. In cell D1 on your spreadsheet, enter n!. In cell D2, enter this:
Press enter to get the answer. Your spreadsheet will look like this:
And that's the top half of the equation. The bottom half is this:
The k is how many choices we have. For us this is the 3 candidates we'll employ out of the 6.
So, k! is 3 x 2 x 1, which is 6.
In cell E1 of your spreadsheet, enter k!. In cell E2, enter this:
Press enter to get this spreadsheet:
Now we need this bit:
(n - k)!
This says, get the factorial of n - k.
In cell F1, enter (n - k)!. In cell F2, enter this:
=FACT(B1 - B2)
Press enter on your keyboard and you'll get an answer of 6. Here's your spreadsheet now:
Now we can get an answer for the whole of our part two formula, which was this:
We only need to multiply E2 and F2 and then divide it into D2. So, in cell G1, enter Combinations. In cell G2, enter this:
= D2 / (E2 * F2)
Press enter on your keyboard. Your spreadsheet will now be like this:
The answer should be 20. But, hang on. Haven't we seen that figure somewhere before? Indeed, we have. It's the same answer that Excel gave us when we entered this in cell B5:
So now you know how Excel works out its COMBIN formula!
Try changing the values in cells B1 and B2. Try 5 and 3. You should get an answer of 10 in cells G2 and B5. Put them back on 6 and 3 when you're done playing around.
Binomial Distribution Formula Part Two - Probability
Part three of our formula is this:
The Pk part means multiply your probability by k number of times. Our probability was 0.7. Our k number is our sample size, which was 3. So Pk for us is:
0.7 x 0.7 x 0.7
Let's enter that into Excel.
In cell D4, enter Pk. In cell D5, enter either this:
= B3 ^ 3
They do the same thing: raise the first number to the power of the second number. (We want 0.7 to the power of 3.)
You should get an answer of 0.343. Here's what you should have on your
Now we need the (1 - p) part. P is our probability, remember. So this one is fairly easy.
In cell E4, enter (1 - p). In cell E5, enter:
= 1 - B3
The last bit, (n - k), is fairly easy, as well. This is going to be used as an exponent for the number in E5. In cell F4 enter (n - k). In cell F5 enter:
= B1 - B2
Your spreadsheet should now look like this:
Now let's get the answer to part three of the formula.
In cell G4, enter Probability. Enter the following in cell G5:
=D5 * (E5 ^ F5)
Press enter on your keyboard to get a figure of 0.009261:
Now we're ready to calculate our own binomial formula.
In cell A6, enter Our Binomial. We only need to multiply our Combinations result with our probability result. So, in cell B6, enter this:
= G2 * G5
Press enter and you may get a figure of 0.18522. To convert this to a percentage, switch the dropdown on the Number panel of the Home ribbon from General to Percentage. Here's your final spreadsheet:
Our calculations agree with Excel - there's only an 18.52% chance that we'll get 3 Excel users out of a random trial of 6 applicants.
So, we broke down a complicated formula and got a greater understanding of what the Binomial Distribution is. We calculated all this by hand:
Hopefully, doing this will improve your statistical ability!
In the next lesson below, you'll learn about Web Integeration."