Minitab Binomial Distribution


Do Kieu

Aug 5, 2024, 9:41:26 AM
to answivmite
It's the kind of question that students are frequently asked to calculate by hand in introductory statistics classes, and going through that exercise is a good way to become familiar with the mathematical formulas that underlie probability (and hence, all of statistics).

The good news is that determining the real odds of something happening doesn't have to be hard work! If you don't want to calculate the probabilities by hand, just let a statistical software package such as Minitab do it for you.


Suppose I want to know the probability of getting a certain number of heads in 10 tosses of a fair coin. I need to calculate the odds for a binomial distribution with 10 trials (n=10) and probability of success p=0.5.


The following output appears in the session window. It tells us that if we toss a fair coin with a 50% probability of landing on heads, the odds of getting exactly 8 heads out of 10 tosses are just 4.4%.
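For readers who want to check that figure without Minitab, here is a minimal sketch of the binomial probability formula in Python. The function name `binomial_pmf` is only an illustrative choice, and this is not how Minitab computes the value internally.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Exactly 8 heads in 10 tosses of a fair coin.
prob_8_heads = binomial_pmf(8, 10, 0.5)
print(f"P(X = 8) = {prob_8_heads:.4f}")  # 45/1024, about 0.0439
```

The result, 45/1024, is a little over 4%, which agrees with the session window output described above.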


What if we wanted to know the cumulative probability of getting 8 heads in 10 tosses? Cumulative probability is the odds of one, two, or more events taking place. The word to remember is "or," because that's what cumulative probability tells you. What are the chances that when you toss this coin 10 times, you'll get 8 or fewer heads? That's cumulative probability.
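The "8 or fewer heads" question can be sketched as a sum of the individual binomial probabilities for 0, 1, ..., 8 heads. This is a plain-Python illustration with a hypothetical helper name (`binomial_cdf`), not Minitab's own routine.

```python
from math import comb

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# 8 or fewer heads in 10 tosses of a fair coin.
p_at_most_8 = binomial_cdf(8, 10, 0.5)
print(f"P(X <= 8) = {p_at_most_8:.4f}")  # 1013/1024, about 0.9893
```

Only two outcomes are excluded (9 heads and 10 heads), which is why the cumulative probability is so close to 1.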


As you can see, using Minitab to check and graph the probabilities of different events is not difficult. I hope knowing this increases the odds that the next time you wonder about the likelihood of an event, you'll be able to find it quickly and accurately!


If a variable can take on any value between two specified values, it is a continuous variable and the values follow a continuous distribution. However, if the variable can only take on a finite number of values, the values follow a discrete distribution.


You may just want to define some discrete distributions, as in the table above. However, in other cases you may want to use the distribution for a follow-up analysis. Just like continuous distributions, each discrete distribution has special properties that you should use for specific cases.


For example, PPG Industries reported the colors of new cars that were purchased in 2012. We can illustrate this using a Probability Distribution Plot (Graph > Probability Distribution Plot). You can use the data in this worksheet if you'd like to try it.


In the Probability Distribution Plots dialog, choose the generic Discrete distribution to supply your own categories and probabilities. You need to enter a column of categories (Car Color) and a column of probabilities (Probability).


Ordinal categories have a natural order. Values are ranked, but differences do not necessarily represent equal intervals. For example, a rating scale could have the following values: Very Poor, Poor, Neutral, Good, and Very Good. The ordering provides additional information which allows you to do a little more with the data.


You can use several distributions with binary data. The choice depends on your goal. In these examples, be sure to notice the important differences between the probabilities for each discrete value (each bar in the plots) and the cumulative probabilities (the shaded areas). These examples don't use data in the worksheet.


The plot displays the probability for each number of defects in a sample of 25. The probability of exactly zero defects in a sample of 25 is about 0.6, 1 defect is 0.3, etc. Because we asked for a shaded area for 2 or more, Minitab shades that area red and indicates that the cumulative probability of 2 or more defects is 0.08865.
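The passage does not state the defect rate behind this plot. As a rough check, the sketch below assumes a defect probability of 0.02 per unit, which is an assumption chosen only because it reproduces the quoted values (about 0.6, about 0.3, and 0.08865). The helper name is hypothetical.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k defects in a sample of n units."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

SAMPLE_SIZE = 25
ASSUMED_DEFECT_RATE = 0.02  # assumption: not given in the text

p_zero = binomial_pmf(0, SAMPLE_SIZE, ASSUMED_DEFECT_RATE)
p_one = binomial_pmf(1, SAMPLE_SIZE, ASSUMED_DEFECT_RATE)

# "2 or more" is the complement of "0 or 1".
p_two_or_more = 1 - (p_zero + p_one)

print(f"P(X = 0)  = {p_zero:.4f}")
print(f"P(X = 1)  = {p_one:.4f}")
print(f"P(X >= 2) = {p_two_or_more:.5f}")
```

Working with the complement avoids summing the 24 bars in the shaded region one by one.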


Each bar represents the probability of seeing the first defect on a specific trial. You can hover over a bar to see the probability for a specific trial. For example, the probability of seeing the first defect on exactly the tenth trial is about 0.016.
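Each bar in such a plot follows the geometric formula: the first k - 1 trials are all non-defective and trial k is defective. The sketch below again assumes a defect probability of 0.02, which the text does not state. With that assumption the tenth-trial value comes out near the figure quoted above.

```python
def geometric_pmf(k: int, p: float) -> float:
    """Probability that the first event occurs on exactly trial k (k >= 1)."""
    if k < 1:
        raise ValueError("trial number must be at least 1")
    return (1 - p) ** (k - 1) * p

ASSUMED_DEFECT_RATE = 0.02  # assumption: not given in the text

p_first_on_tenth = geometric_pmf(10, ASSUMED_DEFECT_RATE)
print(f"P(first defect on trial 10) = {p_first_on_tenth:.4f}")
```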


Use the negative binomial distribution when you are interested in the number of trials necessary to produce the event a specified number of times. For example, this distribution can model the number of windshields produced until you reach 10 defective units.


Assume that the process is stable and has a 0.05 probability of producing a defective windshield. You are interested in the cumulative probability of producing 10 defective windshields in a batch size of 100 windshields.


It may be hard to see in the small graph, but there is a bar for each batch size. Each bar represents the probability that the 10th defect occurs on exactly that windshield, so that a batch of exactly that size is needed to reach 10 defects. For example, the probability that the 10th defect occurs on exactly the 75th windshield is slightly less than 0.0004.
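The windshield example can be sketched with the negative binomial formula, using the 0.05 defect probability and the target of 10 defective units given in the text. The function names are illustrative, and the formula counts total trials, which is one of several common parameterizations.

```python
from math import comb

def neg_binomial_pmf(n: int, r: int, p: float) -> float:
    """Probability that the r-th event occurs on exactly trial n."""
    if n < r:
        return 0.0
    # The first n - 1 trials contain r - 1 events; trial n is the r-th event.
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

def neg_binomial_cdf(n: int, r: int, p: float) -> float:
    """Probability that the r-th event occurs on or before trial n."""
    return sum(neg_binomial_pmf(i, r, p) for i in range(r, n + 1))

P_DEFECT = 0.05      # from the text
TARGET_DEFECTS = 10  # from the text

p_at_75 = neg_binomial_pmf(75, TARGET_DEFECTS, P_DEFECT)
p_within_100 = neg_binomial_cdf(100, TARGET_DEFECTS, P_DEFECT)

print(f"P(10th defect on windshield 75)  = {p_at_75:.6f}")
print(f"P(10th defect within 100 units)  = {p_within_100:.6f}")
```

The first value is the height of a single bar. The second is the cumulative probability for a batch size of 100.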


For the distributions of binary data, you primarily need to determine whether your data satisfy the assumptions for that distribution. If you satisfy the assumptions, you can use the distribution to model the process.


Generally, determining whether your data satisfy these assumptions relies on a close understanding of the process, data collection procedure, and your goals for the data. If you satisfy all of these assumptions, you can safely use the binomial distribution.


Besides the binomial distribution, there are three other distributions in Minitab statistical software that use binary data. They each have somewhat different assumptions than those listed for the binomial distribution.


If you want to determine whether your data follow the Poisson distribution, Minitab has a test specifically for this distribution. To recap, the Poisson distribution describes a count of a characteristic (e.g., defects) over a constant observation space, such as the number of scratches on a windshield.


Suppose we want to determine whether the distribution of car colors in our state matches the global distribution. To do this, we have observers around the state record the colors of cars that were manufactured in 2012 and included in a random sample. We tally up the colors and enter the global proportions in the worksheet like this:


Minitab checks to see if the observed counts differ from the global distribution. A low p-value suggests that your data do not follow that distribution. In this case, the p-value is 0.012, which suggests that the distribution of car colors in our state does not match the global distribution. You can compare the Observed and Expected columns in the table to see where the largest differences occur, or look at the default graphs below.
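The comparison of Observed and Expected counts can be sketched as a chi-square goodness-of-fit statistic. The counts and proportions in this example are hypothetical placeholders, not the car-color data, and the sketch stops at the statistic and degrees of freedom rather than reproducing Minitab's p-value.

```python
def chi_square_gof(observed, proportions):
    """Return (statistic, degrees of freedom, expected counts) for a
    goodness-of-fit comparison against hypothesized proportions."""
    if len(observed) != len(proportions):
        raise ValueError("observed and proportions must have the same length")
    total = sum(observed)
    expected = [total * q for q in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1, expected

# Hypothetical data for three categories (NOT the car-color sample).
observed_counts = [30, 50, 20]
hypothesized_proportions = [0.25, 0.50, 0.25]

chi2, dof, expected_counts = chi_square_gof(observed_counts, hypothesized_proportions)
print(f"expected = {expected_counts}, chi-square = {chi2:.2f}, df = {dof}")
```

Categories with the largest (observed - expected)² / expected terms contribute most to the statistic, which is what comparing the two columns by eye approximates.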


Lean Six Sigma PowerPoint Files are available for different purposes. Our standard license restricts editing while our PowerPoint License allows editing to the content, and our White Label license allows removing our copyright marks and branding.


The NP chart is a control chart monitoring the count of defectives. It plots the number of defectives in one subgroup as a data point. The subgroup size of the NP-chart is constant. The underlying distribution of this control chart is binomial distribution.


Model summary: Four data points fall beyond the upper control limit. We conclude that the NP chart is out of control. Further investigation is needed to determine the special causes that triggered the unnatural pattern of the process.
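The control limits on an NP chart follow from the binomial distribution: the center line is n times the average proportion defective, and the limits sit three binomial standard deviations on either side. The sketch below uses the standard three-sigma formulas with hypothetical subgroup counts, not the data behind the chart described above.

```python
from math import sqrt

def np_chart_limits(defectives, subgroup_size):
    """Return (LCL, center line, UCL) for an NP chart with constant subgroup size."""
    p_bar = sum(defectives) / (len(defectives) * subgroup_size)
    center = subgroup_size * p_bar
    sigma = sqrt(subgroup_size * p_bar * (1 - p_bar))
    lcl = max(0.0, center - 3 * sigma)  # a count cannot be negative
    ucl = center + 3 * sigma
    return lcl, center, ucl

# Hypothetical defective counts from 10 subgroups of 50 units each.
counts = [2, 3, 1, 4, 2, 3, 12, 2, 1, 3]
lcl, center, ucl = np_chart_limits(counts, 50)

# Subgroups (0-based index) that fall outside the control limits.
out_of_control = [i for i, c in enumerate(counts) if c > ucl or c < lcl]
print(f"LCL = {lcl:.2f}, CL = {center:.2f}, UCL = {ucl:.2f}")
print("Out-of-control subgroups:", out_of_control)
```

This checks only the "point beyond the limits" rule. Minitab can apply further tests for special causes that this sketch omits.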


Lean Sigma Corporation is a trusted leader in Lean Six Sigma training and certification, boasting a rich history of providing high-quality educational resources. With a mission to honor and maintain the traditional Lean Six Sigma curriculum and certification standards, Lean Sigma Corporation has empowered thousands of professionals and organizations worldwide with over 5,300 certifications, solidifying its position and reputation as a go-to source for excellence through Lean Six Sigma methodologies.


Click on File in the main Minitab window to open Minitab files. You can open Minitab projects (file extension .mpj), Minitab worksheets (file extensions .mtw, .mtb, .mtp and others), and Minitab graphs (file extension .mgf). You can also open spreadsheets and text files. Usually you'll be interested in projects and worksheets.


Projects also contain records of what happened in the session window, graphs, a list of all variables that are currently stored, possibly a report in MS Word format, etc. You can open only one project at a time.


To save a worksheet, click on File > Save Current Worksheet or File > Save Current Worksheet As ... and follow the prompts. Have a USB drive or some other media ready to save your worksheet.

To save a project, click on File > Save Project and follow the prompts. When you save the project, you save all the information about your work: the contents of all the windows, including the columns of data in each Data window, stored constants and matrices, the complete text in the Session window and History folder, and each Graph window. This allows you to interrupt your work and pick up later where you left off.


Activate the data window by clicking on it.

Check to ensure that the arrow in the box in the upper left corner of the data window page is pointing downward (click on it to switch it).

Place the name of your variable in the top cell of the column (directly under C1, or whatever column you put your data in). Move the cursor to the first cell in your column and enter your first data value. Press Enter. If the arrow in the upper left corner is pointing down, the cursor will automatically move to the next cell in the column. You can also use the up and down arrows or the mouse to move to other cells. Continue until you have all the values of that variable entered into that column. Don't leave any empty cells.

Move to the next column and repeat the steps with your next variable.


Minitab considers a data column as numerical as long as all entries in its cells are numbers. If one or more cells are non-numerical (text, symbols), the entire data column is considered categorical, and the column label is changed from e.g. C2 to C2-T. Integer entries with spaces are interpreted as dates, and the column label is changed from C2 to C2-D. It can be tedious to undo such a change in data type, so be careful when entering data.


To copy data (cells, groups of cells or columns) within a Minitab data window or between data windows, select the cells with the mouse. You can also select a group of columns (highlight the column names instead of the cells).

Go to Edit > Copy Cells.

Move the mouse to the location where you want to enter the data and go to Edit > Paste Cells.

You can also copy and paste data to and from other applications (spreadsheet columns, text files) in this manner.

Caution:
