Three sigma limits set a range for a process parameter: when the process is in statistical control, only about 0.27% of points are expected to fall outside them. Three sigma control limits are used to check data from a process and to determine whether it is in statistical control by checking whether the data points fall within three standard deviations of the mean. The upper control limit (UCL) is set three sigma above the mean, and the lower control limit (LCL) is set three sigma below the mean.
Standard deviation is a statistical measure of the spread of a set of values about their average. It is the positive square root of the variance and quantifies how far individual values typically fall from the mean.
A bell curve gets its name from its appearance: a bell-shaped curve that rises in the middle. It illustrates the normal probability distribution, and many graphs and analyses rely on it. The horizontal axis is commonly marked off at one, two, and three standard deviations from the mean.
Why are control charts based on three sigma limits? This publication addresses that question. Three sigma limits have been around for almost 100 years. And despite some attempts to alter this approach, three sigma limits appear to be the best way to approach control charts. In this issue:
Does it really matter how the control limits are set? After all, there is some gain simply from plotting the data over time. Yes, it does matter how control limits are set. The problem is that in recent years we seem to have made the control chart a more complex tool than it needs to be. One reason this has happened is that we began to worry about probabilities instead of letting our knowledge of the process guide us.
Some people look at a control chart as a series of sequential hypothesis tests and assign an error rate to the entire control chart based on the number of points. An online article (from statit.com) does just that and recommends widening the three sigma limits as the number of points on the chart increases. In fact, the authors appear to scoff at the reason the three sigma limits were originally set:
And then they say that the reason the three sigma limits worked was that everything was based on 25 subgroups. They then discuss the Type 1 error: the probability of getting a point beyond the control limits is 0.27% even when the process is in statistical control. So, under the sequential hypothesis test approach, the probability of getting at least one point beyond the control limits for 25 points on a control chart is:

1 - (0.9973)^25 = 0.065
This means that there is a 6.5% chance of at least one point appearing out of control whenever you have a control chart with 25 subgroups. And as you add more points, that probability increases. For 100 points, the probability is given by:

1 - (0.9973)^100 = 0.237
So, there is a 23.7% chance of at least one point being beyond the control limits on a control chart with 100 points. The authors recommend increasing the number of sigma limits to bring the error rate close to 0.05. For 100 points, they recommend 3.5 sigma limits, which drops the error rate to less than 0.05.
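The arithmetic behind these percentages can be sketched in a few lines of Python (my sketch, not the article's; it assumes independent points and a per-point false-alarm rate of 0.27% at three sigma limits):

```python
# Probability of at least one false alarm on an in-control chart,
# assuming independent points with per-point false-alarm rate p:
# P(at least one beyond the limits) = 1 - (1 - p)^n
def false_alarm_prob(n_points, per_point=0.0027):
    return 1 - (1 - per_point) ** n_points

print(round(false_alarm_prob(25), 3))   # 0.065, i.e. 6.5% for 25 subgroups
print(round(false_alarm_prob(100), 3))  # 0.237, i.e. 23.7% for 100 subgroups
```

This reproduces the 6.5% and 23.7% figures quoted above.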
If you view control charts from the probability approach, what this article states is true. I did a small experiment to confirm this. I wrote a little VBA code to generate random numbers from a normal distribution with a mean of 100 and standard deviation of 10. I then generated 100 control charts containing 25 subgroups and determined the number of out of control points when using three sigma limits. I repeated the process for 100 control charts containing 100 subgroups and, again, determined the number of out of control points when using three sigma limits.
I then changed the control limits to 3.5 sigma limits and generated 100 control charts with 100 subgroups each. Of those 100 control charts, 6 had at least one point beyond one of the control limits. Expanding the limits from 3 to 3.5 sigma for a control chart with 100 subgroups dropped the percentage of control charts with false signals from 30% to 6%. That is not surprising, since the control limits are wider at 3.5 sigma. The table below summarizes the results of the simulation.
Why should you care what type of variation you have present? The answer is that the type of action you take to improve a process depends on the type of variation present. If your process has variation that is consistent and predictable (controlled), the only way to improve this process is to fundamentally change the process. The key word is fundamental. But, if the process has unpredictable variation, the special cause responsible for the unpredictability should be identified. If the special cause hurts the process, the reason for the special cause needs to be found and eliminated. If a special cause helps the process, the reason for the special cause should be found and incorporated into the process.
This concept of common and special causes is the foundation of the control charts Shewhart developed. A process that has consistent and predictable variation is said to be in statistical control. A process that has unpredictable variation is said to be out of statistical control.
So, you need a method of calculating an average and a standard deviation of what you are plotting. That is the statistical part. But the empirical evidence appears to have been the key. And from Dr. Donald Wheeler in his book Advanced Topics in Statistical Process Control (www.spcpress.com):
The probability approach has led people to put restrictions on control charts: that the data must be normally distributed, or that control charts work because of the central limit theorem (our May 2017 publication addresses this fallacy). This has hurt the use of control charts over time.
This publication looked at three sigma limits and the justification behind them. Some approach control charts through probabilities. While Shewhart considered probabilities in his three sigma approach, there were other, more important considerations. The major one was that three sigma limits work in the real world. They give a good balance between looking for special causes and not looking for special causes. The concept of three sigma limits has been around for almost 100 years. Despite attempts to change the approach, the three sigma limits continue to be effective. There is no reason to use anything else on a control chart. Dr. Shewhart, Dr. Deming, and Dr. Wheeler make pretty convincing arguments why that is so.
Hi Bill,
Imagine that you worked at a process with an online monitor that returned a measurement every second. Suppose that the common cause scatter is close to normally distributed, and there is automated SPC software set up to handle the measurements. Are you sure that you'd be happy with a false alarm being triggered every 6 minutes or so?
Hi Dale,
I probably wouldn't chart each data point. I would probably take a time frame (a minute, five minutes, whatever) and track the average of that time frame over time, as well as its standard deviation, both as individuals charts. We used to do that with PVC reactors, where we tracked reaction temperatures for a batch. That gave us some good insights into differences between batches.
A longer-interval Xbar-S chart would be a more obvious alternative if we don't need a quick response. But what if our automated control system with a deadband really needs to respond quickly because special cause upsets can grow suddenly? The traditional 3 sigma limits are ultimately a (deadband) heuristic that works well when the sampling rate is low (a few samples per day). I think a decent case can be made that SPC limits need to be wider to control the overall false positive rate when applying SPC principles to the much higher-frequency sampling often seen in the computer age.
I did a simulation of a stable process generating 1000 normally distributed random data points. From the first 25 data points, I calculated 3 sigma limits and 2 sigma "warning" limits. Then I used two detection rules to detect a special cause of variation: one data point outside 3 sigma, and two out of three subsequent data points outside 2 sigma. Knowing that my computer generated normally distributed data points, any alarm is a false alarm. I counted these false alarms for my 1000 data points and then repeated the entire simulation a number of times (19) with the same values for the mean and sigma.

Then I plotted the number of false alarms detected (on the y-axis) as a function of where my 3 sigma limits were found for each run (on the x-axis). Above 3 sigma, the number of false alarms was quite low, and it decreased as the limit increased. Below 3 sigma, the number of false alarms increased rapidly as the limit found decreased. At 3 sigma, there was a quite sharp "knee" in the curve that can be drawn through the data points (x = control limit value found from the first 25 data points, y = number of false alarms for all 1000 data points in one run). This simulation was quite convincing to me.

The simulation also reminded me that using more detection rules at the same time (of course) increases the number of false alarms. But regardless of which rules are used and how many detection rules I apply at the same time, the "knee" of this curve will still be at 3 sigma, because all the detection rules are constructed in a similar way with respect to the sigma value found in phase 1 of constructing the control chart.

It would be nice to have some advice on which detection rules we should use. Presumably we should not use them all at the same time? I guess that if a "trend" caused by wear-out is a typical failure mode you expect in your process, the "trending" detection rule is a good one to use.
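A simulation along those lines can be sketched as follows (my own sketch, not the commenter's actual code; the baseline size of 25, run length of 1000, and the two detection rules follow the description above, and overlapping rule-2 windows are counted individually, which is a simplification):

```python
import random
import statistics

# Simulate a stable standard-normal process, estimate limits from the
# first 25 points, then count false alarms over all 1000 points using:
#   Rule 1: one point beyond the 3 sigma limits
#   Rule 2: two of three successive points beyond 2 sigma, same side
def count_false_alarms(n_points=1000, n_baseline=25, seed=1):
    rng = random.Random(seed)
    data = [rng.gauss(0, 1) for _ in range(n_points)]
    base = data[:n_baseline]
    mean = statistics.mean(base)
    s = statistics.stdev(base)
    alarms = 0
    for i, x in enumerate(data):
        # Rule 1: single point beyond the 3 sigma limits.
        if abs(x - mean) > 3 * s:
            alarms += 1
            continue
        # Rule 2: two of three successive points beyond the same-side
        # 2 sigma line (checked on each window ending at point i).
        if i >= 2:
            window = data[i - 2:i + 1]
            if sum(1 for v in window if v - mean > 2 * s) >= 2:
                alarms += 1
            elif sum(1 for v in window if mean - v > 2 * s) >= 2:
                alarms += 1
    return alarms

print(count_false_alarms())
```

Since the limits are estimated from only 25 baseline points, the alarm count varies considerably from run to run, which is exactly the effect the commenter's "knee" plot explores.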
Can anyone give some examples from real-life processes: how many rules, and which rules, are used in practice?
Sounds like you did some detailed work on this. The number of rules you use, to me, should be based on how stable your process is. If it is not very stable, I would probably use only points beyond the control limits. The other thing to consider is how important a little drift in the average is. If it is not very important, I would stay with points beyond the control limits. If it is important (and you don't have many points beyond the control limits), then I would add the zone tests. Just my personal opinion.