Re: Understanding subgroup sizes for Six Sigma



Shawn:

The whole "statistical quality control" situation is a mess. Most (if
not all) of the "rules and regulations" are rooted in work done in the
1920s and some additional work done in the 1950s ("Rules for Runs",
etc.). The truth is that a lot of it... more than you might imagine...
is pure statistical B.S. The "sub-group thing" is among the worst.
"Charts for Averages and Ranges" (X-bar, R charts) are often not
relevant and can be misleading. The worst of the collection for
"charting" is "attribute data", followed by "R&R" which is just
awesomely terrible. I am a chemical engineer and a statistician, and
was in charge of an applied statistics group for about 34 years at a
major corporation. I left there in early 1998 (retired with 40 years
service) just as Six-Sigma was coming in the door. I've taught SS "my
way" for some companies, both large and small. While there is some
merit in SS, most of the "statistics" taught under that practice is
hogwash. So your B.S. detector is working just fine. I assure you
that the more you dig into this the more it will stink.

I wish I could suggest some books on statistical quality control that
at least have the "statistical part" done correctly. Sorry, but so far
as I can tell there are none. The reasons for this are simple. All
books and other writings are simply copied from work done in the 1920s,
much of which was driven around the need to keep calculations simple.
The problems lie not in the approximations and simplification of the
methods, but more so in gross failures to mention the important
assumptions. Those assumptions are not trivial, and it's the failure to
recognize and appreciate them that causes the problems. I'm not talking
small stuff here. I'm talking gross errors. So gross as to make much
of the "SPC" work pointless, confusing, and misleading. By paying
attention we can do a lot better. The other problem is rooted in an
organization (American Society for Quality) that is driven by money and
politics. The people who run that are not interested in changes or
improvements. Anyone who criticizes "SPC" as it now stands... rooted
and flawed in the 1920s... or who points out the nonsense in such
things as "C-charts" and "R&R" is immediately a "bad guy". Reason: Too
many people in that old-boy network have made big-bucks "teaching and
consulting" and they are not about to admit that the emperor is as
naked as a jaybird.

Now to your direct question.

First of all, the sub-group thing is a total mess. There's seldom any
good rationale for setting the groupings. The book you have is indeed
vague on this point, but so are all the others. Frankly, I think there
are better things to do with data. But let me try anyhow.

You wrote "but then he says that the subgroup size will dramatically
lower this probability and allow the operator to detect changes more
quickly."
OK, increasing the sub-group size should in principle NOT CHANGE the
frequency of false alarms. If the process is correctly centered on
target, then (using the classic 3-sigma chart, with no "rules for
runs") the frequency of false alarms should be near 1 in 370 charted
data points. But, increasing the sub-group size MAY
reduce the amount of time that passes by before we detect a change in
the underlying process average. The reason I say MAY is this. If we
arbitrarily increase the sub-group size then getting the "more data"
may take "more time". Viz., if we are collecting data at the rate of
one data point per hour and we use a sub-group size of "4"... and if we
increase the sub-group size to "8"... twice as much time will pass by
before we get a sub-group and chart a point and make a decision.

All rules for charts should go back to an "Average Run Length" curve
(ARL curve) first in terms of the average amount of data that will pass
by us before we "detect" a change and also in terms of the amount of
TIME that will pass by us before we "detect" a change.
So simply increasing the sub-group size to reduce the variation by
averaging data may actually make matters worse in terms of our ability
to detect meaningful changes in a process average. This is one of the
many "potentially fatal errors" that are never mentioned in SPC
literature. Moreover, almost nobody in the "old-boy network" mentions
ARL curves. Yet the performance of control charts (performance =
ability to detect changes in a process) hinges on those curves...
getting access to them... and from them learning to make wise choices
of charting methods. ARL curves are not obscure... they are
well-known... but they are hardly even mentioned in SPC courses, books,
etc.
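
To make the point about ARL curves concrete, here is a minimal sketch of how one could be tabulated for the classic 3-sigma chart of averages, both in charted points and in elapsed time. It rests on assumptions that are mine and not from any textbook table: independent, normally distributed data, a known sigma, one data value per hour, and a shift that arrives just as a new sub-group starts. The function names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sigma 1


def signal_probability(shift, n, k=3.0):
    """Chance that one charted sub-group average falls outside k-sigma limits.

    shift: change in the process average, in units of the sigma of
           individual data values (0 means still on target).
    n:     sub-group size.
    """
    d = shift * sqrt(n)  # the same shift, in units of the sigma of the average
    return STD_NORMAL.cdf(-k - d) + 1.0 - STD_NORMAL.cdf(k - d)


def arl_points(shift, n):
    """Average number of charted points until the first signal."""
    return 1.0 / signal_probability(shift, n)


def arl_hours(shift, n, hours_per_value=1.0):
    """Average elapsed time until the first signal.

    Assumes each charted point needs n fresh data values and that the
    shift happens just as a new sub-group starts.
    """
    return arl_points(shift, n) * n * hours_per_value


if __name__ == "__main__":
    print("shift   n   ARL(points)   ARL(hours)")
    for shift in (0.0, 0.5, 1.0, 2.0, 3.0):
        for n in (4, 8):
            print(f"{shift:5.1f} {n:3d} {arl_points(shift, n):13.1f} "
                  f"{arl_hours(shift, n):12.1f}")
```

Under these assumptions, a shift of half a sigma is caught sooner with sub-groups of 8 than with sub-groups of 4 (roughly 142 hours versus 176), but a shift of two sigma takes longer (roughly 8 hours versus 5), because the chart has to wait for the larger sub-group to fill. Real data rates and real shifts would change the numbers.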

So how good is the performance of control charts, in general? Well,
the classic 3-sigma charts have very poor performance. This means their
ability to detect changes is terrible. This has been known for decades,
but it is never mentioned. At some point along the way "rules for runs"
were added to improve performance. Good idea... but with unintended
consequences. Adding "rules for runs" increases the frequency of false
alarms... and can increase that frequency to as much as 1 in 15 charted
data points. That's never mentioned.
Textbooks and "training programs" casually suggest including some
"rules for runs" without mentioning the consequences... and those
consequences can be extreme.
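
The effect of run rules on false alarms can be checked by simulation. The sketch below is only an illustration, under assumptions of my own choosing: independent, normally distributed, on-target data, and a single added rule (a signal when 8 consecutive points fall on the same side of the center line, one commonly quoted "rule for runs"). It does not try to reproduce any particular textbook's rule set, and the function names are hypothetical.

```python
import random


def run_length(rng, use_run_rule, run_len=8, limit=3.0):
    """Count charted points until the first (false) signal, process on target."""
    streak = 0  # length of the current run on one side of the center line
    side = 0    # +1 if the run is above the center line, -1 if below
    count = 0
    while True:
        count += 1
        z = rng.gauss(0.0, 1.0)  # one charted point, in sigma units
        if abs(z) > limit:       # classic 3-sigma rule
            return count
        s = 1 if z > 0 else -1
        streak = streak + 1 if s == side else 1
        side = s
        if use_run_rule and streak >= run_len:  # added run rule
            return count


def mean_run_length(use_run_rule, reps=2000, seed=1):
    """Average run length to a false alarm over many simulated charts."""
    rng = random.Random(seed)
    total = sum(run_length(rng, use_run_rule) for _ in range(reps))
    return total / reps


if __name__ == "__main__":
    print("3-sigma rule only:      ", round(mean_run_length(False)))
    print("3-sigma plus run-of-8:  ", round(mean_run_length(True)))
```

Even this single added rule shortens the average run between false alarms considerably in the simulation. Stacking several rules on top of each other shortens it further, which is the point made above.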

You wrote: My intuition, which is obviously
wrong, tells me that each of the 5, 10, 20, or 50 subgroup averages is
equally likely to plot outside the control limits.

Your intuition is not wrong. If you "follow the rules" for calculating
the control limits, then the frequency of false alarms (an "out of
control point" when in fact the average is still on target) will still
be 1 in 370. Your intuition is correct. However, if the process
average has shifted off-target then in principle you should be able to
detect that "sooner". This last statement may or may not be true in
practice... "depending". There are no "silver bullet" recipes for that
one... some actual data is required in order to estimate the
consequences of changing the sub-group size. It's easy, but not just a
simple equation or some canned software. It takes some investigation
and thinking.
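
For the idealized textbook case, the arithmetic behind both statements can be sketched as follows. The assumptions are mine and are exactly the ones that rarely hold cleanly in practice: independent, normally distributed data, a known sigma, and limits set at 3 standard deviations of the sub-group average. The function name and the half-sigma shift are just for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_outside_limits(n, shift=0.0, mu=0.0, sigma=1.0):
    """Chance that one sub-group average of size n plots outside the limits.

    shift is the move in the true process average, in units of sigma.
    """
    sigma_xbar = sigma / sqrt(n)   # averages vary less than individuals
    lcl = mu - 3.0 * sigma_xbar    # limits tighten as n grows
    ucl = mu + 3.0 * sigma_xbar
    averages = NormalDist(mu + shift * sigma, sigma_xbar)
    return averages.cdf(lcl) + 1.0 - averages.cdf(ucl)


if __name__ == "__main__":
    print("  n   on target   shifted 0.5 sigma")
    for n in (5, 10, 20, 50):
        print(f"{n:3d} {prob_outside_limits(n):11.4f} "
              f"{prob_outside_limits(n, shift=0.5):19.4f}")
```

On target, the chance of a point outside the limits comes out at about 0.0027 for every sub-group size, because the limits shrink in step with the variation of the averages. Only when the average has actually moved does a larger sub-group raise the chance that any one charted point signals.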

As you probably noticed, I march to a different drummer. Most of what
I know has come from hard experience. Much comes from standing in
front of classes and teaching... and realizing that the questions
students are asking (just like yours) don't fit into the "canned 1920s
pattern". Also, realizing that the very words coming out of my mouth
really don't make any sense... that something is inherently wrong about
nearly all of the SPC training materials, etc. The fact is that it's
easier to "do it right" than to struggle with SPC as it is taught even
now, in 2006. Many people, including people I teach and consult with,
have asked the hard questions that signify they also smell a rat. But
the "good old-boy network" is so overwhelming and overbearing that it
is difficult to penetrate it with hard questions.

This is a long note. I can send you some printed materials that
explain some of the deficiencies I've mentioned... and some I have not
mentioned. Some materials are digitized and others have to travel by
snail-mail. Some are immediately obvious, once you see them.
Some are a bit obscure, and require more thought.

To get to me send an e-mail to hedging77 followed by the usual "at"
symbol and then yahoo.com. That's an address I check about once a day.
From there I can give you my "real" e-mail address which is checked
almost continually.

Take care, and be of good cheer. Your intuition is sound. OMU








Shawn wrote:
My company is making plans to transition to Six Sigma processes, so I
purchased Pyzdek's book from Amazon to try and get a grip of it. My
statistics education is a couple decades old at this point, but I
think Pyzdek's wording is quite vague and is causing some of my
confusion.

On the topic of subgroups and control charts, an example is presented
with the typical Six Sigma "out of control" probability of .0027. He
then goes on to suggest that the probability of an "out of control"
indication is 1/.0027 = every 370 units which I completely understand,
but then he says that the subgroup size will dramatically lower this
probability and allow the operator to detect changes more quickly.

What I don't understand, and he doesn't explain, is how can I quantify
how "dramatically" it will change? If I'm using standard SS control
limits of .0027, how/why would changing my subgroup size from say, 5,
to 10, 20, or even 50 make a difference in the probability that one of
the averages is "out of control"? My intuition, which is obviously
wrong, tells me that each of the 5, 10, 20, or 50 subgroup averages is
equally likely to plot outside the control limits.

Any help, equations, or pointers to material I could review would be
greatly appreciated. Suggestions for a better beginner book would
also be appreciated. (More examples!)



