Re: Welch formula doesn't converge



On 15 Feb, 21:41, Jack Tomsky <jtom...@xxxxxxxxxxxxx> wrote:
Welch formula doesn't converge...

The estimator of the variance of the difference of
means for normal populations with equal variances is

    var = [(ssX + ssY)/(nX + nY - 2)] * (1/nX + 1/nY)

where ssX and ssY are the sums of squared deviations.
The difference of the observed means, Xbar - Ybar,
divided by sqrt(var), follows exactly a Student t
distribution with nX + nY - 2 df. Given that, it was
expected that Welch's formula

    df = (vX/nX + vY/nY)^2 / u
    u  = (1/(nX-1)) * (vX/nX)^2 + (1/(nY-1)) * (vY/nY)^2     (A)

(vX and vY being the sample variances) would also give
nX + nY - 2 df when the population variances are equal.
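
As a quick numerical check on the two formulas, here is a minimal Python sketch. The function names `pooled_var_of_diff` and `welch_df` are illustrative labels, not part of the original post; vX and vY are taken to be sample variances, as in the BASIC listing further down.

```python
def pooled_var_of_diff(ssX, nX, ssY, nY):
    """Pooled estimator of Var(Xbar - Ybar).

    ssX, ssY are the sums of squared deviations of each sample.
    """
    return (ssX + ssY) / (nX + nY - 2) * (1 / nX + 1 / nY)


def welch_df(vX, nX, vY, nY):
    """Welch degrees of freedom, formula (A); vX, vY are sample variances."""
    a = (vX / nX + vY / nY) ** 2
    u = (vX / nX) ** 2 / (nX - 1) + (vY / nY) ** 2 / (nY - 1)
    return a / u


if __name__ == "__main__":
    print(welch_df(1.0, 10, 1.0, 10))   # equal sample variances, equal n
    print(welch_df(1.0, 10, 25.0, 10))  # sample variances 1 and 25 (sd 1 and 5)
```

With equal sample variances and nX = nY = 10, formula (A) returns 18 = nX + nY - 2 (up to floating-point rounding); with sample variances 1 and 25 it returns about 9.72. In a simulation the sample variances differ from pair to pair, so the computed df differs too.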

Data (relative frequency of each rounded df value):

Experiment 1 (10,000 pairs of samples)

    X ~ N(0, sd=1), nX = 10;  Y ~ N(0, sd=5), nY = 10
    df      9      10     11     12    ...
    freq  0.293  0.565  0.105  0.025   ...

    X ~ N(0, sd=1), nX = 10;  Y ~ N(0, sd=1), nY = 10
    df    ...   13     14     15     16     17     18
    freq  ...  0.035  0.063  0.100  0.147  0.246  0.383

Clearly, if (A) were governed by the population sd,
then in the equal-variance case the frequency for
df = 10+10-2 = 18 would be 1.000, and 0.000 for every
other value. BUT IN FACT the frequencies are spread
from df=9 to df=18 when 10,000 pairs of simulated
samples are used.

100,000 sample pairs

    X ~ N(0, sd=1), nX = 10;  Y ~ N(0, sd=1), nY = 10
    df    ...   13     14     15     16     17     18
    freq  ...  0.037  0.062  0.095  0.149  0.251  0.380

Luis Amaral Afonso

REM "DID"
REM Simulates the distribution of the rounded Welch degrees of freedom
CLS
DEFDBL A-Z
PRINT " WELCH FORMULA : degrees of freedom "
INPUT " sX , nX "; sX, nX
INPUT " sY , nY "; sY, nY
DIM x(nX), y(nY), df(nX + nY)
all = 100000
pi = 4 * ATN(1)
REM Seed once; reseeding on every pass can repeat the same stream
RANDOMIZE TIMER
FOR rpt = 1 TO all
swX = 0: sswX = 0: swY = 0: sswY = 0
REM Box-Muller normal deviates; 1 - RND keeps LOG away from 0
FOR i = 1 TO nX
aa = SQR(-2 * LOG(1 - RND))
x(i) = sX * aa * COS(2 * pi * RND)
x = x(i)
swX = swX + x: sswX = sswX + x * x
NEXT i
FOR i = 1 TO nY
aa = SQR(-2 * LOG(1 - RND))
y(i) = sY * aa * COS(2 * pi * RND)
y = y(i)
swY = swY + y: sswY = sswY + y * y
NEXT i
REM Sample variances
vX = (sswX - swX * swX / nX) / (nX - 1)
vY = (sswY - swY * swY / nY) / (nY - 1)
REM Welch degrees of freedom, formula (A)
v1 = (1 / (nX - 1)) * ((vX / nX) ^ 2)
vv = (1 / (nY - 1)) * ((vY / nY) ^ 2)
a = (vX / nX + vY / nY) ^ 2
df = a / (v1 + vv)
REM Round to the nearest integer and tally
u = INT(df + .5)
REM PRINT USING "## "; df;
df(u) = df(u) + 1
REM g counts runs with df above nX + nY - 3 (never printed)
IF df > (nX + nY - 3) THEN g = g + 1
NEXT rpt
LOCATE 10, 1
REM Print each observed df with its relative frequency
FOR t = 0 TO nX + nY
IF df(t) > 0 THEN PRINT USING "## #.### "; t; df(t) / all;
NEXT t
END
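
For readers without a BASIC interpreter, the same experiment can be sketched in Python. This is a rough port under stated assumptions, not the poster's program: it uses `random.gauss` instead of the hand-written Box-Muller step, seeds the generator once, and the names `sample_variance`, `welch_df` and `simulate_df_freq` are illustrative.

```python
import math
import random
from collections import Counter


def sample_variance(xs):
    """Unbiased sample variance (divisor n - 1)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def welch_df(vX, nX, vY, nY):
    """Welch degrees of freedom, formula (A)."""
    a = (vX / nX + vY / nY) ** 2
    u = (vX / nX) ** 2 / (nX - 1) + (vY / nY) ** 2 / (nY - 1)
    return a / u


def simulate_df_freq(sX, nX, sY, nY, reps=10_000, seed=0):
    """Relative frequency of each rounded Welch df over `reps` sample pairs."""
    rng = random.Random(seed)  # seeded once, outside the loop
    counts = Counter()
    for _ in range(reps):
        x = [rng.gauss(0.0, sX) for _ in range(nX)]
        y = [rng.gauss(0.0, sY) for _ in range(nY)]
        df = welch_df(sample_variance(x), nX, sample_variance(y), nY)
        counts[math.floor(df + 0.5)] += 1  # same rounding as INT(df + .5)
    return {k: counts[k] / reps for k in sorted(counts)}


if __name__ == "__main__":
    for sY in (5.0, 1.0):
        print(f"sX=1, sY={sY}:", simulate_df_freq(1.0, 10, sY, 10))
```

The exact frequencies depend on the seed and the number of repetitions, so they will not match the posted tables digit for digit.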

The Welch df is not supposed to converge to a constant. It's a random variable which depends on the sample standard deviations and the sample sizes. It takes all values between min(N1, N2) - 1 and N1 + N2 - 2.
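
The stated range can be checked numerically. The sketch below is only an illustration: it scans the ratio of sample variances vY/vX over a wide grid and reports the smallest and largest df that formula (A) produces. The helper name `df_range` and the grid itself are assumptions made for this example.

```python
def welch_df(vX, nX, vY, nY):
    """Welch degrees of freedom, formula (A); vX, vY are sample variances."""
    a = (vX / nX + vY / nY) ** 2
    u = (vX / nX) ** 2 / (nX - 1) + (vY / nY) ** 2 / (nY - 1)
    return a / u


def df_range(nX, nY, ratios):
    """Smallest and largest Welch df over the given variance ratios vY/vX."""
    values = [welch_df(1.0, nX, r, nY) for r in ratios]
    return min(values), max(values)


if __name__ == "__main__":
    # Variance ratios from 1e-6 to 1e6 on a logarithmic grid (includes 1.0).
    ratios = [10.0 ** (k / 10) for k in range(-60, 61)]
    lo, hi = df_range(10, 10, ratios)
    print(lo, hi)
```

For nX = nY = 10 the scan stays just above 9 at the extreme ratios and reaches 18 only where the two sample variances are equal, which matches the spread of df values in the simulated frequency tables.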

Jack (moderator)

NO

When the underlying normal distributions tend to equal variances, the
degrees of freedom MUST tend to nX + nY - 2, those of the exact
t distribution.
BUT ON THE CONTRARY THIS NEVER HAPPENS. CONCLUSION:
the Welch solution is not credible.

Luis Amaral Afonso