Re: Welch formula doesn't converge



On 16 Feb, 06:07, Jack Tomsky <jtom...@xxxxxxxxxxxxx> wrote:
On 15 Feb, 21:41, Jack Tomsky <jtom...@xxxxxxxxxxxxx> wrote:
Welch formula doesn't converge...

Given that the estimator of the variance of the
difference of means for equal-variance normal
populations,

    var = [(ssX+ssY)/(nX+nY-2)]*(1/nX + 1/nY)

gives, when the difference of the observed means,
Xhat - Yhat, is divided by sqrt(var), a statistic that
follows exactly a Student t distribution (with
nX+nY-2 df), it was expected that Welch's formula (A)
would give that same df in the equal-variance case:

    df = (vX/nX + vY/nY)^2 / u
    u  = (1/(nX-1))*(vX/nX)^2 + (1/(nY-1))*(vY/nY)^2     (A)
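
Formula (A) can be evaluated directly for a pair of sample variances. The following is a minimal sketch in Python rather than the BASIC used in this thread; the function name welch_df and the two example inputs are assumptions chosen only for illustration. It shows that (A) returns nX+nY-2 when the two *sample* variances happen to be equal, and a smaller value when they differ.

```python
def welch_df(vx, nx, vy, ny):
    """Degrees of freedom from formula (A).

    vx, vy are sample variances; nx, ny are sample sizes.
    The name welch_df is hypothetical, not taken from the thread.
    """
    a = vx / nx
    b = vy / ny
    u = a ** 2 / (nx - 1) + b ** 2 / (ny - 1)
    return (a + b) ** 2 / u


# Equal sample variances, nX = nY = 10: df is about 18 = nX + nY - 2.
print(welch_df(1.0, 10, 1.0, 10))

# Sample variances 1 and 25 (illustrative values): df is about 9.72.
print(welch_df(1.0, 10, 25.0, 10))
```

Because vX and vY are computed from the samples, they differ from one pair of samples to the next even when the population variances are equal, and so does the df.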

Data:

Experiment 1 (10,000 pairs of samples)

    X ~ N(0, sd=1), nX = 10;  Y ~ N(0, sd=5), nY = 10

    df      9      10     11     12     ...
    freq    0.293  0.565  0.105  0.025  ...

    X ~ N(0, 1), nX = 10;  Y ~ N(0, 1), nY = 10

    df      ...  13     14     15     16     17     18
    freq    ...  0.035  0.063  0.100  0.147  0.246  0.383

Clearly, if (A) converged to the df implied by the
population sds, the frequency for df = 10+10-2 = 18
would be 1.000, and 0.000 for every other value. BUT
IN FACT the computed df range from 9 to 18 when
10,000 pairs of simulated samples are used.

100,000 sample pairs

    X ~ N(0, 1), nX = 10;  Y ~ N(0, 1), nY = 10

    df      ...  13     14     15     16     17     18
    freq    ...  0.037  0.062  0.095  0.149  0.251  0.380

Luis Amaral Afonso

        REM "DID"
        CLS
        DEFDBL A-Z
        PRINT "  WELCH FORMULA : degrees of freedom  "
        INPUT " sX , nX  "; sX, nX
        INPUT " sY , nY  "; sY, nY
        DIM x(nX), y(nY), df(nX + nY)
        all = 100000
        pi = 4 * ATN(1)
        RANDOMIZE TIMER  ' seed the generator once, before the loop
        FOR rpt = 1 TO all
        swX = 0: sswX = 0: swY = 0: sswY = 0
        FOR i = 1 TO nX
        aa = SQR(-2 * LOG(1 - RND))  ' 1 - RND is never 0, so LOG is always defined
        x(i) = sX * aa * COS(2 * pi * RND)
        x = x(i)
        swX = swX + x: sswX = sswX + x * x
        NEXT i
        FOR i = 1 TO nY
        aa = SQR(-2 * LOG(1 - RND))  ' 1 - RND is never 0, so LOG is always defined
        y(i) = sY * aa * COS(2 * pi * RND)
        y = y(i)
        swY = swY + y: sswY = sswY + y * y
        NEXT i
        vX = (sswX - swX * swX / nX) / (nX - 1)
        vY = (sswY - swY * swY / nY) / (nY - 1)
        v1 = (1 / (nX - 1)) * ((vX / nX) ^ 2)
        vv = (1 / (nY - 1)) * ((vY / nY) ^ 2)
        a = (vX / nX + vY / nY) ^ 2
        df = a / (v1 + vv)
        u = INT(df + .5)
REM     PRINT USING "##  "; df;
        df(u) = df(u) + 1
        IF df > (nX + nY - 3) THEN g = g + 1
        NEXT rpt
        LOCATE 10, 1
        FOR t = 0 TO nX + nY
        IF df(t) = 0 THEN GOTO 40
        PRINT USING "## #.###  "; t; df(t) / all;
40     NEXT t: END
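
For readers without a BASIC interpreter, the same experiment can be sketched in Python. This is an assumed re-expression, not the original program: the names welch_df, sample_variance and simulate, the fixed seed, and the use of random.gauss in place of the hand-written Box-Muller step are all choices made here for illustration. It tabulates the relative frequency of the Welch df rounded to the nearest integer, as the BASIC code does with INT(df + .5).

```python
import random
from collections import Counter


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def welch_df(vx, nx, vy, ny):
    """Degrees of freedom from formula (A)."""
    a, b = vx / nx, vy / ny
    return (a + b) ** 2 / (a * a / (nx - 1) + b * b / (ny - 1))


def simulate(sx, nx, sy, ny, reps=10_000, seed=1):
    """Relative frequency of each rounded Welch df over `reps` sample pairs.

    sx, sy are population standard deviations; both means are 0.
    The seed is fixed only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(reps):
        x = [rng.gauss(0.0, sx) for _ in range(nx)]
        y = [rng.gauss(0.0, sy) for _ in range(ny)]
        df = welch_df(sample_variance(x), nx, sample_variance(y), ny)
        counts[int(df + 0.5)] += 1  # round to nearest integer
    return {k: counts[k] / reps for k in sorted(counts)}


if __name__ == "__main__":
    for k, freq in simulate(1.0, 10, 1.0, 10).items():
        print(f"{k:2d} {freq:.3f}")
```

The exact frequencies depend on the seed and on the number of repetitions, so they will not match the tables above digit for digit.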

The Welch df is not supposed to converge to a
constant.  It's a random variable which depends on
the sample standard deviations and the sample sizes.
It takes all values between min(N1, N2) - 1 and
N1 + N2 - 2.
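
The stated range can be checked with a short calculation. What follows is a sketch, not a derivation given in the thread; the weight w and the symbols s_1^2, s_2^2 for the sample variances are notation introduced here.

```latex
\[
w = \frac{s_1^2/N_1}{s_1^2/N_1 + s_2^2/N_2} \in (0,1),
\qquad
\frac{1}{\mathrm{df}} = \frac{w^2}{N_1-1} + \frac{(1-w)^2}{N_2-1}.
\]
% The right-hand side is a convex quadratic in w.
% Setting its derivative to zero gives the minimiser
\[
w^{*} = \frac{N_1-1}{N_1+N_2-2},
\qquad
\left.\frac{1}{\mathrm{df}}\right|_{w=w^{*}} = \frac{1}{N_1+N_2-2},
\]
% so df is at most N_1 + N_2 - 2. As w tends to 0 or 1,
% 1/df tends to 1/(N_2-1) or 1/(N_1-1) respectively, hence
\[
\min(N_1,N_2)-1 \;<\; \mathrm{df} \;\le\; N_1+N_2-2 .
\]
```

The upper bound is reached only when the sample variances satisfy w = w*; for N1 = N2 this means equal sample variances, which in simulated data essentially never holds exactly.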

Jack (moderator)

The convergence at issue HERE concerns the degrees of freedom.
Luis Amaral Afonso

It took me a minute, using elementary calculus, to find the range of values taken by Welch's df in terms of N1 and N2.  How long did Afonso take for his MC BASIC program to arrive at the same results?  And he still can't properly explain what it means.

Jack (moderator)

Jack

It is NOT the RANGES that matter (you COPIED them
FROM TEXTBOOKS; you are unable to go further, whatever the topic) but
the FREQUENCIES OF EACH evaluated degree of freedom!
But even more important: the Welch df DOESN'T
CONVERGE to the EXACT df value when the two
underlying normal populations have equal variances. This MEANS
THAT the Welch solution is worthless.
THIS IS WHAT IT MEANS!

Luis Amaral Afonso