Re: modify Probit coefficients to match desired number of yes predictions
- From: m00es <m00es@xxxxxxxxx>
- Date: Thu, 16 Aug 2007 03:49:35 -0700
On Aug 15, 8:53 pm, Bob <frott...@xxxxxxxxx> wrote:
Hello,
I am having trouble accomplishing the following:
I have a probit model with a 0/1 outcome y and, let's say, 10
variables x1 to x10 (like y=a+b1*x1+b2*x2+...+b10*x10).
For now, I am only interested in x1 and x2.
In my sample of size 1000 there are
500 observations where (case 1) y=1 and x1=0 and x2=0,
100 obs. where (case 2) y=1 and x1=1 and x2=0,
10 obs. where (case 3) y=1 and x1=0 and x2=1.
Now I want to change the intercept a and the coefficients b1 and b2 so
that if I predict from the same data, the result includes
600 times case 1,
150 times case 2, and
20 times case 3.
I have tried several things but I can't figure it out. Do I have to
try to get numerically to the desired values or is there an analytical
way?
Thanks a lot for any help,
Bob
(My thought was that first I have to modify a to increase the average
probability of case 1 to 60%, then increase the avg. probability of
case 2 by 50% taking into account the changed offset due to a, and
similarly for case 3.
One try was: Using the probability function of the standard normal to
convert a to a probability, raise the resulting value by 0.1 and then
convert back to a z-value which became the new a. I am probably
thinking completely wrong.)
Okay, you say that you have a probit model with y either 0 or 1 and
you are interested in the case where there are two predictors x1 and
x2. That implies the following:
Pr(y = 0 | x1, x2) = F( tau - beta1*x1 - beta2*x2 )
Pr(y = 1 | x1, x2) = 1 - F( tau - beta1*x1 - beta2*x2 )
where F() is the cdf of a standard normal distribution and tau is the
threshold parameter (the negative of your intercept a, since
1 - F( tau - ... ) = F( -tau + ... )).
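For reference, the two probabilities above can be written as a small Python sketch. The function names pr_y0 and pr_y1 and the parameter values in the example call are made up for illustration; only the formulas come from the model as stated.

```python
from statistics import NormalDist

F = NormalDist().cdf  # cdf of the standard normal, F() in the text


def pr_y0(tau, beta1, beta2, x1, x2):
    """Pr(y = 0 | x1, x2) in the threshold parameterization."""
    return F(tau - beta1 * x1 - beta2 * x2)


def pr_y1(tau, beta1, beta2, x1, x2):
    """Pr(y = 1 | x1, x2), the complement of pr_y0."""
    return 1.0 - pr_y0(tau, beta1, beta2, x1, x2)


# Hypothetical parameter values, chosen only to show the call.
p = pr_y1(tau=0.0, beta1=1.0, beta2=0.5, x1=1, x2=0)
print(round(p, 4))
```

With these made-up values the call evaluates 1 - F(-1), which is the same as F(1).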
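Overall, you want 770 of the 1000 predictions (600 + 150 + 20) to be
y = 1, that is:
Pr(y = 0 | x1, x2) = F( tau - beta1*x1 - beta2*x2 ) = .23
Pr(y = 1 | x1, x2) = 1 - F( tau - beta1*x1 - beta2*x2 ) = .77
and more specifically (600, 150, and 20 out of 1000):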
Pr(y = 1 | x1=0, x2=0) = 1 - F( tau ) = .60
Pr(y = 1 | x1=1, x2=0) = 1 - F( tau - beta1 ) = .15
Pr(y = 1 | x1=0, x2=1) = 1 - F( tau - beta2 ) = .02
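The target values .60, .15, and .02 appear to be the desired counts divided by the sample size of 1000. A minimal sketch of that arithmetic follows; the dictionary layout and variable names are my own, and it assumes, as the equations above do, that each share is used directly as the probability for its (x1, x2) cell.

```python
n = 1000  # sample size from the question

# Desired number of y = 1 predictions per (x1, x2) cell, from the question.
desired = {(0, 0): 600, (1, 0): 150, (0, 1): 20}

# Share of the full sample for each cell.
targets = {cell: count / n for cell, count in desired.items()}

# Overall share of y = 1 predictions and its complement.
total_yes = sum(desired.values()) / n
total_no = 1.0 - total_yes

print(targets)
print(round(total_yes, 2), round(total_no, 2))
```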
That can be solved as follows:
1) Pr(y = 1 | x1=0, x2=0) = 1 - F( tau ) = .60
implies that tau must be -.2533.
2) Pr(y = 1 | x1=1, x2=0) = 1 - F( -.2533 - beta1 ) = .15
then implies that beta1 must be -1.2897.
3) Pr(y = 1 | x1=0, x2=1) = 1 - F( -.2533 - beta2 ) = .02
then implies that beta2 must be -2.3070.
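The three steps can be checked numerically by inverting the normal cdf. This is only a sketch using Python's standard library (statistics.NormalDist); the variable names are mine. Because it carries full precision through each step, the last printed digit can differ slightly from the values above, which were computed from the rounded tau.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, sd 1

# Step 1: 1 - F(tau) = .60  =>  F(tau) = .40
tau = std_normal.inv_cdf(1 - 0.60)

# Step 2: 1 - F(tau - beta1) = .15  =>  tau - beta1 = F^-1(.85)
beta1 = tau - std_normal.inv_cdf(1 - 0.15)

# Step 3: 1 - F(tau - beta2) = .02  =>  tau - beta2 = F^-1(.98)
beta2 = tau - std_normal.inv_cdf(1 - 0.02)

print(round(tau, 4), round(beta1, 4), round(beta2, 4))

# Plug the solution back in to confirm the three target probabilities.
for b, target in ((0.0, 0.60), (beta1, 0.15), (beta2, 0.02)):
    print(target, round(1 - std_normal.cdf(tau - b), 4))
```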
However, there is one caveat. The results above imply that:
Pr(y = 0 | x1, x2) = F( -.2533 + 1.2897*x1 + 2.3070*x2 ) = .23
If you plug in any combination of x1 = 0 or 1 and x2 = 0 or 1, then
this will not give you the desired value of .23. For example, for x1 =
0 and x2 = 0:
Pr(y = 0 | x1=0, x2=0) = F( -.2533 ) = .40.
Therefore, if x1 and x2 can only take on the values 0 or 1, then it is
not possible to specify tau, beta1, and beta2 to give you all of the
desired frequencies. To get F() = .23, the argument of F() would have
to be about -.74, i.e. 1.2897*x1 + 2.3070*x2 would have to be about
-.49, so at least one of the two (x1 or x2) must be able to take on
negative values.
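The caveat can be illustrated by evaluating Pr(y = 0 | x1, x2) at all four 0/1 combinations with the coefficients found above. This is a sketch, not a proof; the variable names are mine and the coefficient values are the rounded ones from this post.

```python
from itertools import product
from statistics import NormalDist

std_normal = NormalDist()
tau, beta1, beta2 = -0.2533, -1.2897, -2.3070  # rounded values from above

# Pr(y = 0 | x1, x2) for every combination of x1, x2 in {0, 1}.
probs_y0 = {
    (x1, x2): std_normal.cdf(tau - beta1 * x1 - beta2 * x2)
    for x1, x2 in product((0, 1), repeat=2)
}
for cell, p in probs_y0.items():
    print(cell, round(p, 4))

# Argument of F() that would give .23, and the value that
# 1.2897*x1 + 2.3070*x2 would need to take to reach it.
needed_arg = std_normal.inv_cdf(0.23)
needed_sum = needed_arg - tau
print(round(needed_arg, 2), round(needed_sum, 2))
```

The smallest of the four probabilities is the one at x1 = 0, x2 = 0, and the required sum is negative, which two non-negative terms cannot produce.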
Hope this helps,
m00es