Re: Finding Statistically Significant Rules
- From: hgwelec <hgwelec@xxxxxxxxx>
- Date: 11 May 2007 01:03:38 -0700
On May 11, 10:43 am, Ray Koopman <koop...@xxxxxx> wrote:
On May 11, 12:11 am, hgwelec <hgwe...@xxxxxxxxx> wrote:
Dear All,
I have used a C4.5 decision tree for a classification analysis aimed
at finding the common characteristics of "good" clients.
Say, for example, that the decision tree produces the following
"rule":
IF AGE > 32
AND NUM_OF_CHILDREN > 2
AND CLIENT_PROFESSION="DOCTOR"
AND GENDER="MALE"
THEN
CLIENT="GOOD"
Now, the above rule has 85% accuracy and 25% coverage on the
dataset.
The dataset consists of 700 cases.
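To make those percentages concrete, here is a small arithmetic sketch. It assumes that "coverage" means the share of all cases that satisfy the rule's conditions and that "accuracy" means the share of those covered cases that are actually GOOD; the post does not define either term, so both readings are assumptions.

```python
# Figures quoted in the post
n_cases = 700
coverage = 0.25   # assumed: fraction of cases matching the IF part
accuracy = 0.85   # assumed: fraction of matching cases that are GOOD

covered = round(n_cases * coverage)   # cases the rule applies to
correct = round(covered * accuracy)   # covered cases that are GOOD (rounded)
wrong = covered - correct             # covered cases that are not GOOD

print(f"covered={covered}, correct={correct}, wrong={wrong}")
```

Under these assumptions the rule rests on roughly 175 of the 700 cases, which is the sample any significance assessment would actually be working with.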
What would I have to do in order to assess whether this result is NOT
attributable to pure chance?
A chi-square test clearly cannot be used since AGE and NUM_OF_CHILDREN
are not categorical variables.
Any help is greatly appreciated.
Hgwelec
Reanalyze the data with the target variable permuted randomly.
Do this a few thousand times, keeping track of the accuracy of
the classifications. Look at the distribution of accuracies.
Where does your 85% figure stand in that distribution?
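The procedure above could be sketched roughly as follows. This is a simplified illustration, not a definitive implementation: the data are synthetic stand-ins for the 700 cases, the names `permuted_statistics`, `permutation_p_value` and `rule_accuracy` are hypothetical, and the rule is held fixed instead of rerunning C4.5 on every permuted target. A faithful version would pass in a `statistic` function that refits the tree each time and returns its accuracy.

```python
import random

def permuted_statistics(labels, statistic, n_permutations=2000, seed=0):
    """Recompute `statistic` on randomly permuted copies of the target."""
    rng = random.Random(seed)
    shuffled = list(labels)
    values = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any link between predictors and target
        values.append(statistic(shuffled))
    return values

def permutation_p_value(observed, null_values):
    """Share of permuted results at least as large as the observed one.

    The +1 terms count the observed arrangement itself, so p is never 0.
    """
    at_least = sum(v >= observed for v in null_values)
    return (at_least + 1) / (len(null_values) + 1)

# Synthetic stand-in for the 700 cases (NOT the poster's data).
# The first 175 positions are the cases the rule covers (25% coverage).
N_COVERED = 175
labels = (["GOOD"] * 149 + ["BAD"] * 26       # covered cases, about 85% GOOD
          + ["GOOD"] * 200 + ["BAD"] * 325)   # uncovered cases, assumed split

def rule_accuracy(target):
    """Fraction of covered cases labelled GOOD (rule held fixed)."""
    return sum(lab == "GOOD" for lab in target[:N_COVERED]) / N_COVERED

observed = rule_accuracy(labels)
null_values = permuted_statistics(labels, rule_accuracy)
p_value = permutation_p_value(observed, null_values)
print(f"observed accuracy = {observed:.3f}, permutation p = {p_value:.4f}")
```

The resulting proportion is what answers "where does the 85% stand": it is the fraction of label shufflings that do at least as well as the real data. Holding the rule fixed ignores the fact that the tree searched for a good rule, which is why refitting on each permutation is the more honest comparison.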
Hi Ray and thanks for your reply.
If I understood correctly, you are basically suggesting some sort of
cross-validation, keeping track of how the 85% accuracy changes in
each fold. Of course, I am not a statistician, but this seems to me
something like an "empirical" rule.
With a chi-square test you are able to quantify statistical
significance and present your findings (say, in a scientific paper),
but how can I quantify the significance using the approach you described?
Again, sorry if I am totally mistaken about this.
Thanks,
Hgwelec