Re: Feature Selection
- From: "hhsaffar@xxxxxxxxx" <hhsaffar@xxxxxxxxx>
- Date: 24 Feb 2007 09:02:51 -0800
Dear Image Analyst
I know PCA, but I don't think the author meant PCA. He is talking
about correlation, not PCA. I want to know how it is possible to
select features based on correlation alone!
He says that he excluded features with a correlation higher than
0.4. I want to know how this is done!
Thank you very much.
On Feb 24, 4:35 pm, "ImageAnalyst" <imageanal...@xxxxxxxxxxxxxx>
wrote:
hhsaffar:
What's the part you don't understand? How to do a correlation, or why
you can exclude measurements of one of the features?
I don't know the features they were analyzing but it might be better
to keep all the features and do a principal components analysis. For
example, let's say I want to predict the volume of people based on their
height and weight. Now height and weight are probably correlated at
greater than .4 because the trend is that if you are taller, you are
heavier, and vice versa. So you could make a model that said volume =
a*height, or volume = b*weight, and it would be sort of close, but not
as accurate as it could be, because among 6-foot-tall people, for
example, some have a large volume and some have a small volume. So you
could get a better prediction by including both: volume = c*weight+d*height.
Now they might be fine with throwing out some measurements because
sometimes measurements are too difficult to obtain (they require
elaborate or expensive equipment or take an excessively long time), and
you could also throw out measurements that aren't correlated with
anything. For
example, in my example of predicting volume, you would probably choose
not to measure hair color and eye color. And I can't make any
critique of the paper you read because I haven't read it.
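To make the height/weight example concrete, here is a rough Python sketch. Everything in it is an assumption for illustration only: the data are synthetic, the coefficients in the made-up "true" volume relationship are arbitrary, and the helper names (make_synthetic_people, rss_one, rss_two) are hypothetical. It fits the three models mentioned above without an intercept and compares their residual sums of squares.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rss_one(x, y):
    """Residual sum of squares of the least-squares fit y = a*x."""
    a = sum(p * q for p, q in zip(x, y)) / sum(p * p for p in x)
    return sum((q - a * p) ** 2 for p, q in zip(x, y))


def rss_two(x1, x2, y):
    """Residual sum of squares of the least-squares fit y = c*x1 + d*x2."""
    s11 = sum(p * p for p in x1)
    s22 = sum(p * p for p in x2)
    s12 = sum(p * q for p, q in zip(x1, x2))
    s1y = sum(p * q for p, q in zip(x1, y))
    s2y = sum(p * q for p, q in zip(x2, y))
    det = s11 * s22 - s12 * s12  # 2x2 normal equations, solved by Cramer's rule
    c = (s1y * s22 - s12 * s2y) / det
    d = (s11 * s2y - s12 * s1y) / det
    return sum((v - c * p - d * q) ** 2 for p, q, v in zip(x1, x2, y))


def make_synthetic_people(n=200, seed=0):
    """Synthetic data only: weight follows height plus noise, and the
    'volume' formula is invented so that it depends on both features."""
    rng = random.Random(seed)
    height = [rng.uniform(150.0, 200.0) for _ in range(n)]
    weight = [0.9 * (h - 100.0) + rng.gauss(0.0, 8.0) for h in height]
    volume = [0.5 * w + 0.2 * h + rng.gauss(0.0, 2.0)
              for h, w in zip(height, weight)]
    return height, weight, volume


height, weight, volume = make_synthetic_people()
print("corr(height, weight):", round(pearson(height, weight), 2))
print("RSS, height only:    ", round(rss_one(height, volume), 1))
print("RSS, weight only:    ", round(rss_one(weight, volume), 1))
print("RSS, weight + height:", round(rss_two(weight, height, volume), 1))
```

On the data used for the fit, the two-feature model can never have a larger residual than either one-feature model, because each one-feature model is a special case of it (c = 0 or d = 0). How much smaller it is depends on how much independent information the second feature carries.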
But anyway, back to my first sentence, what don't you understand
exactly?
Regards,
ImageAnalyst
On Feb 23, 11:54 pm, "hhsaf...@xxxxxxxxx" <hhsaf...@xxxxxxxxx> wrote:
Hi All
I am reading an article and I don't understand part of it.
It says:
"The original data set consists of features that are partly highly
correlated. Therefore, the number of features were reduced by
excluding features with a correlation higher than 0.4. This led to a
smaller data set of 21 features."
What does it mean by "excluding features with a correlation higher
than 0.4"?
I know about the autocorrelation matrix. Does it have anything to do
with this? If so, how?
Thanks in advance
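One plausible reading of the quoted sentence, sketched in Python below, is a filter on the pairwise correlations between features (the feature-by-feature correlation matrix, rather than an autocorrelation of a single signal). This is only a guess at the procedure, not the article's confirmed method. Assumptions: the function name drop_correlated is hypothetical, the absolute value of the Pearson coefficient is compared with the threshold, features are visited in the order given, and when two features exceed the threshold the later one is dropped. The quote does not say which member of a correlated pair was kept.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined here; treated as 0 by assumption
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.4):
    """Greedy filter: keep a feature only if |r| with every feature
    already kept is at most `threshold`.

    `columns` maps a feature name to its list of values, one per sample.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Tiny made-up data set, for illustration only.
features = {
    "height": [150, 160, 170, 180, 190, 200],
    "weight": [52, 58, 69, 77, 85, 96],   # rises with height
    "age":    [31, 29, 30, 30, 29, 31],   # unrelated to height
}
print(drop_correlated(features, threshold=0.4))
```

Because the filter is greedy, the result depends on the order of the features: putting "weight" first would keep "weight" and drop "height" instead.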