Re: Feature Selection



hhsaffar:
What's the part you don't understand? How to do a correlation, or why
you can exclude measurements of one of the features?
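
If it's the "how", here is a minimal sketch of one plausible reading of
that sentence, assuming "correlation" means the pairwise Pearson
correlation between feature columns (a feature-by-feature correlation
matrix, not the autocorrelation of a single signal). The paper's actual
procedure isn't quoted, so the greedy keep-the-first rule, the function
name drop_correlated, and the synthetic data are all my assumptions.

```python
import numpy as np

def drop_correlated(X, threshold=0.4):
    """Return indices of columns to keep (hypothetical helper).

    Walks the columns left to right and keeps a column only if its
    absolute Pearson correlation with every already-kept column is at
    most `threshold`. Which member of a correlated pair gets dropped is
    an assumption; the quoted article does not say.
    """
    corr = np.corrcoef(X, rowvar=False)  # feature-by-feature correlation matrix
    kept = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) <= threshold for k in kept):
            kept.append(j)
    return kept

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n = 500
height = rng.normal(170.0, 10.0, n)
weight = 0.9 * height + rng.normal(0.0, 8.0, n)  # correlated with height by construction
unrelated = rng.normal(0.0, 1.0, n)              # independent of both

X = np.column_stack([height, weight, unrelated])
kept = drop_correlated(X, threshold=0.4)
print(kept)  # weight is dropped because it correlates with height above 0.4
```

Reordering the columns changes which feature of a correlated pair
survives, so this kind of filter depends on the order you visit them in.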

I don't know what features they were analyzing, but it might be better
to keep all the features and do a principal components analysis. For
example, let's say I want to predict the body volume of people from
their height and weight. Height and weight are probably correlated at
greater than 0.4, because the trend is that the taller you are, the
heavier you are, and vice versa. So you could make a model that said
volume = a*height, or volume = b*weight, and it would be sort of close,
but not as accurate as it could be, because of course among 6-foot-tall
people some have a big volume and some have a small volume. So you
could get a better prediction by including both:
volume = c*weight + d*height.

Now, they might be fine with throwing out some measurements, because
sometimes measurements are too difficult to obtain (they require
elaborate or expensive equipment, or take an excessively long time).
You could also throw out measurements that aren't correlated with the
thing you are trying to predict. For example, in my example of
predicting volume, you would probably choose not to measure hair color
and eye color. I can't critique the paper you read because I haven't
read it.
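
To make the height/weight point concrete, here is a rough sketch on
made-up data. Everything in it is an assumption for illustration: the
numbers are synthetic, the "true" volume formula is invented, and I
added an intercept term that the formulas above don't have. It fits
volume from height alone, from weight alone, and from both, and compares
the in-sample RMS error.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Synthetic, correlated height (cm) and weight (kg).
height = rng.normal(170.0, 10.0, n)
weight = 0.9 * height - 83.0 + rng.normal(0.0, 8.0, n)

# Invented ground truth: volume (arbitrary units) depends on both features.
volume = 0.6 * weight + 0.3 * height + rng.normal(0.0, 1.0, n)

def fit_rmse(features, target):
    """Least-squares fit of target on the given features plus an intercept;
    returns the in-sample root-mean-square error."""
    A = np.column_stack(features + [np.ones(len(target))])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

rmse_height = fit_rmse([height], volume)
rmse_weight = fit_rmse([weight], volume)
rmse_both = fit_rmse([height, weight], volume)

print(f"height only: {rmse_height:.2f}")
print(f"weight only: {rmse_weight:.2f}")
print(f"both:        {rmse_both:.2f}")
```

On this toy data the two-feature model has the smallest error even
though height and weight are correlated well above 0.4, because each
still carries some information the other doesn't.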

But anyway, back to my first sentence, what don't you understand
exactly?
Regards,
ImageAnalyst

On Feb 23, 11:54 pm, "hhsaf...@xxxxxxxxx" <hhsaf...@xxxxxxxxx> wrote:
Hi All

I am reading an article, and I don't understand a part of it.
It says:
"The original data set consists of features that are partly highly
correlated. Therefore, the number of features were reduced by
excluding features with a correlation higher than 0.4. This led to a
smaller data set of 21 features."

What does it mean by "excluding features with a correlation higher
than 0.4"?
I know about the autocorrelation matrix. Does it have anything to do
with it? If so, how?

Thanks in advance

