Re: information content of a variable

From: Rajarshi Guha (rajarshi_at_presidency.com)
Date: 06/18/04


Date: Fri, 18 Jun 2004 09:37:17 -0400

On Thu, 17 Jun 2004 23:53:43 -0400, Richard Ulrich wrote:

> On Thu, 17 Jun 2004 16:13:30 -0400, Rajarshi Guha
> <rajarshi@presidency.com> wrote:
>
>> Hi,
>> I'm not sure if this is the correct place to post this, so if not I'd
>> appreciate pointers to where I could.
>>
>> When building models (say, regression or neural network) we need to choose
>> a set of 'information rich' independent variables.
>>
>> Is there any literature related to this topic?
>
>
> "Information theory" has a serious concern, but I think you
> are after something different than Shannon's Information.
> A set of dichotomies holds the most possible information when each is
> split 0.50, which is also where the variance is greatest.
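
As a quick check of the 0.50 remark quoted above, here is a minimal sketch in
Python. The function names and the grid of probabilities are my own choices
for illustration, not anything from the thread or from a library. It assumes
a single 0/1 variable with P(1) = p and compares where its Shannon entropy
and its variance peak.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a 0/1 variable with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0  # a constant variable carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def bernoulli_variance(p):
    """Variance of a 0/1 variable with P(1) = p."""
    return p * (1 - p)

# Illustrative grid of probabilities from 0.00 to 1.00.
grid = [i / 100 for i in range(101)]

p_max_entropy = max(grid, key=binary_entropy)
p_max_variance = max(grid, key=bernoulli_variance)

print("entropy peaks at p =", p_max_entropy)
print("variance peaks at p =", p_max_variance)
```

On this grid both quantities peak at the same p, which is the sense in which
an even split is the "most informative" dichotomy taken on its own.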

Thanks for the pointers.
Actually, I was indeed thinking in terms of an information theory approach.
But then the question arises: if a variable is deemed information rich
(in the information-theoretic sense), does that make it useful for a
statistical model, or does the information theory idea imply something else
(i.e., something not directly applicable in the context of statistical
models)?
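
To make the question concrete, here is a toy sketch in Python. Everything in
it is an assumption for illustration: the data are made up, the variables are
discrete, and the helper names (entropy, mutual_information) are mine. It
contrasts a variable's own entropy with the information it shares with a
target y, using simple plug-in estimates from counts.

```python
import math
from collections import Counter

def entropy(values):
    """Plug-in Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Hypothetical toy data: y is the quantity a model would try to predict.
y       = [0, 0, 0, 0, 1, 1, 1, 1]
x_rich  = [0, 1, 2, 3, 0, 1, 2, 3]  # four equally likely values, unrelated to y
x_plain = [0, 0, 0, 0, 1, 1, 1, 1]  # only two values, but identical to y

for name, x in (("x_rich", x_rich), ("x_plain", x_plain)):
    print(name, "entropy:", entropy(x), "shared with y:", mutual_information(x, y))
```

In this made-up case x_rich has the higher entropy yet shares nothing with y,
while x_plain has lower entropy but determines y completely. So one way to
phrase my question is whether "information rich" should refer to the variable
alone or to what it shares with the dependent variable.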

Thanks,
Rajarshi