Re: binomial 'association' measure?
From: Aleks Jakulin (a_jakulin_at_@hotmail.com)
Date: 10/03/04
- Next message: Ian Jermyn: "Re: A simple but confusing question"
- Previous message: Aleks Jakulin: "Re: binomial 'association' measure?"
- In reply to: Dan Bolser: "Re: binomial 'association' measure?"
- Next in thread: Ross Clement: "Re: binomial 'association' measure?"
- Messages sorted by: [ date ] [ thread ]
Date: Sun, 3 Oct 2004 20:27:59 +0200
Dan Bolser wrote:
>>
>>We explored the latter two approaches in our paper at
>>http://www.ailab.si/aleks/Int/jakulin-bratko-ICML2004.pdf.
>
> "In problems with many attributes, the joint PDF may become sparse.
> The objective of learning is to construct a model of the joint PDF
> that will avoid this sparseness."
>
> Do you mean learning in this particular problem domain or learning
> in general?
> Is such a constructed model 'the best' in some way? Why is it the
> objective?
Actually, either way. In general, probability estimates become
meaningless when the data are sparse. For a meaningful probability,
you need several instances overlapping in the same locale. Even
procedures that claim to be sparse, such as support vector machines,
are really just projecting the high-dimensional instances onto
one-dimensional distances from a certain hyperplane.
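To make the sparseness point concrete, here is a rough sketch (my own
toy setup, not taken from the paper): draw 1000 instances uniformly at
random over n binary attributes and count how many cells of the joint
table are occupied, and how many hold more than one instance. The
function name cell_occupancy and the uniform sampling are assumptions
made purely for illustration.

```python
import random
from collections import Counter

def cell_occupancy(n_attributes, n_instances, seed=0):
    """Count joint-table cells hit by uniformly random binary instances.

    Hypothetical helper for illustration only; real data are not uniform.
    """
    rng = random.Random(seed)
    counts = Counter(
        tuple(rng.randint(0, 1) for _ in range(n_attributes))
        for _ in range(n_instances)
    )
    n_cells = 2 ** n_attributes          # size of the full joint table
    occupied = len(counts)               # cells with at least one instance
    shared = sum(1 for c in counts.values() if c > 1)  # cells with overlap
    return n_cells, occupied, shared

for n in (3, 10, 30):
    n_cells, occupied, shared = cell_occupancy(n, 1000)
    print(n, n_cells, occupied, shared)
```

With few attributes every cell collects many instances; with 30
attributes the 1000 instances can cover at most a tiny fraction of the
2^30 cells, so per-cell frequencies say almost nothing.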
> You say that 'high or low' interaction information among attributes
> is an indication that the attributes interact and should not be
> factorized. This makes sense because you say 'factorization takes
> advantage of independencies among attributes', however, you also
> say 'of course, the factors themselves need not be independent'.
> I am confused!
Imagine some joint PMF P(A,B,C,D,E). The interactions are only present
in P(A,B) and in P(B,C,D). These, along with P(E), can be seen as
"factors". But you cannot factorize P(A,B,C,D,E) into
P(A,B)P(B,C,D)P(E), because B appears twice and the "factors" overlap.
This can be corrected using the chain rule, conditioning one of the
overlapping factors on B, as is common practice with Bayesian
networks: P(A,B)P(B,C,D)/P(B) = P(A|B)P(B,C,D). The factorization of
P(A,B,C,D,E) is then P(A|B)P(B,C,D)P(E).
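A small numerical check of this, as a sketch under my own assumptions:
all five variables are binary, and the joint is built from random
tables so that only (A,B) and (B,C,D) interact. The helper names
(random_pmf, build_joint, marginal, max_errors) are made up for the
example. It compares the naive product of overlapping marginals with
the version divided by P(B), which equals P(A|B)P(B,C,D)P(E).

```python
import itertools
import random

def random_pmf(n, rng):
    """Return n strictly positive weights normalised to sum to 1."""
    w = [rng.random() + 0.1 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

def build_joint(seed=0):
    """Joint over binary (A,B,C,D,E) where only (A,B) and (B,C,D) interact."""
    rng = random.Random(seed)
    bcd_keys = list(itertools.product((0, 1), repeat=3))
    p_bcd = dict(zip(bcd_keys, random_pmf(8, rng)))
    p_a_given_b = {b: random_pmf(2, rng) for b in (0, 1)}  # indexed [b][a]
    p_e = random_pmf(2, rng)
    joint = {}
    for a, b, c, d, e in itertools.product((0, 1), repeat=5):
        joint[(a, b, c, d, e)] = p_a_given_b[b][a] * p_bcd[(b, c, d)] * p_e[e]
    return joint

def marginal(joint, keep):
    """Sum the joint over every variable whose position is not in keep."""
    out = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def max_errors(joint):
    """Largest absolute error of the naive and the corrected factorization."""
    p_ab = marginal(joint, (0, 1))
    p_b = marginal(joint, (1,))
    p_bcd = marginal(joint, (1, 2, 3))
    p_e = marginal(joint, (4,))
    naive = 0.0
    corrected = 0.0
    for (a, b, c, d, e), p in joint.items():
        overlap = p_ab[(a, b)] * p_bcd[(b, c, d)] * p_e[(e,)]
        naive = max(naive, abs(p - overlap))  # B is counted twice here
        corrected = max(corrected, abs(p - overlap / p_b[(b,)]))
    return naive, corrected

naive_err, corrected_err = max_errors(build_joint())
print(naive_err, corrected_err)
```

The naive product is off by a factor of P(B) in every cell, while the
corrected one should match the joint up to floating-point rounding.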
Aleks
-- mag. Aleks Jakulin http://www.ailab.si/aleks/ Artificial Intelligence Laboratory, Faculty of Computer and Information Science, University of Ljubljana, Slovenia.