Re: Finding pattern in high dimensional dataset
- From: Hagen <knaf@xxxxxxxxxxx>
- Date: Wed, 20 Jun 2007 04:35:44 EDT
Hi,
I have a simulation of an algorithm working on a list of objects.
Each object has a priority which reflects how much that particular
object is being used.
Here is an example of some sample data with 4 objects in a list.
The first set in each line is the priorities and the second set in
each line is the resulting usage frequencies:
{{p1,p2,p3,p4},{f1,f2,f3,f4}}
{{1,2,10,10},{0.0909,0.0909,0.4091,0.4091}}
{{1,2,11,10},{0.0833,0.0833,0.4167,0.4167}}
{{1,2,10,12},{0.0800,0.0800,0.2800,0.5600}}
{{1,2,11,1},{0.0909,0.0909,0.7273,0.0909}}
{{1,2,11,2},{0.0909,0.0909,0.7273,0.0909}}
{{1,2,11,3},{0.0909,0.0909,0.7272,0.0909}}
{{1,2,11,4},{0.0909,0.0909,0.7046,0.1136}}
{{1,2,11,5},{0.0870,0.0870,0.6956,0.1304}}
{{1,2,11,6},{0.0833,0.0834,0.6667,0.1666}}
{{7,1,2,11},{0.1818,0.0909,0.0909,0.6364}}
{{1,2,11,8},{0.0909,0.0909,0.5909,0.2273}}
{{9,1,2,11},{0.2727,0.0909,0.0909,0.5455}}
{{1,2,11,10},{0.0833,0.0833,0.4167,0.4167}}
{{1,2,11,11},{0.0833,0.0833,0.4167,0.4167}}
{{1,2,11,12},{0.0833,0.0833,0.4167,0.4167}}
I can create as much data as I want with as many elements as I want.
What I would like to do is to find a pattern. More precisely: how
does the set of usage frequencies depend on the set of priorities?
I don't know much about multivariate data analysis, so any hint on
where to start or what to read about would be most helpful.
Thanks in advance.
The keyword for your problem is 'regression analysis': you want
to find a function f that relates the priorities to the frequencies
up to some statistical error e.
(f1,f2,...) = f(p1,p2,...) + e
Since the priorities can only take integers as values, some
so-called 'coding' of these variables will be necessary.
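One common form of such coding is dummy (indicator) coding, where
each distinct integer level gets its own 0/1 column. The sketch
below is only an illustration of that idea, not necessarily the
coding meant above; it uses NumPy and the p4 values of a few of the
sample rows, and the variable names are made up for the example.

```python
import numpy as np

# Priorities of the fourth object (p4), taken from a few sample rows.
p4 = np.array([10, 12, 1, 2, 3, 10])

# levels: sorted distinct priorities; codes: index of each value in levels.
levels, codes = np.unique(p4, return_inverse=True)

# One 0/1 indicator column per distinct level, one row per observation.
dummies = np.eye(len(levels), dtype=int)[codes]

print(levels)
print(dummies)
```

Whether this coding or treating the priorities as ordinary numbers
is more suitable depends on how many distinct priority values occur.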
The situation would become much simpler if there were no
couplings between the different objects, that is, if the
frequency fi for object i depended solely on the priority pi.
Then the problem reduces to a univariate regression to be solved
for every i separately, which is easier than solving the
full multivariate problem.
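Whether the uncoupled simplification is plausible can be checked on
the sample data. The sketch below assumes a linear form for f (an
assumption made here for illustration; the true relation may well
be nonlinear) and treats the priorities as plain numbers. It fits
each fi once on pi alone and once on all priorities, and compares
the residual sums of squares. The helper name rss is made up for
this example.

```python
import numpy as np

# Sample rows copied verbatim from the question: (priorities, frequencies).
data = [
    ([1, 2, 10, 10], [0.0909, 0.0909, 0.4091, 0.4091]),
    ([1, 2, 11, 10], [0.0833, 0.0833, 0.4167, 0.4167]),
    ([1, 2, 10, 12], [0.0800, 0.0800, 0.2800, 0.5600]),
    ([1, 2, 11, 1], [0.0909, 0.0909, 0.7273, 0.0909]),
    ([1, 2, 11, 2], [0.0909, 0.0909, 0.7273, 0.0909]),
    ([1, 2, 11, 3], [0.0909, 0.0909, 0.7272, 0.0909]),
    ([1, 2, 11, 4], [0.0909, 0.0909, 0.7046, 0.1136]),
    ([1, 2, 11, 5], [0.0870, 0.0870, 0.6956, 0.1304]),
    ([1, 2, 11, 6], [0.0833, 0.0834, 0.6667, 0.1666]),
    ([7, 1, 2, 11], [0.1818, 0.0909, 0.0909, 0.6364]),
    ([1, 2, 11, 8], [0.0909, 0.0909, 0.5909, 0.2273]),
    ([9, 1, 2, 11], [0.2727, 0.0909, 0.0909, 0.5455]),
    ([1, 2, 11, 10], [0.0833, 0.0833, 0.4167, 0.4167]),
    ([1, 2, 11, 11], [0.0833, 0.0833, 0.4167, 0.4167]),
    ([1, 2, 11, 12], [0.0833, 0.0833, 0.4167, 0.4167]),
]
P = np.array([p for p, _ in data], dtype=float)  # priorities, one row per run
F = np.array([f for _, f in data], dtype=float)  # frequencies, one row per run
n, k = P.shape


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ coef) ** 2))


ones = np.ones((n, 1))

# Uncoupled model: fi = a_i + b_i * pi, fitted separately for each object.
rss_uncoupled = [rss(np.hstack([ones, P[:, [i]]]), F[:, i]) for i in range(k)]

# Coupled model: fi = a_i + sum_j B_ij * pj, using all priorities.
X_full = np.hstack([ones, P])
rss_coupled = [rss(X_full, F[:, i]) for i in range(k)]

for i in range(k):
    print(f"object {i + 1}: uncoupled RSS = {rss_uncoupled[i]:.5f}, "
          f"coupled RSS = {rss_coupled[i]:.5f}")
```

If the coupled fit leaves a much smaller residual for some object,
that suggests its frequency depends on the other priorities as well,
and the univariate shortcut would not be adequate for it.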
H