Re: Finding a pattern in a high-dimensional dataset



Hi,

I have a simulation of an algorithm working on a list of objects.
Each object has a priority which reflects how much that particular
object is being used.
Here is an example of some sample data with 4 objects in a list. The
first set in each line is the priorities and the second set in each
line is the resulting usage frequencies:
{{p1,p2,p3,p4},{f1,f2,f3,f4}}.

{{1,2,10,10},{0.0909,0.0909,0.4091,0.4091}}
{{1,2,11,10},{0.0833,0.0833,0.4167,0.4167}}
{{1,2,10,12},{0.0800,0.0800,0.2800,0.5600}}
{{1,2,11,1},{0.0909,0.0909,0.7273,0.0909}}
{{1,2,11,2},{0.0909,0.0909,0.7273,0.0909}}
{{1,2,11,3},{0.0909,0.0909,0.7272,0.0909}}
{{1,2,11,4},{0.0909,0.0909,0.7046,0.1136}}
{{1,2,11,5},{0.0870,0.0870,0.6956,0.1304}}
{{1,2,11,6},{0.0833,0.0834,0.6667,0.1666}}
{{7,1,2,11},{0.1818,0.0909,0.0909,0.6364}}
{{1,2,11,8},{0.0909,0.0909,0.5909,0.2273}}
{{9,1,2,11},{0.2727,0.0909,0.0909,0.5455}}
{{1,2,11,10},{0.0833,0.0833,0.4167,0.4167}}
{{1,2,11,11},{0.0833,0.0833,0.4167,0.4167}}
{{1,2,11,12},{0.0833,0.0833,0.4167,0.4167}}

I can create as much data as I want with as many elements as I want.

What I would like to do is to find a pattern. More precisely: how
does the set of usage frequencies depend on the set of priorities?
I don't know much about multivariate data analysis, so any hint on
where to start or what to read about would be most helpful.

Thanks in advance.

The keyword for your problem is 'regression analysis': you want
to find a function f that relates the priorities to the frequencies
up to some statistical error e.

(f1,f2,...) = f(p1,p2,...) + e
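
As a starting point, the model above can be sketched with an ordinary
least-squares fit. The sketch below assumes that f is linear in the
priorities plus an intercept, which is only a guess at the simplest
possible form (the sample data suggest the true dependence may well be
nonlinear). The names P, F, X and coef are illustrative; the rows are
the sample data from the question.

```python
import numpy as np

# Sample rows copied from the question: priorities (p1..p4) ...
P = np.array([
    [1, 2, 10, 10], [1, 2, 11, 10], [1, 2, 10, 12], [1, 2, 11, 1],
    [1, 2, 11, 2], [1, 2, 11, 3], [1, 2, 11, 4], [1, 2, 11, 5],
    [1, 2, 11, 6], [7, 1, 2, 11], [1, 2, 11, 8], [9, 1, 2, 11],
    [1, 2, 11, 10], [1, 2, 11, 11], [1, 2, 11, 12],
], dtype=float)

# ... and the resulting usage frequencies (f1..f4).
F = np.array([
    [0.0909, 0.0909, 0.4091, 0.4091], [0.0833, 0.0833, 0.4167, 0.4167],
    [0.0800, 0.0800, 0.2800, 0.5600], [0.0909, 0.0909, 0.7273, 0.0909],
    [0.0909, 0.0909, 0.7273, 0.0909], [0.0909, 0.0909, 0.7272, 0.0909],
    [0.0909, 0.0909, 0.7046, 0.1136], [0.0870, 0.0870, 0.6956, 0.1304],
    [0.0833, 0.0834, 0.6667, 0.1666], [0.1818, 0.0909, 0.0909, 0.6364],
    [0.0909, 0.0909, 0.5909, 0.2273], [0.2727, 0.0909, 0.0909, 0.5455],
    [0.0833, 0.0833, 0.4167, 0.4167], [0.0833, 0.0833, 0.4167, 0.4167],
    [0.0833, 0.0833, 0.4167, 0.4167],
])

# ASSUMPTION: f is linear, F ~ X @ coef, with a leading column of ones
# in X acting as the intercept.
X = np.column_stack([np.ones(len(P)), P])

# Least-squares solution; each column of coef belongs to one frequency.
coef, _, rank, _ = np.linalg.lstsq(X, F, rcond=None)

# Fitted frequencies and the root-mean-square size of the error term e.
pred = X @ coef
rmse = np.sqrt(np.mean((F - pred) ** 2))

print("coefficient matrix shape:", coef.shape)
print("RMSE of the linear fit:", rmse)
```

A large residual here would indicate that a linear f is too crude and
that a nonlinear form or additional data is needed.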

Since the priorities can only take integers as values, some
so-called 'coding' of these variables will be necessary.
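
One common form of coding is dummy (one-hot) coding, where each
distinct integer value gets its own 0/1 indicator column. Whether this
is the coding meant here is an assumption; the sketch below only
illustrates the idea for the p4 values from the sample data, and the
names p4, levels and onehot are illustrative.

```python
import numpy as np

# p4 values copied from the sample rows in the question.
p4 = np.array([10, 10, 12, 1, 2, 3, 4, 5, 6, 11, 8, 11, 10, 11, 12])

# Distinct priority levels that occur in the data, in sorted order.
levels = np.unique(p4)

# One indicator column per level: entry is 1.0 where p4 equals that level.
onehot = (p4[:, None] == levels[None, :]).astype(float)

print("levels:", levels)
print("coded matrix shape:", onehot.shape)
```

The coded columns can then replace the raw integer column in a
regression, at the price of one parameter per level.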

The situation would be much simpler if there were no coupling
between the different objects, that is, if the frequency fi for
object i depended solely on the priority pi. The problem would then
reduce to a univariate regression, solved for every i separately,
which is easier than solving the full multivariate problem.
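
The uncoupled case can be sketched as one straight-line fit per
object. This assumes, purely for illustration, that fi = a_i + b_i*pi;
the sample data in the question suggest the objects are in fact
coupled, so treat this as a first diagnostic rather than a model. The
names rows, slopes and intercepts are illustrative.

```python
import numpy as np

# Sample data from the question, one row per run: p1..p4 then f1..f4.
rows = np.array([
    [1, 2, 10, 10, 0.0909, 0.0909, 0.4091, 0.4091],
    [1, 2, 11, 10, 0.0833, 0.0833, 0.4167, 0.4167],
    [1, 2, 10, 12, 0.0800, 0.0800, 0.2800, 0.5600],
    [1, 2, 11, 1, 0.0909, 0.0909, 0.7273, 0.0909],
    [1, 2, 11, 2, 0.0909, 0.0909, 0.7273, 0.0909],
    [1, 2, 11, 3, 0.0909, 0.0909, 0.7272, 0.0909],
    [1, 2, 11, 4, 0.0909, 0.0909, 0.7046, 0.1136],
    [1, 2, 11, 5, 0.0870, 0.0870, 0.6956, 0.1304],
    [1, 2, 11, 6, 0.0833, 0.0834, 0.6667, 0.1666],
    [7, 1, 2, 11, 0.1818, 0.0909, 0.0909, 0.6364],
    [1, 2, 11, 8, 0.0909, 0.0909, 0.5909, 0.2273],
    [9, 1, 2, 11, 0.2727, 0.0909, 0.0909, 0.5455],
    [1, 2, 11, 10, 0.0833, 0.0833, 0.4167, 0.4167],
    [1, 2, 11, 11, 0.0833, 0.0833, 0.4167, 0.4167],
    [1, 2, 11, 12, 0.0833, 0.0833, 0.4167, 0.4167],
])
P, F = rows[:, :4], rows[:, 4:]

# ASSUMPTION: no coupling, so fi is fitted from pi alone with a line.
slopes = np.empty(4)
intercepts = np.empty(4)
for i in range(4):
    # np.polyfit returns the coefficients highest degree first.
    slopes[i], intercepts[i] = np.polyfit(P[:, i], F[:, i], deg=1)

for i in range(4):
    print(f"f{i + 1} ~ {intercepts[i]:.4f} + {slopes[i]:.4f} * p{i + 1}")
```

If the per-object fits leave large residuals, that is a hint that the
couplings matter and the full multivariate problem has to be solved.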

H