Re: Generating a Matrix of Random values with a specified Correlation



On Sep 25, 12:17 pm, Robert Israel
<isr...@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
willgarri...@xxxxxxxxx writes:
I'm trying to generate two columns of random values, each value
between 1 and 5, such that the two columns have a specified
correlation.

Ideally this is what I'm looking for:

Example algorithm:

1. Generate 20 random values between 1 and 5 in array A.
2. Generate 20 random values between 1 and 5 in array B.
3. Create a third array (array C) by reordering array B so that
arrays A and C have a correlation of X.

Does anyone have any ideas on how to do this?

This may be a horrible way to solve this problem. If so, does anyone
have any ideas on how else I could go about it?
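
One way the reordering idea in the numbered steps above might be sketched
is a simple random-swap search: swap two entries of B at a time and keep
the swap only if it does not move the correlation with A further from X.
This is an illustrative assumption, not a method proposed in the thread;
the names pearson and reorder_to_correlation, the step count, and the
seed are all hypothetical. It only gets as close to X as the available
orderings of B allow.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    return suv / math.sqrt(suu * svv)


def reorder_to_correlation(A, B, x, steps=20000, seed=0):
    """Return a reordered copy C of B whose correlation with A is pushed
    toward x by random pairwise swaps (hypothetical helper)."""
    rng = random.Random(seed)
    C = list(B)
    err = abs(pearson(A, C) - x)
    for _ in range(steps):
        i, j = rng.randrange(len(C)), rng.randrange(len(C))
        C[i], C[j] = C[j], C[i]
        new_err = abs(pearson(A, C) - x)
        if new_err <= err:
            err = new_err            # keep the swap
        else:
            C[i], C[j] = C[j], C[i]  # undo the swap
    return C


rng = random.Random(42)
A = [rng.uniform(1, 5) for _ in range(20)]
B = [rng.uniform(1, 5) for _ in range(20)]
C = reorder_to_correlation(A, B, 0.6)
print(pearson(A, B), pearson(A, C))
```

Because C is only a rearrangement of B, every value stays between 1 and 5
and the distribution of B is unchanged.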

Start with your two arrays A and B.  Let a=Var(A), b=Var(B), c=Cov(A,B),
so the (Pearson) correlation of A and B is rho = c/sqrt(a b).  Let's
suppose the desired correlation x > rho (where of course x <= 1).  
Consider C = tA + (1-t)B, where 0 <= t <= 1.  
We have Cov(A,C) = ta + (1-t) c and
Var(C) = t^2 a + 2 t (1-t) c + (1-t)^2 b so the correlation of A and C
is (ta + (1-t)c)/sqrt(a(t^2 a + 2t(1-t)c + (1-t)^2 b)).
Set this equal to x and solve for t: there should be exactly one solution
in the interval 0 <= t <= 1.  
On the other hand, if -1 <= x < rho,
try C = t R + (1-t) B where each R_j = 6 - A_j (so R also lies
between 1 and 5 and has correlation -1 with A).
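
The mixing recipe above can be sketched in a few lines of Python. Rather
than solving the equation for t in closed form, this sketch finds t by
bisection, which works because the correlation of A and C changes
monotonically as t goes from 0 to 1. The function names, the iteration
count, and the test data are assumptions made for illustration.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    return suv / math.sqrt(suu * svv)


def mix_to_correlation(A, B, x, iterations=60):
    """Return (t, C) with C = t*base + (1-t)*B and corr(A, C) close to x.

    base is A when x >= rho, and the reflection R_j = 6 - A_j otherwise.
    """
    rho = pearson(A, B)
    upward = x >= rho
    base = list(A) if upward else [6 - v for v in A]

    def corr_at(t):
        C = [t * p + (1 - t) * q for p, q in zip(base, B)]
        return pearson(A, C)

    # g is <= 0 at t = 0 and >= 0 at t = 1 in both cases, so bisect on it.
    sign = 1.0 if upward else -1.0
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if sign * (corr_at(mid) - x) < 0:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2
    C = [t * p + (1 - t) * q for p, q in zip(base, B)]
    return t, C


rng = random.Random(0)
A = [rng.uniform(1, 5) for _ in range(20)]
B = [rng.uniform(1, 5) for _ in range(20)]
t, C = mix_to_correlation(A, B, 0.8)
print(t, pearson(A, C))
```

Since C is a weighted average of two arrays that both lie between 1 and 5,
its values stay in that range, but they are generally not whole numbers
and are no longer uniformly distributed.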
--
Robert Israel              isr...@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Department of Mathematics        http://www.math.ubc.ca/~israel
University of British Columbia            Vancouver, BC, Canada


Thank you Robert.

I think this is not only over my head but perhaps not the exact answer
I'm looking for. However, it being over my head, it's impossible for
me to know.

Let me phrase this as a simpler question with hopes of improving my
chances of understanding your response.

Let's say I have some number, A, that is in the range of 1 to 5. I need
to generate another number B, also in the range of 1 to 5, where the
two numbers, A and B, have a correlation of X.

Is this even possible, or is the calculation of correlation a one way
street, meaning that I can only calculate the correlation of a data
set as opposed to generate a data set based on a predefined
correlation?

I apologize again if you already answered this question in your
previous reply and it just went too far over my head for me to understand.


