Re: Generating a Matrix of Random values with a specified Correlation



On Thu, 25 Sep 2008 09:13:36 -0700, willgarrison wrote:

Hello,

I'm trying to generate two columns of random values where each value is
between 1 and 5 where the matrix as a whole has a specified correlation.

Ideally this is what I'm looking for:

Example algorithm:

1. Generate 20 random values between 1 and 5 in array A.
2. Generate 20 random values between 1 and 5 in array B.
3. Create a 3rd array (array C), where array B is sorted so
that array A and array C have a correlation of X.

Does anyone have any ideas on how to do this?
....

It would be clearer to say "Create array C containing a
permutation of the elements of B, such that corr(A,C) ~ X", where
~ denotes approximately equal, and to specify a tolerance t.

Following is an approach that might converge and might not
involve too much computation. Let e(C) = |corr(A,C)-X|,
and C' = C with elements C_i, C_j exchanged.

1. Sort A into ascending order and set C = B.
2. If corr(A,C) < X-t, go to step 5.
3. If corr(A,C) > X+t, go to step 7.
4. Quit with solution in C.
5. Find a pair of elements C_i, C_j such that i<j and C_i > C_j
and e(C') < e(C); if none such, go to 9. (With A ascending,
exchanging such a pair raises corr(A,C).)
6. Set C = C' and go to 2.
7. Find a pair of elements C_i, C_j such that i<j and C_i < C_j
and e(C') < e(C); if none such, go to 9. (Exchanging such a
pair lowers corr(A,C).)
8. Set C = C' and go to 2.
9. Quit with no solution.
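
The steps above can be sketched in Python roughly as follows. This is
only an illustration under some assumptions of mine: the names
pearson, permute_to_correlation and max_swaps are made up, C starts
out as a copy of B, and instead of taking the first improving pair it
tests every pair i<j and takes the exchange that lowers e(C) the most.
It recomputes the correlation from scratch for each trial exchange,
which is wasteful but easy to check. It also assumes neither A nor B
is constant, since the correlation is undefined in that case.

```python
import math
import random


def pearson(a, c):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    sa, sc = sum(a), sum(c)
    saa = sum(x * x for x in a)
    scc = sum(y * y for y in c)
    sac = sum(x * y for x, y in zip(a, c))
    den = math.sqrt((n * saa - sa * sa) * (n * scc - sc * sc))
    return (n * sac - sa * sc) / den


def permute_to_correlation(a, b, target, tol, max_swaps=10_000):
    """Return (sorted A, C) with |corr(A,C) - target| <= tol, or None."""
    a = sorted(a)              # step 1: A in ascending order
    c = list(b)                # assumption: C starts as a copy of B
    n = len(a)
    for _ in range(max_swaps):
        err = abs(pearson(a, c) - target)
        if err <= tol:
            return a, c        # step 4: solution found
        best = None
        for i in range(n - 1):
            for j in range(i + 1, n):
                c[i], c[j] = c[j], c[i]          # try the exchange
                new_err = abs(pearson(a, c) - target)
                c[i], c[j] = c[j], c[i]          # undo it
                if new_err < err and (best is None or new_err < best[0]):
                    best = (new_err, i, j)
        if best is None:
            return None        # step 9: no single exchange improves e(C)
        _, i, j = best
        c[i], c[j] = c[j], c[i]                  # steps 6 and 8: C = C'
    return None


if __name__ == "__main__":
    random.seed(1)
    A = [random.uniform(1, 5) for _ in range(20)]
    B = [random.uniform(1, 5) for _ in range(20)]
    result = permute_to_correlation(A, B, target=0.5, tol=0.01)
    if result is None:
        print("no solution found by pair exchanges")
    else:
        print("corr =", pearson(*result))
```

Because every accepted exchange strictly lowers e(C), the loop cannot
cycle; max_swaps is only a safety limit.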

I imagine this would work most of the time, but that cases
exist where a pair exchange isn't sufficient to improve the
current value of C. If that intuition is incorrect, then
the method would always work, because it repeatedly decreases
e(C) and there are finitely many permutations of C.

Note that there are several heuristics and techniques one could
use to improve the rate of convergence and decrease the work of
finding i, j in steps 5 and 7. Example heuristic: large values
of j-i generally will cause a larger change in e(C) than small
values will. Example technique: corr(A,C) is a function of n,
sum(A_i), sum(C_i), sum(A_i ^2), sum(C_i ^2), and sum(A_i * C_i).
When you exchange C_i and C_j the only one of these that changes
is the last, which changes by (A_i - A_j) * (C_j - C_i), so
you can compute e(C') with perhaps as few as 9 operations.
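
As a rough illustration of that technique, here is a small Python
sketch. The function names corr_from_sums and sac_after_swap are my
own, chosen for the example. It keeps the six quantities and, for
a trial exchange of C_i and C_j, adjusts only sum(A_i * C_i).

```python
import math


def corr_from_sums(n, sa, sc, saa, scc, sac):
    """Pearson correlation from n, sum(A), sum(C), sum(A^2), sum(C^2), sum(A*C)."""
    den = math.sqrt((n * saa - sa * sa) * (n * scc - sc * sc))
    return (n * sac - sa * sc) / den


def sac_after_swap(a, c, sac, i, j):
    """sum(A_k * C_k) after exchanging C_i and C_j, without re-summing."""
    return sac + (a[i] - a[j]) * (c[j] - c[i])


if __name__ == "__main__":
    A = [1, 2, 3, 4]
    C = [4, 1, 3, 2]
    n = len(A)
    sa, sc = sum(A), sum(C)
    saa = sum(x * x for x in A)
    scc = sum(y * y for y in C)
    sac = sum(x * y for x, y in zip(A, C))

    # Only sac changes when C_0 and C_3 are exchanged.
    new_sac = sac_after_swap(A, C, sac, 0, 3)
    print("before:", corr_from_sums(n, sa, sc, saa, scc, sac))
    print("after: ", corr_from_sums(n, sa, sc, saa, scc, new_sac))
```

Since the denominator and sum(A)*sum(C) never change during the
search, they could also be computed once up front, leaving only a few
arithmetic operations per trial exchange.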

--
jiw