Re: How to ensure data correlation
- From: "Greg Heath" <heath@xxxxxxxxxxxxxxxx>
- Date: 12 Dec 2006 03:29:31 -0800
Raj wrote:
Hi,
I am a postgraduate student doing a project in the area of soft
computing. The problem I am trying to solve is that of predicting
operational risk. I have a small dataset of 25 data points that includes
five input variables (system downtime, number of employees, data
quality, number of transactions, number of losses) and one output
variable, the loss amount. Because I have a very small dataset, I went
through the following process to generate additional data points:
Step 1: Select one variable at a time and fit various distributions
to it.
Step 2: Based on goodness-of-fit tests, select the best
distribution for each variable separately.
Step 3: Generate random numbers for each variable from the selected
distribution separately.
Step 4: Tabulate the values.
My question is: how do we ensure that the correlation among the variables
in the original sample data is preserved in the randomly generated
data as well? Because the random numbers were generated separately for
each variable, I could not find any of the correlation among the variables
that was present in the original sample data.
Please do give me some suggestions; I will be waiting for them.
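The loss of correlation described in the question can be reproduced with a small sketch. The data here are a hypothetical stand-in for the 25-point sample (two made-up variables `x` and `y`), and a normal fit per variable is assumed purely for illustration; it is not the poster's actual data or chosen distributions.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def mean_sd(v):
    """Sample mean and standard deviation (n - 1 denominator)."""
    m = sum(v) / len(v)
    return m, math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

rng = random.Random(0)

# Hypothetical "original" sample: 25 points, y strongly dependent on x.
x = [rng.gauss(0.0, 1.0) for _ in range(25)]
y = [2.0 * a + rng.gauss(0.0, 0.5) for a in x]

# Fit each variable on its own (a normal fit is assumed here),
# then draw new values for each variable independently.
mx, sx = mean_sd(x)
my, sy = mean_sd(y)
x_new = [rng.gauss(mx, sx) for _ in range(5000)]
y_new = [rng.gauss(my, sy) for _ in range(5000)]

r_orig = pearson(x, y)          # strong correlation in the original sample
r_new = pearson(x_new, y_new)   # close to zero: the dependence is gone
print("original r =", round(r_orig, 3))
print("independent r =", round(r_new, 3))
```

Each marginal distribution is reproduced, but nothing in the per-variable procedure carries information about how the variables move together, so the generated columns come out essentially uncorrelated.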
See the recent thread "Data Simulation from Correlation matrix"
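As a rough illustration of simulating data from a correlation matrix (one common approach, not necessarily the one discussed in that thread): draw correlated standard normals using a Cholesky factor of the correlation matrix, then map each column to its fitted marginal through the inverse CDF. The matrix `R`, the three-variable setup, and the exponential marginal with `mean_downtime = 3.0` are all hypothetical values chosen for the sketch; in practice `R` would be estimated from the 25 original points.

```python
import math
import random
from statistics import NormalDist

def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def correlated_normals(corr, n, rng):
    """n rows of standard normals whose correlation matrix is approximately corr."""
    L = cholesky(corr)
    d = len(corr)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        rows.append([sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(d)])
    return rows

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical target correlation matrix for three variables.
R = [[1.0, 0.6, 0.3],
     [0.6, 1.0, 0.5],
     [0.3, 0.5, 1.0]]

rng = random.Random(42)
rows = correlated_normals(R, 5000, rng)
cols = list(zip(*rows))

r01 = pearson(cols[0], cols[1])
r02 = pearson(cols[0], cols[2])
r12 = pearson(cols[1], cols[2])

# Map column 0 to a non-normal marginal via the inverse CDF.
# An exponential marginal with mean 3.0 is assumed for illustration only.
std_normal = NormalDist()
mean_downtime = 3.0
downtime = [
    -mean_downtime * math.log(1.0 - min(std_normal.cdf(v), 1.0 - 1e-12))
    for v in cols[0]
]
r_downtime_1 = pearson(downtime, cols[1])

print("simulated correlations:", round(r01, 3), round(r02, 3), round(r12, 3))
print("after marginal transform:", round(r_downtime_1, 3))
```

The monotone marginal transform keeps rank correlations intact, but the Pearson correlation generally shrinks somewhat, so the values in `R` may need adjusting if the linear correlations of the final variables must match the sample closely.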
Hope this helps.
Greg