Re: Surrogate Data Analysis (Part II)

From: Dr Chaos (mbNOSPAMkennel_at_yahoo.dot.com)
Date: 09/16/04


Date: Thu, 16 Sep 2004 13:11:17 -0700

Costas Vorlow wrote:
> Thanks for all the answers on my previous question (Matt and Roger).
>
> I have posted this because I am not entirely sure that the experiment I
> am running is proper.
>
> In short, I have a sequence of closing prices from the stock market
> (daily data). I am generating from the levels (not returns i.e., 1st
> differences) the surrogates using AAFT (testing for null hypothesis 3
> i.e., that the series is a monotonic nonlinear transformation of
> linearly filtered noise). For this reason I do not use a pivotal
> discriminating statistic. I actually choose to fit a regression model
> which incorporates a Mackey-Glass component and an error term
> (heteroskedastic GARCH(1,1)). This has been shown to work fine with
> data, implying that the actual data generating process can be
> complex-non-stochastic and still have an error term which is not white
> noise. This also confirms the fat-tailedness of the stock returns as
> well as the volatility clustering observed.
>
> I found out that creating surrogates on the levels (not the 1st
> differences of prices), generating returns and then fitting the model on
> original and surrogate "returns" now sequences, produced meaningful
> results on all sequences and that the estimations on the original data
> set were very different from those of the surrogates (the hypothesis was
> rejected, which I think is o.k. according to past research).

With low-frequency behavior like this you will even get the spectrum wrong,
because most FFT-based methods assume the series is periodic (the jump
between the last and first samples shows up as spurious high-frequency
power). Did you use a different kind of spectral estimator (not
Fourier-based) to estimate the true and surrogate spectra? There's a good
chance you'd have a fatter high-frequency tail on the surrogates.

> However
> when returns were used directly in order to generate the surrogates, the
> results were very bad and inconclusive (i.e., the estimations did not
> make any sense on the original sequence and the surrogates and the
> hypothesis was accepted). This is also something I read in a book by
> Galka (http://www.ieap.uni-kiel.de/plasma/ag-pfister/privat/galka/).

I don't understand quite what you mean here.

First differencing of most financial price series gives returns that are
very white, and FFT-based surrogate methods will not reproduce the
conditional volatility correlations that GARCH models capture and that
are known to exist in observed time series.
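To make that concrete, here is a rough sketch (not anything from the
original experiment): simulate a GARCH(1,1) return series, build a plain
phase-randomized FFT surrogate, and compare the lag-1 autocorrelation of
the squared values. The parameter values, the function names and the use
of simple phase randomization rather than AAFT are all assumptions made
for illustration.

```python
import numpy as np

def simulate_garch(n, omega=0.05, alpha=0.1, beta=0.85, seed=0):
    """Simulate Gaussian GARCH(1,1) returns (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    r = np.empty(n)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        r[t] = np.sqrt(var) * eps[t]
        var = omega + alpha * r[t] ** 2 + beta * var
    return r

def phase_randomized_surrogate(x, rng):
    """Keep the Fourier amplitudes of x, randomize the phases."""
    n = len(x)
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.shape)
    phases[0] = 0.0            # leave the mean (DC bin) untouched
    if n % 2 == 0:
        phases[-1] = 0.0       # the Nyquist bin must stay real
    return np.fft.irfft(spec * np.exp(1j * phases), n=n)

def acf_of_squares(x, lag=1):
    """Sample autocorrelation of the squared series at the given lag."""
    s = x ** 2 - np.mean(x ** 2)
    return float(np.dot(s[:-lag], s[lag:]) / np.dot(s, s))

returns = simulate_garch(5000)
surrogate = phase_randomized_surrogate(returns, np.random.default_rng(1))
print("squared-return ACF, original :", acf_of_squares(returns))
print("squared-return ACF, surrogate:", acf_of_squares(surrogate))
```

The surrogate has the same amplitude spectrum as the simulated returns,
but the dependence in the squared values (the volatility clustering) is
not carried over.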

>
> 1. So I am asking basically, can you actually fit a model on original
> and surrogate data sets and then compare the estimated coefficient
> values instead of choosing a discriminating statistic to be computed on
> all sequences?
>
> Or
>
> 2. Should I fit the model on surrogates and originals and then use a
> discriminating statistic on the residuals?

It depends on what you are particularly interested in. I think 1 can
be squirrelly unless you have a robust estimator for that parameter
(i.e. the parameter uncertainty is fairly well contained) and the parameter
directly reflects the issue of interest. 2 can have problems too, as modeling
may whiten the data to the point of indistinguishability.

Personally if I have a model comparison problem my preference is to
obtain a minimum-description-length or Bayes factor (stochastic complexity)
statistic if possible. Those generically are like log likelihoods but with
appropriate penalty terms for model complexity so you can fairly compare models
across classes of multiple parameters.

If the idea is that you have a GARCH modeling procedure and multiple data sets,
and you want to say "Does this model fit the observed data set X better than
many others Y_1,Y_2,Y_3,..." that's what I'd do.

I'd try to compute a negative log-likelihood/codelength for all of them.
Note--in the simple case here, all must have the same length.

For instance, suppose you have parameters p in the GARCH model. You will
have to put a prior on these parameters, P(p). Suppose now you can get the
negative log likelihood of the GARCH model given a data set Z:

L(Z,p) = -log P(Z_1, Z_2, ... | p)

Then for each data set you would minimize the total codelength over p

L(Z) = min_p [ -log P(p) + L(Z,p) ]

i.e. maximum likelihood fitting but with the penalty term for the parameters.
(This corresponds to a two-part code: sending the parameter values first and
then the data given the model class and the particular parameters.)
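A minimal sketch of such a two-part codelength for a Gaussian GARCH(1,1),
under several assumptions that are mine and not part of the recipe above:
zero-mean returns, omega fixed by variance targeting, and a uniform prior
over a coarse (alpha, beta) grid, so that -log P(p) is just the log of
the number of grid points. The function names are hypothetical.

```python
import math

def garch_nll(z, omega, alpha, beta):
    """L(Z,p): Gaussian GARCH(1,1) negative log likelihood, in nats."""
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    nll = 0.0
    for zt in z:
        nll += 0.5 * (math.log(2.0 * math.pi * var) + zt * zt / var)
        var = omega + alpha * zt * zt + beta * var
    return nll

def two_part_codelength(z, step=0.05):
    """Return (L(Z), alpha, beta) minimizing -log P(p) + L(Z,p) on a grid.

    Assumptions: z is zero-mean; omega is set by variance targeting;
    the prior is uniform over the admissible grid points.
    """
    z = [float(v) for v in z]
    s2 = sum(v * v for v in z) / len(z)          # sample variance (zero mean)
    steps = [i * step for i in range(int(round(1.0 / step)))]
    grid = [(a, b) for a in steps for b in steps if a + b < 0.999]
    prior_cost = math.log(len(grid))             # -log P(p), uniform prior
    best = None
    for a, b in grid:
        total = prior_cost + garch_nll(z, s2 * (1.0 - a - b), a, b)
        if best is None or total < best[0]:
            best = (total, a, b)
    return best
```

You would call two_part_codelength on X and on each surrogate Y_i (all of
the same length) and compare the first element of the results. With a
uniform prior the penalty is the same for every grid point; a finer grid
or a non-uniform prior changes the parameter cost.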

If L(X) gives significantly smaller values than L(Y_1), L(Y_2), etc., then
you may be able to say that the GARCH model fits X better than the surrogates.
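One common way to put a number on "significantly smaller" in surrogate
testing is a one-sided rank test: count how many surrogate codelengths
are at least as small as the one for the original data. This is only a
sketch; the function name is hypothetical, and the rank test is a
suggestion on my part, not part of the codelength recipe.

```python
def rank_p_value(l_x, l_surrogates):
    """One-sided rank p-value for 'L(X) is smaller than the surrogates'.

    l_x          : codelength of the original data X
    l_surrogates : codelengths L(Y_1), L(Y_2), ... of the surrogates
    """
    n_as_small = sum(1 for l in l_surrogates if l <= l_x)
    return (1 + n_as_small) / (1 + len(l_surrogates))
```

With 19 surrogates the smallest attainable p-value is 1/20, so you need
at least that many to reject at the 5% level.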

Note though that this may not give the results you expect: if the GARCH model
has a parameter which reflects conditional volatility autocorrelation, and that
parameter can be set to zero (i.e. giving effectively IID returns or just
linear stochastic noise), then the surrogates will be fit well by the GARCH
model and give a small codelength, and the two may not be very
distinguishable. All it's saying there is that the one model class can
fit both kinds of data.

If you really want to test whether the conditional volatility parameter is >0
or not, then there are probably direct test procedures for that.
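One candidate for such a direct test is Engle's ARCH LM test. This is my
assumption about the kind of procedure that would apply here. The sketch
below uses a single lag: regress the squared (demeaned) returns on their
own first lag and compare n*R^2 with a chi-square distribution with one
degree of freedom. The function name is hypothetical.

```python
import math

def arch_lm_test_lag1(returns):
    """Engle's ARCH LM test with one lag. Returns (statistic, p_value)."""
    m = sum(returns) / len(returns)
    e2 = [(r - m) ** 2 for r in returns]       # squared demeaned returns
    x, y = e2[:-1], e2[1:]                     # regress y on x (plus constant)
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0, 1.0                        # no variation in squares at all
    stat = n * sxy * sxy / (sxx * syy)         # n * R^2 of the regression
    p_value = math.erfc(math.sqrt(stat / 2.0)) # chi-square(1) upper tail
    return stat, p_value
```

A small p-value indicates dependence in the squared returns, i.e.
evidence against a zero ARCH coefficient.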

> I have not seen any clear application as such in Physical Review E,
> Physica D and other relevant journals. Apologies if I am wrong...
>
> Any thoughts?

What models and hypotheses for the time series are you trying to
compare exactly?

>
> Thanks for your time and patience.
>
> Costas