Re: hypothesis testing method when sample is too small and you have no controls...



On Sep 16, 1:06 am, Gaetan <gaetanl...@xxxxxxxxx> wrote:
Colleagues have asked me what to do within a hypothesis testing
framework when you cannot get Controls and you have a really
small sample size.

In considering the above challenge, I figured that traditional paired
testing (paired t test, etc.) is obsolete and not done anymore.
This is because of the placebo effect or, if you deal with financial
or economic variables, inherent noise such as: 1) trend; 2)
seasonality; and 3) change in underlying business conditions.  For
those reasons you never know within a paired test framework whether
positive changes are real or not.

Instead I proposed a simple solution.  It consists of using linear
regression and using dummy variables for the "treatment period."

Let's say you own a single shoe store.  And, all you have is three
years of monthly unit sales data.  In the most recent 3 months you
used a different marketing strategy.  And, you want to measure if it
had any effect.

The regression model would have the following variables:
1) A trend variable that captures the embedded monthly growth in unit
sales over time;
2) A qualitative variable to capture seasonality;
3) You could add macroeconomic variables (inflation, GDP growth,
unemployment) to capture change in the business environment.
4) You would have three dummy variables that represent the 1st,
2nd, and 3rd month of the "treatment" period.
5) The dependent variable would be monthly shoe unit sales.
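
A minimal sketch of how the variables listed above could be laid out and
fitted by ordinary least squares.  Everything named here (design_row,
solve, fit_treatment_effects, the synthetic sales figures) is a
hypothetical illustration, not part of the original proposal.  It
assumes the series starts in January, uses the first calendar month as
the seasonal baseline, and leaves out the optional macroeconomic
variables.  It is written in plain Python to stay self-contained; in
practice a statistics package would normally do the fitting.

```python
def design_row(t, n_months, n_treat):
    """Regressors for month index t (0-based): intercept, trend,
    11 month-of-year dummies, and one dummy per treatment month."""
    row = [1.0, float(t)]
    month = t % 12  # month 0 is the seasonal baseline (assumption)
    row += [1.0 if month == m else 0.0 for m in range(1, 12)]
    first_treat = n_months - n_treat
    row += [1.0 if t == first_treat + k else 0.0 for k in range(n_treat)]
    return row


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_treatment_effects(sales, n_treat=3):
    """Fit the regression via the normal equations and return the
    coefficients of the treatment-month dummies."""
    n = len(sales)
    X = [design_row(t, n, n_treat) for t in range(n)]
    p = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [sum(X[i][a] * sales[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    return beta[-n_treat:]


# Made-up, noise-free data: 36 months of unit sales with a linear trend,
# a seasonal pattern, and an extra lift in the last three months.
season = [0, -5, 3, 8, 10, 4, -2, -6, 1, 6, 12, 20]
true_lift = [15.0, 10.0, 5.0]
sales = [200 + 1.5 * t + season[t % 12] for t in range(36)]
for k, lift in enumerate(true_lift):
    sales[33 + k] += lift

effects = fit_treatment_effects(sales)
print([round(e, 2) for e in effects])
```

Because the made-up data contain no noise, the three fitted dummy
coefficients should match the lifts built into the series.  With real
data they would also absorb whatever noise hit those particular months,
since each dummy covers a single observation.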

You then look at the output of the regression.  The regression
coefficients (constants really) associated with the dummy variables
would represent the Effect Size(s) in the 1st, 2nd, and 3rd month of
the treatment period.  Each coefficient divided by its standard error
would give you a t stat and a related p value.  The p value is the
probability of getting a t stat at least that extreme if the true
coefficient were zero (it is not the probability that the coefficient
is equal to zero).  Also, you can readily build a 95% confidence
interval if the regression has not already done so.  The coefficient
+ 1.96(standard error) = upper bound of the Effect Size.  The
coefficient - 1.96(standard error) = lower bound of the Effect Size.
With a small sample, the t critical value for the residual degrees of
freedom is larger than 1.96 and should be used instead.
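
The arithmetic in the paragraph above can be sketched as follows.  The
function name summarize_effect and the example coefficient and standard
error are made up for illustration.  The sketch uses the normal
approximation behind the 1.96 figure, because the Python standard
library has no Student-t distribution; with few residual degrees of
freedom the true interval would be somewhat wider.

```python
from statistics import NormalDist


def summarize_effect(coef, std_err, level=0.95):
    """Test statistic, two-sided p value, and confidence interval for
    one regression coefficient, using the normal approximation."""
    stat = coef / std_err
    # Probability of a statistic at least this extreme if the true
    # coefficient were zero.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    # Critical value: about 1.96 for a 95% interval.
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return {
        "t": stat,
        "p": p_value,
        "low": coef - crit * std_err,
        "high": coef + crit * std_err,
    }


# Hypothetical numbers: a lift of 12 units with a standard error of 5.
result = summarize_effect(coef=12.0, std_err=5.0)
print(result)
```

If the interval from "low" to "high" excludes zero, the p value is
below 1 - level, and vice versa.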

Doing so would resolve a situation where you have no Controls
and essentially a sample of one (a single store).  Yet, you could
readily derive a statistical estimate of the Effect Size of your
marketing campaign.  In other words, this method measures the lift in
sales from your marketing campaign above the trend line.

Do you think the proposed methodology is appropriate?  If not, why
not?  Can you think of tweaks that could make it work?

I have five words for you: Bayesian, Bayesian, Bayesian, Bayesian,
Bayesian. This is provably the only coherent universal way to test
hypotheses. The n = 1 fallacy does not occur if dependencies are
introduced between individuals, as they should be.

illywhacker;