2 peculiar problems of data analysis



Hi All:

This is my first post here. I have two questions for the stats savants here regarding data analysis. I have some background in probability and statistics, but these have me stumped so far (particularly the second).

First problem:

Suppose we have some random variable (call it "X") that can be sampled from some probability density function. For concreteness, let's say that the pdf is Gaussian, with mean mu and standard deviation sigma. Let's say we have generated n independent samples from this pdf. Call them x_i, where i runs from 1 to n. I know that the standard estimates for mu and sigma given these data are

mu^ = (1/n) sum x_i

and

sigma^ = sqrt[ (1/(n-1)) sum (x_i - mu^)^2 ],

respectively. Now suppose that the samples x_i are subject to some error, e.g. they are generated by experiment. Let the i^th sample have an error bar of magnitude e_i. The e_i may all be different. What are the best estimates for mu and sigma, analogous to the formulae above, which take into account these experimental errors e_i?
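For reference, the two unweighted formulae above can be sketched in Python as follows. This is only a minimal illustration: the function name and the sample values are made up, and it deliberately ignores the error bars e_i, which is exactly the part I am asking about.

```python
import math

def sample_mean_and_std(xs):
    """Return (mu_hat, sigma_hat) for a list of samples, ignoring error bars."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples for the n-1 denominator")
    # mu^ = (1/n) sum x_i
    mu_hat = sum(xs) / n
    # sigma^ = sqrt[ (1/(n-1)) sum (x_i - mu^)^2 ]
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xs) / (n - 1))
    return mu_hat, sigma_hat

# Made-up sample values, for illustration only.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma_hat = sample_mean_and_std(samples)
```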


Second problem:

This one is harder, and may, in fact, not have a definite solution.

I have an unknown function G(k). Due to the physics of the problem (which I won't go into), I have reason to believe that the function has a random component, i.e. it is of the form:

G(k) = g(k) + R(k),

where g(k) is some well-behaved, non-random function, and R(k) is a random variable which satisfies

< R(k) > = 0 and < R(k)^2 > = sigma(k)^2.

Here "< ... >" means "average of ...", and sigma(k) is some well-behaved function of k. My goal is to estimate the functions g(k) and sigma(k). My data are a sequence of n measured points (k_i, G_i). The k_i may all be distinct, or some values of k may be repeated.
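To make the setup concrete, here is a minimal Python sketch that generates data of this form. Everything specific in it is an assumption for illustration: the functions g_true and sigma_true are hypothetical stand-ins (the real g and sigma are unknown), and R(k) is drawn from a Gaussian, although the problem only fixes its mean and variance.

```python
import random

def g_true(k):
    # Hypothetical smooth deterministic part; not the real g(k).
    return 1.0 + 0.5 * k

def sigma_true(k):
    # Hypothetical smooth noise scale; kept positive on the range used here.
    return 0.1 + 0.05 * k

def simulate(ks, rng):
    """Return a list of (k_i, G_i) with G_i = g(k_i) + R(k_i).

    R(k) is assumed Gaussian with mean 0 and standard deviation sigma(k).
    """
    return [(k, g_true(k) + rng.gauss(0.0, sigma_true(k))) for k in ks]

rng = random.Random(0)  # fixed seed so the example is repeatable
ks = [0.0, 0.5, 0.5, 1.0, 1.5, 2.0, 2.0, 2.0]  # some k values repeat
data = simulate(ks, rng)
```

Any proposed estimator for g(k) and sigma(k) could be checked against synthetic data like this, where the true functions are known.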

How can I estimate g(k) and sigma(k) given these data?

I realize that there may be an infinite number of solutions for g(k) and sigma(k), since they are continuous, whereas my data are discrete. The physics of the problem suggests that these functions should be very well-behaved, i.e. they shouldn't "wiggle" too much. It may therefore be useful to impose some kind of auxiliary "entropy condition" which requires that the total variation (or something like that) of g(k) and sigma(k) be minimized. That is as far as I've gotten in my thinking.
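As a rough illustration of what I mean by "total variation", here is one discrete version in Python. It is only my guess at how the idea would look for function values tabulated on an ordered grid of k; the function name and the two test sequences are made up.

```python
def total_variation(values):
    """Sum of |v[i+1] - v[i]| over function values on an ordered grid."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

# Two made-up candidates with the same range of values.
smooth = [0.0, 1.0, 2.0, 3.0]
wiggly = [0.0, 3.0, 0.0, 3.0]

tv_smooth = total_variation(smooth)
tv_wiggly = total_variation(wiggly)
```

A candidate that wiggles more gets a larger value, so among candidates that fit the data comparably well, the condition would prefer the one with the smaller total variation.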

Having solved the above, my next question would be: Suppose each of the G_i is subject to an experimental error e_i. How can we take that into account?

Thanks very much in advance for any and all suggestions.