Re: An optimization problem



On 7月2日, 下午5时37分, Niels Diepeveen <n529...@xxxxxxxxxxxx> wrote:
Wen wrote:
On 7月2日, 上午6时00分, Niels Diepeveen <n529....@xxxxxxxxxxxx> wrote:
Wen wrote:
I have N i.i.d. random variables X_i, i=1,...,N following certain
discrete distribution. We proceed as follows: initially the system
generates a realization of the N random variables according to their
probability distribution. Observing this, we choose one among them,
say X_j is chosen. Then all the random variables whose value is
smaller than or equal to X_j are eliminated from the system, and a
cost of X_j is incurred at this step. Same steps happens to the rest
of the random variables until all the random variables are eliminated
(They all follow the same distribution at each step). The objective is
to minimize the cost.
I model this problem using dynamic programming and theoretically it
can calculate the optimal solution using backward calculation.
However, with the increasing of N, the number of possible states
expands exponentially, making it impossible to enumerate all the
states. For example, if N=100 and the number of possible values X_i
can take is 10, then we are facing a state space of 10^100. Is there
an effective way to approximately solve this problem? Thanks a lot in
advance.
An exact solution is probably too much to ask for but if I understand
the problem correctly it is possible to get a good approximation.

Let us call the minimal expected cost for eliminating all of n variables
C_n. Clearly C_0 = 0 and C_1 = E(X).

Now suppose we know C_i for all i < n. Then we can express the optimum
strategy for eliminating n variables as follows:
Let x_1, x_2, ..., x_n be the outcome of the variables at the current
step. Put these in nondecreasing order: y_1 <= y_2 <= ... <= y_n..
The expected cost of eliminating the first k now and the rest later is
y_k + C_(n-k). We want to minimize that cost, so we take
min(k=1..n) (y_k + C_(n-k)).

Now it is clear that the minimal expected cost for n variables
C_n = E(min(k=1..n) (Y_k + C_(n-k))),
where Y_1, .., Y_n are the order statistics of X_1, .., X_n.

In general I don't think it's practical to calculate that expectation
exactly, but it's fairly easy to get a Monte Carlo approximation for it.

For example for a uniform distribution on [0, 1] I get
C_1 ~= 0.500
C_2 ~= 0.625
C_3 ~= 0.703
C_4 ~= 0.757
C_5 ~= 0.797
C_6 ~= 0.827
C_7 ~= 0.850
C_8 ~= 0.868
C_9 ~= 0.882
C_10 ~= 0.893

Is this what you would call an effective way?

--
Niels Diepeveen

Hi Niels,

Thanks for the tip, I will try to find some numerical results with
Monte Carlo approximation (not quite familiar with that technique but
in your solution it seems effective). Besides approximation, I am
actually trying to take the property of order statistics (since we are
considering a problem with ordered i.i.d. random variables) into
consideration. If some properties of C_i can be revealed, it may be
possible to approximate more accurately.

It's worth a try, but I don't think you will get much further without making
specific assumptions about the distribution.

Also, can you provide some more details about how you implement the
Monte Carlo approximation for the problem? My intuition is to generate
a large number of realizations and take the sample mean as the
theoretical mean value. I am not sure whether I understand it
correctly. Thank you very much!

Yes, that is what I meant. Here is the Python code I used.

----------------------------------------------------------
import random

def DISTRIBUTION():
# replace this by any function returning random numbers with
# the right distribution
return random.random()

MCITER = 1000000 # Number of iterations of Monte Carlo method
# Note that precision increases with square root
# of this number

MAXVARS = 10 # Maximum number of variables

####################################################################
C = [ 0.0 ]

for n in range(1, MAXVARS+1):
sum_exp = 0.0
for iter in range(MCITER):
sample = sorted(DISTRIBUTION() for i in range(n))
opt = min(sample[i] + C[n-i-1] for i in range(n))
sum_exp = sum_exp + opt
C.append(sum_exp / MCITER)
print 'C_%d =' % n, C[-1]
-------------------------------------------------------

Note that it takes a few minutes to run.

--
Niels Diepeveen

I notice that you are using the average value of equal weight for each
sample. It may be the case with uniform distribution, but will that
cause problems when the distribution of the random variables are not
uniform?
.



Relevant Pages

  • Re: An optimization problem
    ... specific assumptions about the distribution. ... Monte Carlo approximation for the problem? ... cause problems when the distribution of the random variables are not ...
    (sci.math)
  • Re: An optimization problem
    ... generates a realization of the N random variables according to their ... probability distribution. ... Let us call the minimal expected cost for eliminating all of n variables ... but it's fairly easy to get a Monte Carlo approximation for it. ...
    (sci.math)
  • Re: How random is a random variable
    ... underlying distribution or as a realization of the underlying ... distribution" and "realization of the underlying distribution" actually ... uniform distribution; but if X_1 and X_2 are two independent uniform r.v.s ... of random variables (eg. equality in distribution, almost sure equality, ...
    (sci.stat.math)
  • Re: How random is a random variable
    ... equality between random variables are defined. ... distribution or is it merely a uniform distribution with a different ... specific realization or as a function of all possible realizations. ... Notation is about suitable short cuts and will ...
    (sci.stat.math)
  • Re: Analysis Tool Pack: Random function help
    ... In Excel Help. ... Click the distribution method you want to use to create random values. ... Characterized by a probability of success on a given trial. ... Bernoulli random variables have the value 0 or 1. ...
    (microsoft.public.excel.programming)