Re: The intensity of a 2 dimensional poisson pattern from limited observational data



On Aug 28, 1:14 pm, peterp...@xxxxxxxxxxx wrote:

I can't promise I can help. Call me stupid, but I am not entirely sure
about your initial problem. You wrote: "Those observations are a set
of locations of pairs of trees relative to randomly chosen points,
with the random points being far apart relative to the proximity of
the two trees to each point. That is, I know the x and y coordinates of
2 trees and a
nearby random point for a large number of well-spaced random points
throughout a large area. Therefore at each random point I know the
distance from the point to tree 1, from the point to tree 2, and the
distance between the two trees."
So you picked for example n random points in the forest, as you said,
more or less in a grid-layout, and observed the relative position of
two trees from this point. I suppose you have the coordinates of the
random points in your data. You go on about distances and ranked
distances, but I suppose that you can reconstruct the absolute
coordinates of the trees that constitute the pairs. Now, a question
that was asked, that is crucial, and that you have forgotten to answer
I think, is how the tree pairs are chosen. Are the tree pairs simply
the two closest trees from each random point, or two trees that you
have picked randomly from the (unknown) total number of trees in the
forest? Are there trees that get repeated for several random points?
Am I right in that you observe one and only one tree pair per random
point? Possibly a small sample of your data might help to understand
your problem...

Not stupid in the least. I have the coordinates for the two trees and
the "random" point to which they correspond, which means I know 3
distances for each pair of trees and their associated point (point to
tree 1, point to tree 2, and tree 1 to tree 2). The points are far
apart from each other (0.8 km) relative to the distance from each point
to its two trees (on the order of 1-50 meters), so each tree is
associated with one and only one random point--
no repeated sampling. They are not necessarily the two closest trees
to the point, although in some cases they likely will be. The tree
pairs are chosen "randomly" in the sense that one does not know their
ranked distance (g) from their associated "random" point. For
example, they might be the 1st and 2nd closest, or the 2nd and 4th, or
the 1st and 8th, etc., (up to a maximum of g = 20th closest).
Procedurally, this was done by surveyors in the 19th century who
located two trees close to a survey point, and recorded the exact
distance and bearing from the point to each tree, but not the value of
g. (Perhaps "chosen randomly" is not the best term. The important
concept here is that each tree is some ranked distance from its
associated point, but I don't know what it is). My goal is therefore
to obtain a probabilistic estimate of this rank, for each tree in the
dataset, because if I know the ranked distance and the actual
distance, I can calculate tree density at each point (my overall
goal), based on a 1957 formula by Morisita.
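
To illustrate why knowing the rank matters, here is a minimal sketch of a rank-based density estimate. It assumes the textbook unbiased estimator for a homogeneous Poisson pattern, (g - 1) / (pi * r^2), where r is the distance to the g-th closest tree. The exact form of Morisita's formula may differ (for example in how sectors around the point are handled), and the function names are made up for illustration.

```python
import math
import random


def density_from_rank(g, r):
    """Density estimate (trees per unit area) from the distance r to the
    g-th closest tree, ASSUMING a homogeneous Poisson pattern.
    The estimator (g - 1) / (pi * r^2) needs g >= 2."""
    if g < 2:
        raise ValueError("rank g must be at least 2 for this estimator")
    if r <= 0:
        raise ValueError("distance r must be positive")
    return (g - 1) / (math.pi * r * r)


def simulate_rank_distance(g, lam, rng):
    """Draw the distance to the g-th closest tree for true density lam.
    For a Poisson pattern, pi * lam * r_g^2 is a sum of g independent
    unit-rate exponential variables."""
    t = sum(rng.expovariate(1.0) for _ in range(g))
    return math.sqrt(t / (math.pi * lam))


# Usage: average the estimate over many simulated survey points.
rng = random.Random(42)
true_density = 0.02  # trees per square meter (hypothetical value)
rank = 5             # pretend the rank is known
estimates = [
    density_from_rank(rank, simulate_rank_distance(rank, true_density, rng))
    for _ in range(10000)
]
print(sum(estimates) / len(estimates))  # should be close to true_density
```

With the rank unknown, the same distance r gives very different estimates for g = 2 versus g = 20, which is why a probabilistic estimate of g is needed first.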

I have approached this by (1) assuming that the trees around each
point are randomly arranged (a fair biological assumption, although
the absolute density around each point will vary from point to point),
and (2) searching for a statistic that will discriminate maximally
between the collection of distributions that arise, and also be
insensitive to actual density variation between points. Both are
requirements for the use of Morisita's formula. There are 190
distributions per statistic, one for each of the 190 (= 20 choose 2)
pairs of distinct ranks when limiting the max rank to 20th closest. So
far I have calculated 5 different statistics (all ratios) meeting the
above criteria, all via simulation:
  1. distance to tree 1 / distance to tree 2
  2. intertree distance / distance to tree 1
  3. intertree distance / distance to tree 2
  4. mean / s.d. of the 3 distances
  5. mean / skew of the 3 distances
Now I have to figure out which of the 5 statistics discriminates
maximally among its 190 distributions (thus my other recent post), that
is, to pick the statistic that is maximally diagnostic. I've used
simulation because I currently
have no hope of figuring out analytical solutions for these nearly
1000 distributions, if they're even possible. Nevertheless I'd prefer
to go that route if I can.
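
A minimal sketch of the kind of simulation described above, not the actual code used. It assumes a homogeneous Poisson pattern around each survey point, and draws the closest trees directly: pi * density * r^2 grows in unit-exponential steps, and bearings are uniform. Only the three distance ratios are shown; all function names are hypothetical.

```python
import itertools
import math
import random

MAX_RANK = 20  # largest rank considered, as in the post


def rank_pairs(max_rank=MAX_RANK):
    """All unordered pairs of distinct ranks (i < j)."""
    return list(itertools.combinations(range(1, max_rank + 1), 2))


def nearest_trees(lam, n_trees, rng):
    """(x, y) of the n_trees closest trees to a survey point at the origin,
    ordered by distance, for a Poisson pattern with density lam."""
    trees = []
    t = 0.0
    for _ in range(n_trees):
        t += rng.expovariate(1.0)            # next value of pi * lam * r^2
        r = math.sqrt(t / (math.pi * lam))
        theta = rng.uniform(0.0, 2.0 * math.pi)  # bearing from the point
        trees.append((r * math.cos(theta), r * math.sin(theta)))
    return trees


def ratio_statistics(tree_a, tree_b):
    """The three distance ratios for one pair of trees."""
    d1 = math.hypot(tree_a[0], tree_a[1])    # point to tree 1
    d2 = math.hypot(tree_b[0], tree_b[1])    # point to tree 2
    d12 = math.hypot(tree_a[0] - tree_b[0], tree_a[1] - tree_b[1])
    return {"d1/d2": d1 / d2, "d12/d1": d12 / d1, "d12/d2": d12 / d2}


def simulate_statistic(i, j, lam, n_points, rng, key="d1/d2"):
    """Simulated values of one ratio when the recorded trees are the
    i-th and j-th closest (i < j) at each of n_points survey points."""
    values = []
    for _ in range(n_points):
        trees = nearest_trees(lam, j, rng)
        values.append(ratio_statistics(trees[i - 1], trees[j - 1])[key])
    return values


# Usage: one simulated distribution per rank pair and statistic.
print(len(rank_pairs()))  # number of rank pairs for MAX_RANK = 20
sparse = simulate_statistic(2, 5, lam=0.001, n_points=1000, rng=random.Random(1))
dense = simulate_statistic(2, 5, lam=0.05, n_points=1000, rng=random.Random(1))
print(max(abs(a - b) for a, b in zip(sparse, dense)))  # essentially zero
```

Changing the density only rescales all distances by a common factor, so with the same random seed the ratios come out the same (up to rounding) for sparse and dense stands. That is the insensitivity to density that the statistics are required to have.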

Hope that helps.
