RE: Article: Model Selection and the Molecular Clock

"Robert Karl Stonjek" rstonjek@xxxxxxxxxxxxxx wrote:-

Copyright: © 2006 Oliver G. Pybus.
There are no mathematical equations in On the Origin of Species. A good
thing too, you might think, and it is undoubtedly true that Darwin's clear
and flowing narrative style helped ensure the popularity of his writings.

JE:-
The words supply meaning. What the numbers provide is a measure of that
meaning. Science requires meaning to be objective, i.e. able to be measured
(not just hand waving), so that it can be tested against nature. Thus
mathematics must work with, and not against, _meaning_. This requires
non-ambiguity, which in turn requires at least one self-exclusive assumption
to be allowed within the meaning. This will have the format: if you suppose
A then you cannot suppose B, because they contradict each other. If no
contradiction exists within the meaning provided then "anything goes",
simply because no meaning was actually provided.


Modern research in evolutionary biology can make for less easy reading.
Much of it concerns the development of an expanding arsenal of mathematical
and statistical techniques, necessary to do battle with the relentless
onslaught of gene and genome sequences. Of course, the discrete, ordered
nature of genetic information and the stochastic character of Mendelian
inheritance have naturally lent themselves to numerical analysis.

JE:-
Yes, but the cost has been extraordinary: the deletion of almost all
epigenetic heritability. It is this _massive_ deletion that has allowed
misused heuristic models of gene centricity to dominate fertile-organism-
centric Neo-Darwinism.

Consequently, the mathematical foundations of evolutionary genetics have,
somewhat unusually for biology, tended to precede the data to which they
are applied.

JE:-
Simplified/oversimplified models allow "the mathematical foundations of
evolutionary genetics" to PRECEDE "the data to which they are applied" only
because models are fitted up to the facts and not tested to refutation by
them (see below).

The Genetical Theory of Natural Selection by R. A. Fisher, published only
fifty years after Darwin's death, is full of equations.

JE:-
Yes, but almost none of them allow significant heritable epistasis
(heritable non-additive genetic associations), when in nature the fitness
of every single gene observed remains epistatic (non-additive to at least
one other within the same genome) no matter how you define fitness.


The simplest weapon in the armoury of evolutionary genetics is genetic
distance, a measure of the number of evolutionary changes between sequences
from different organisms. Genetic distances can be calculated for a pair of
sequences by simply counting the number of nucleotides or amino acids that
differ between them. Unfortunately, this approach underestimates the amount
of evolutionary change because it does not account for the fact that each
site may change more than once during evolutionary history.

JE:-
Yes, somewhat similar to the problem of a double crossover during meiosis.

Statistical tools, called nucleotide or amino acid substitution models, are
therefore used to estimate genetic distances between sequences. There is a
bewildering hierarchy of substitution models available, each making a
different and specific set of assumptions about the evolutionary process of
sequence change [2]. The simplest models assume that all types of mutation
are equivalent and that all sites in a sequence change at the same rate.
More complex models loosen these assumptions, allowing heterogeneity in the
process of sequence change, but they can be reliably applied to larger
datasets only. The task of deciding amongst these competing models is known
as statistical model selection and can be thought of as a trade-off between
model accuracy and model complexity.
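
To make the multiple-change problem concrete, here is a minimal Python sketch. It compares the raw proportion of differing sites with a distance corrected under the Jukes-Cantor (JC69) model, which is one example of the "simplest models" described above: all substitution types equivalent and all sites changing at the same rate. The article does not name a specific model, so the choice of JC69, the function names and the two toy sequences are assumptions made purely for illustration.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jc69_distance(seq_a, seq_b):
    """Expected substitutions per site under the Jukes-Cantor model.

    Uses d = -(3/4) * ln(1 - (4/3) * p), where p is the p-distance.
    The correction is undefined once p reaches 3/4 (saturation).
    """
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Toy nucleotide sequences (hypothetical, not real data).
seq_1 = "ACGTACGTAC"
seq_2 = "ACGTACGTAA"

raw = p_distance(seq_1, seq_2)
corrected = jc69_distance(seq_1, seq_2)
print(f"p-distance: {raw:.4f}, JC69 distance: {corrected:.4f}")
```

The corrected distance is never smaller than the raw count, and the gap widens as sequences diverge, which is the underestimation the article refers to.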

JE:-
Accuracy and complexity are always a trade-off because they are relative
opposite propositions, like "up" and "down". The role of a theory is to set
a limit to complexity. This means that theory, by necessity, is mostly
words. OTOH simplified/oversimplified models of that theory sacrifice
complexity for accuracy. The more of the theory you delete the more accurate
the model, but the less that model represents the theory it was
simplified/oversimplified from. Therefore, also by necessity, models are
mostly mathematical constructs. The BIG problem is that models built by
specialist mathematicians can become, and have become, divorced from their
parent theory, contesting it and finally replacing it. It should be obvious
that such an event represents an absurdity. However, such absurdities remain
common within gene-centric Neo-Darwinism, which has almost entirely divorced
itself from the only empirically refutable parent theory that it has:
fertile-organism-centric Darwinian evolution by natural selection.

The degree to which a model fits the data at hand (accuracy) is always
improved by adding more parameters (complexity), but since the amount of
data remains constant the statistical uncertainty about each parameter
increases. In addition, the biological meaning of each parameter becomes
harder to decipher so the explanatory power of the model decreases
(Figure 1). Thus the chosen model should have enough parameters to
adequately explain the data - but no more. Once an appropriate model is
chosen, genetic distances are combined using other statistical techniques
to generate a phylogenetic tree of the sequences being studied. The lengths
of the branches in the phylogeny thus represent estimated numbers of
sequence changes (Figure 2A).
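
One common way to put a number on this trade-off is the Akaike Information Criterion, AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximised log-likelihood. The article does not say which criterion it has in mind, so AIC is an assumption here. The log-likelihood values below are invented for illustration and are not results from any dataset; the parameter counts cover only the substitution-model parameters, on the assumption that branch lengths are shared by all candidates.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Return (best_name, scores) for a dict of name -> (log_likelihood, n_params)."""
    scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits: log-likelihood always improves as parameters are added,
# but the improvement from HKY85 to GTR is too small to justify 4 more parameters.
candidates = {
    "JC69": (-2050.0, 0),   # no free substitution parameters
    "HKY85": (-2010.0, 4),  # 1 transition/transversion ratio + 3 base frequencies
    "GTR": (-2008.5, 8),    # 5 relative rates + 3 base frequencies
}

best, scores = select_by_aic(candidates)
for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("selected:", best)
```

With these made-up numbers the middle model is selected: the most complex model fits best in raw likelihood terms, yet its extra parameters cost more than they return.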

JE:-
Since the ideal model should only have "enough parameters to adequately
explain the data - but no more", then the chosen data set (data selected to
demonstrate a particular part of a more complex theory) must adequately
constrain the model to the facts. IOW, models are only a valid demonstration
tool fitted up to the facts but are NOT a testing tool. A model has to be
well chosen and not be misused if it is to adequately represent and not just
misrepresent the parent theory it was simplified/oversimplified from.

Regards,

John Edser
Independent Researcher

edser@xxxxxxx;au