Re: where do so many tenses come from?



"Joachim" == Joachim Pense <spam-collector@xxxxxxxxxxxxxxx> writes:

Joachim> Information theory gives the standard definition of
Joachim> "information". The information of a sign (in this case a
Joachim> phoneme) within a sequence of signs is the negative
Joachim> dyadic logarithm of the probability that this sign occurs
Joachim> in this position.
>> Well... you have to get these probabilities first.

Joachim> It's an abstract thing. But you can approximate
Joachim> it. That's what programs like "winzip" do.

No. General-purpose compressors like WinZip do not approximate.
Nowadays, they usually use the Deflate algorithm, which combines a
variant of LZ77 with Huffman coding. It doesn't approximate. It looks
for recurring substrings, and then stores the distance back to and the
length of each one. These are then Huffman encoded, along with the
literal bytes. The Huffman coder counts the frequencies, instead of
approximating them. (Better compression is obtained by counting than
by approximating.)


Joachim> So if there are less signs to choose from, then for each
Joachim> the probability to occur in any given position is higher
Joachim> (on average), so the information is less.
>> On average? Does this average translate to an assumption that
>> the probabilities are more or less EVENLY DISTRIBUTED among
>> every sign?

Joachim> OF COURSE NOT! If I say that all men are - on average -
Joachim> 180 cm high, then I don't assume that they all have the
Joachim> same size.

But phoneme frequencies are not normally distributed. So your 180 cm
height example isn't applicable.


--
Lee Sau Dan 李守敦 ~{@nJX6X~}

E-mail: danlee@xxxxxxxxxxxxxxxxxxxxxxxxxx
Home page: http://www.informatik.uni-freiburg.de/~danlee