Re: More of an Algorithems question



quasi <quasi@xxxxxxxx> wrote in
news:t7ipn118hl17f4p3sbtetcnsqv7jj250f3@xxxxxxx:

> On Thu, 17 Nov 2005 13:13:06 -0500, quasi <quasi@xxxxxxxx> wrote:
>
>>On Thu, 17 Nov 2005 17:59:28 +0000 (UTC), Dave Seaman
>><dseaman@xxxxxxxxxxxx> wrote:
>>
>>>On Thu, 17 Nov 2005 12:51:17 -0500, quasi wrote:
>>>> On Thu, 17 Nov 2005 17:21:45 +0000 (UTC), Dave Seaman
>>>><dseaman@xxxxxxxxxxxx> wrote:
>>>
>>>>>On Thu, 17 Nov 2005 12:15:12 -0500, quasi wrote:
>>>>>> On Thu, 17 Nov 2005 15:07:58 GMT, cri@xxxxxxxx (Richard Harter)
>>>>>> wrote:
>>>>>
>>>>>>>On 15 Nov 2005 14:55:51 -0800, "Filter"
>>>>>>><filtermedialtd@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>Hi,
>>>>>>>>
>>>>>>>>How can I proove that no Algorithm can compress every file of
>>>>>>>>length 10^6?
>>>>>>>
>>>>>>>However there is an algorithm that will losslessly compress every
>>>>>>>file of length 10^6 except one.
>>>>>>>
>>>>>
>>>>>> I don't believe that.
>>>>>
>>>>>> The pigeonhole principle applies easily to invalidate such a
>>>>>> claim ...
>>>>>
>>>>>> Let n=10^6.
>>>>>
>>>>>> There are 2^n possible files with n bits, but only 2^(n-1) files
>>>>>> with n-1 bits. Hence, at most half the files can be compressed.
>>>>>
>>>>>There are 2^n-1 files with fewer than n bits. Hence, the claim
>>>>>stands.
>>>
>>>> Not by my count.
>>>
>>>> For example, 16 possible files with 4 bits but only 8 possible
>>>> files with 3 bits, no?
>>>
>>>Are there any other files with fewer than 4 bits that you have
>>>overlooked in your count?
>>
>>Ok, I see my confusion.
>>
>>I failed to take into account files with less than (n-1) bits.
>>
>>I get it now -- thanks.
>>
>>quasi
>
> Ok, but then along the lines of what I was thinking, the following
> claim is valid:
>
> In any lossless compression scheme, less than half of the files with n
> or fewer bits can be compressed.
>
> quasi

Isn't "less than half" even stronger as "none"? A 1-bit file cannot be
compressed, since the only smaller file (0 bits) is needed to
represent itself. Then a 2-bit file cannot be compressed, since all of
the 0- and 1-bit files are needed to represent themselves. Induction.

By the way, the "except for 1" applies only with binary representation.
In general base b, there are (b^n-1)/(b-1) number with fewer than n
base b digits, not only losing the "-1" in (b^n-1) but also dividing
by (b-1). In base 10, only 1 of the 1-digit numbers can be compressed,
leaving 9 orphans (or, pigeons without unoccupied pigeon holes).

Lynn Killingbeck
.



Relevant Pages

  • Re: More of an Algorithems question
    ... >>>Ok, I see my confusion. ... >> In any lossless compression scheme, less than half of the files with n ... Yes, I saw that the two 1-bit files were a problem as far compression, ...
    (sci.math)
  • Re: Inneresting gas-saving factoid
    ... That's how the compression release worked on my old Yamaha enduro, ... So Jon is also saying that *leaving* the valves open the piston will still ... Mae West (yer fav Congressman) to the Gangster: ...
    (rec.crafts.metalworking)