Re: Larkin, Power BASIC cannot be THAT good:
- From: Martin Brown <|||newspam|||@nezumi.demon.co.uk>
- Date: Thu, 21 May 2009 08:25:20 +0100
John Larkin wrote:
On Wed, 20 May 2009 13:07:29 -0700 (PDT), mrdarrett@xxxxxxxxx wrote:
On May 15, 11:27 am, John Larkin
<jjlar...@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
On Fri, 15 May 2009 14:34:51 GMT, Jan Panteltje
<pNaonStpealm...@xxxxxxxxx> wrote:
The run time in C is 13 seconds here on a 1GHz processor.
Can you specify your 'old HP computer' ?
I can win maybe 1 second by writing the code a bit different.
And a 3GHz would do it in 12 / 3 = 4 seconds...
That is about right for a non-optimising naive native code compiler that saves every loop variable back to memory. When fully optimised to be all in registers the figure should come down to 2.2s or so.
A bigger cache would help a bit perhaps.
Here's my PowerBasic code:
A Cray would be even better.
What does your C code look like? Mine is in the other posting.
Else you goofed a factor 10.
Seems to me anyways :-)
===================================================
#COMPILE EXE
' SUM.BAS
' TRY SUMMING A LOT OF INTS INTO AN ARRAY OF LONGS...
' JL MAY 14, 2009 PBCC4
FUNCTION PBMAIN () AS LONG
COLOR 15,9
CLS
DIM A(64000000) AS INTEGER ' INPUT ADC SAMPLES
DIM S(64000000) AS LONG ' SUMMING ARRAY
There is a small difference between the BASIC and the C. You are using signed INTEGERs; he is using unsigned. It shouldn't affect the runtime, though, provided that the C compiler generates the right opcodes.
On my computer, a 1.9 GHz Xeon with 2G ram, winXP, I get this
result...
Start... Done
Time per loop 0.231 sec 3.61 ns/add
That is believable. Most compilers get between 0.22 and 0.3 depending on how fast the memory subsystem is under sustained sequential access.
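For what it's worth, the arithmetic behind those figures can be checked with a few lines of C. This is only a sketch: it assumes 64,000,000 adds per loop (the DIM bound in the listing), that a PowerBASIC INTEGER is 16 bits and a LONG is 32 bits, and the function names are mine rather than anything from either program.

```c
#include <stdio.h>

/* Nanoseconds per add, given seconds per pass and adds per pass. */
double ns_per_add(double seconds_per_loop, double adds_per_loop)
{
    return seconds_per_loop / adds_per_loop * 1e9;
}

/* Sustained memory traffic in GB/s. Assumption: each add reads one
   2-byte sample, reads one 4-byte sum and writes that 4-byte sum back. */
double traffic_gb_per_s(double seconds_per_loop, double adds_per_loop)
{
    const double bytes_per_add = 2.0 + 4.0 + 4.0;
    return adds_per_loop * bytes_per_add / seconds_per_loop / 1e9;
}

/* Print both figures for one measurement. */
void print_figures(double seconds_per_loop, double adds_per_loop)
{
    printf("%.2f ns/add, %.2f GB/s\n",
           ns_per_add(seconds_per_loop, adds_per_loop),
           traffic_gb_per_s(seconds_per_loop, adds_per_loop));
}
```

Called as print_figures(0.231, 64e6), this reproduces the 3.61 ns/add above and puts the memory traffic at a little under 2.8 GB/s, under those assumptions.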
One of my guys did a C version (I refuse to program in C) to run on
the Kontron under Linux, a slightly slower CPU, 2G ram. I asked him
for his source code, and he spent about a half hour cleaning it up to
be presentable... which I asked him NOT to do. Anyhow, here it is:
Have you tested his code on your PC and your code on his embedded system? It could be that DMA transfer of raw data is robbing him of memory bandwidth. Or the Kontron board has other memory speed issues.
for (multiply = 0; multiply < 10; multiply ++) // 10 x
{
for ( index = 0; index < DATA_ARRAY_SIZE; ++index )
sum_data[index] += inbound_data[index];
}
It is difficult to see how even the dumbest compiler could get this to take more than 0.5s per loop on modern hardware. Be interesting to see the generated code for this loop. If it looks sensible then we can establish that you are looking at a hardware problem.
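To look at the generated code, the loop can be isolated in a small function of its own. The sketch below is my guess at the shape of it, not his actual source: the element types (16-bit unsigned samples, 32-bit unsigned sums) and the function name accumulate are assumptions, and the array length and repeat count are passed in rather than taken from DATA_ARRAY_SIZE.

```c
#include <stddef.h>
#include <stdint.h>

/* Add every input sample into the matching sum, `repeats` times over.
   Assumed types: uint16_t samples, uint32_t sums. */
void accumulate(uint32_t *sum_data, const uint16_t *inbound_data,
                size_t n, int repeats)
{
    for (int multiply = 0; multiply < repeats; multiply++) {
        for (size_t index = 0; index < n; ++index)
            sum_data[index] += inbound_data[index];
    }
}
```

Compiling this file on its own with something like gcc -O2 -S should leave an assembly listing in which the inner loop is easy to find.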
My program is prettier.
Only to boneheaded BASIC hackers.
From what little I remember about C pointers, I believe using
subscripts in this case is identical to using pointers. But it's been
a while, so I could be wrong.
The assembly code produced by the C program is only five opcodes, and
appears to be about as smart as it can be. The only improvement I can
suggest is to count down, not up, so a simple test for zero can end
the loop.
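A count-down version of the inner loop might look like the sketch below. The name sum_pass_down and the element types are assumptions. An optimising compiler may well make this transformation by itself, and this form walks the arrays from the top down, so whether it gains anything would have to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Same summation, with the index counting down so that the loop
   ends on a comparison against zero. */
void sum_pass_down(uint32_t *sum_data, const uint16_t *inbound_data, size_t n)
{
    for (size_t index = n; index-- > 0; )
        sum_data[index] += inbound_data[index];
}
```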
It is the memory subsystem that isn't performing. You could add additional computation to the loop and it should not affect the timing.
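One way to test that is to time the plain loop against one that does some extra arithmetic per element. The following is a minimal sketch, assuming the same element types as before. The extra work (a multiply, a shift and an XOR) is arbitrary and changes the sums, so it is only good for timing. All the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef void (*pass_fn)(uint32_t *, const uint16_t *, size_t);

/* The plain summing pass. */
void sum_pass_plain(uint32_t *sum_data, const uint16_t *inbound_data, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        sum_data[i] += inbound_data[i];
}

/* Same memory traffic, plus some arbitrary extra ALU work per element. */
void sum_pass_extra(uint32_t *sum_data, const uint16_t *inbound_data, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        uint32_t x = inbound_data[i];
        sum_data[i] += (x * 3u) ^ (x >> 2);
    }
}

/* CPU seconds taken by one call of `pass` over n elements. */
double time_pass(pass_fn pass, uint32_t *sum_data,
                 const uint16_t *inbound_data, size_t n)
{
    clock_t start = clock();
    pass(sum_data, inbound_data, n);
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

If the loop really is memory bound, the two timings should come out close together once the arrays are much larger than the cache. With arrays small enough to fit in cache I would expect the extra arithmetic to show.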
Regards,
Martin Brown