Re: Larkin, Power BASIC cannot be THAT good:



Jan Panteltje <pNaonStpealmtje@xxxxxxxxx> wrote:

On a sunny day (Tue, 26 May 2009 16:31:52 +0300) it happened Anssi Saari
<as@xxxxxx> wrote in <vg3tz38ourb.fsf@xxxxxxxxxxxxxxxxxxxxxxx>:

mrdarrett@xxxxxxxxx writes:

Will have to learn the SIMD instructions. I didn't think I had an
assembler that could handle even MMX, but apparently FreeBASIC can...

Interestingly, I got SSE2 code for the inner loop when compiling the C
code below with gcc 4.3.2 and -O3 -march=core2. The C code was from
John Devereux in the PowerBasic Rocks! thread, just changed a little
to print out the time per addition. Oh yeah, doesn't seem to run any
faster than "normal" code with addls and movls...

Here's the assembler output for the summing loops:

        xorl    %edx, %edx
        .p2align 4,,10
        .p2align 3
.L2:
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L3:
        movdqa  a(%eax), %xmm0
        paddd   s(%eax), %xmm0
        movdqa  %xmm0, s(%eax)
        addl    $16, %eax
        cmpl    $256000000, %eax
        jne     .L3
        incl    %edx
        cmpl    $10, %edx
        jne     .L2
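
For readers who have not used the SIMD instructions before, the inner loop above handles four 32-bit ints (16 bytes) per iteration, which is why %eax steps by 16 up to 256000000 = 64000000 * 4 bytes. A rough hand-written equivalent using the SSE2 intrinsics from <emmintrin.h> might look like the sketch below. The function name add_arrays_sse2 is made up for illustration, an x86 target with SSE2 is assumed, and the unaligned load/store variants are used so the sketch works for any pointer (gcc could emit the aligned movdqa because it knows how the global arrays are aligned).

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_add_epi32, ... */
#include <stddef.h>

/* Hypothetical helper, not from the original post: s[i] += a[i] for
 * i in [0, n), four ints at a time where possible. */
void add_arrays_sse2(int *s, const int *a, size_t n)
{
    size_t i = 0;

    /* Vector part: one 128-bit add covers four 32-bit ints. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vs = _mm_loadu_si128((const __m128i *)(s + i));
        vs = _mm_add_epi32(vs, va);              /* same operation as paddd */
        _mm_storeu_si128((__m128i *)(s + i), vs);
    }

    /* Scalar tail for the last n % 4 elements. */
    for (; i < n; i++)
        s[i] += a[i];
}
```

Calling add_arrays_sse2(s, a, SIZE) ten times would stand in for the two nested loops. Since the loop does very little arithmetic per byte moved, it would not be surprising if this ran no faster than the scalar version, which matches the observation above.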

And the C code:

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>   /* gettimeofday(), struct timeval */

#define SIZE 64000000

int s[SIZE];
int a[SIZE];

int main(int argc, char **argv)
{
    unsigned start_usecs, current_usecs, diff_usecs;
    struct timeval start_timeval, current_timeval;

    /* get start time */
    gettimeofday(&start_timeval, NULL);

    int x,y;

    /* The Loop */

    for(y=0;y<10;y++)
    {
        for(x=0;x<64000000;x++)
        {
            s[x] = s[x] + a[x];
        }
    }
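
The quoted listing stops after the loops, so the part that prints the time per addition is not shown. As a sketch only, and not the code that was actually run, the elapsed time between two gettimeofday() readings and the time per addition could be worked out roughly as follows. The helper names elapsed_usecs and ns_per_addition are invented for this example.

```c
#include <sys/time.h>   /* struct timeval */

/* Hypothetical helper: microseconds elapsed between two readings
 * taken with gettimeofday(). */
long long elapsed_usecs(const struct timeval *start,
                        const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(end->tv_usec - start->tv_usec);
}

/* Hypothetical helper: average nanoseconds per addition, given the
 * elapsed time in microseconds and the number of additions done. */
double ns_per_addition(long long usecs, long long additions)
{
    return (double)usecs * 1000.0 / (double)additions;
}
```

For the loops above the number of additions would be 10 * 64000000, so after a second gettimeofday(&current_timeval, NULL) one could print ns_per_addition(elapsed_usecs(&start_timeval, &current_timeval), 640000000LL).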


Anyone tried to unroll this loop a little?

for(x=0;x<64000000;x+=4)
{
    s[x] = s[x] + a[x];
    s[x+1] = s[x+1] + a[x+1];
    s[x+2] = s[x+2] + a[x+2];
    s[x+3] = s[x+3] + a[x+3];
}

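It could reduce the overhead of the for loop itself (the increment, compare
and branch), since those are then paid once per four additions instead of
once per addition.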
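
To check that an unrolled version really computes the same thing, it can be wrapped in a function and compared against the plain loop on a small array. This is only a sketch: the names add_plain and add_unrolled4 are made up here, and a clean-up loop is added so that lengths which are not a multiple of 4 also work (64000000 is a multiple of 4, so the loop above does not need one).

```c
#include <stddef.h>

/* Reference version: one addition per loop iteration. */
void add_plain(int *s, const int *a, size_t n)
{
    size_t x;
    for (x = 0; x < n; x++)
        s[x] = s[x] + a[x];
}

/* Unrolled by four: the increment, compare and branch are executed
 * once per four additions. */
void add_unrolled4(int *s, const int *a, size_t n)
{
    size_t x = 0;

    for (; x + 4 <= n; x += 4) {
        s[x]     = s[x]     + a[x];
        s[x + 1] = s[x + 1] + a[x + 1];
        s[x + 2] = s[x + 2] + a[x + 2];
        s[x + 3] = s[x + 3] + a[x + 3];
    }

    /* Clean-up for the remaining n % 4 elements. */
    for (; x < n; x++)
        s[x] = s[x] + a[x];
}
```

Whether this is measurably faster depends on the compiler and flags. With gcc -O3 the compiler may already unroll or vectorize the plain loop, as the SSE2 listing earlier in the thread suggests, so it is worth timing both versions.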

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
"If it doesn't fit, use a bigger hammer!"
--------------------------------------------------------------