Re: Larkin, Power BASIC cannot be THAT good:



On a sunny day (Tue, 26 May 2009 16:31:52 +0300) it happened Anssi Saari
<as@xxxxxx> wrote in <vg3tz38ourb.fsf@xxxxxxxxxxxxxxxxxxxxxxx>:

mrdarrett@xxxxxxxxx writes:

Will have to learn the SIMD instructions. I didn't think I had an
assembler that could handle even MMX, but apparently FreeBASIC can...

Interestingly, I got SSE2 code for the inner loop when compiling the C
code below with gcc 4.3.2 and -O3 -march=core2. The C code was from
John Devereux in the PowerBasic Rocks! thread, just changed a little
to print out the time per addition. Oh yeah, doesn't seem to run any
faster than "normal" code with addls and movls...

Here's the assembler output for the summing loops:

xorl %edx, %edx
.p2align 4,,10
.p2align 3
.L2:
xorl %eax, %eax
.p2align 4,,10
.p2align 3
.L3:
movdqa a(%eax), %xmm0
paddd s(%eax), %xmm0
movdqa %xmm0, s(%eax)
addl $16, %eax
cmpl $256000000, %eax
jne .L3
incl %edx
cmpl $10, %edx
jne .L2
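
For anyone who doesn't read AT&T syntax: the three SSE2 instructions in .L3 handle four 32-bit ints per pass, which is why %eax steps by 16 bytes. Written by hand with the SSE2 intrinsics from <emmintrin.h>, the inner loop would look roughly like the sketch below. The function name add_arrays_sse2 and its parameters are made up for illustration; gcc generated the code above on its own from the plain C loop. The sketch assumes 16-byte aligned arrays and a length that is a multiple of 4.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_add_epi32, ... */
#include <stddef.h>

/* Hypothetical helper: s[i] += a[i], four 32-bit ints at a time.
   Assumes n is a multiple of 4 and both arrays are 16-byte aligned,
   which the aligned load/store (movdqa) requires. */
void add_arrays_sse2(int *s, const int *a, size_t n)
{
    size_t i;

    assert(n % 4 == 0);
    for (i = 0; i < n; i += 4) {
        __m128i va = _mm_load_si128((const __m128i *)(a + i)); /* movdqa a(%eax), %xmm0 */
        __m128i vs = _mm_load_si128((const __m128i *)(s + i)); /* folded into paddd's memory operand */
        vs = _mm_add_epi32(va, vs);                            /* paddd s(%eax), %xmm0 */
        _mm_store_si128((__m128i *)(s + i), vs);               /* movdqa %xmm0, s(%eax) */
    }
}
```

Called as add_arrays_sse2(s, a, SIZE) this should do the same work as one pass of the inner loop, provided s and a are 16-byte aligned.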

And the C code:

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>   /* gettimeofday, struct timeval */

#define SIZE 64000000

int s[SIZE];
int a[SIZE];

int main(int argc, char **argv)
{
    unsigned start_usecs, current_usecs, diff_usecs;
    struct timeval start_timeval, current_timeval;

    /* get start time */
    gettimeofday(&start_timeval, NULL);

    int x,y;

    /* The Loop */

    for(y=0;y<10;y++)
    {
        for(x=0;x<64000000;x++)
        {
            s[x] = s[x] + a[x];
        }
    }

    /* get elapsed time */
    gettimeofday(&current_timeval, NULL);

    /* calculate the difference */
    current_usecs = current_timeval.tv_usec + (1000000 * current_timeval.tv_sec);
    start_usecs = start_timeval.tv_usec + (1000000 * start_timeval.tv_sec);
    diff_usecs = current_usecs - start_usecs;

    /* diff_usecs is unsigned, so print it with %u */
    fprintf(stderr, "Time used is %u us (%.4f s).\n", diff_usecs,
            (float)diff_usecs / 1000000.0);

    /* 10 passes * 64000000 adds = 640000000 adds; us / 640000 gives ns per add */
    fprintf(stderr, "Time used per add is %.4f ns.\n", (float)diff_usecs / 640000.0);
    fprintf(stderr, "Ready\n");
    exit(0);
}
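
A side note on the time calculation: 1000000 * tv_sec does not fit in 32 bits for present-day epoch values, so where unsigned is 32 bits the code above gets the right difference only because unsigned arithmetic wraps around (and only for runs shorter than about 4295 s). A minimal sketch that subtracts first and multiplies afterwards avoids leaning on that. The helper name elapsed_usecs is invented here and is not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>   /* struct timeval */

/* Hypothetical helper: microseconds elapsed from *start to *end.
   The seconds and microseconds are subtracted separately, so the
   intermediate values stay small before scaling to microseconds. */
long long elapsed_usecs(const struct timeval *start, const struct timeval *end)
{
    long long secs, usecs;

    assert(start != NULL && end != NULL);
    secs  = (long long)end->tv_sec  - (long long)start->tv_sec;
    usecs = (long long)end->tv_usec - (long long)start->tv_usec;
    return secs * 1000000LL + usecs;
}
```

In the program above this would be called as elapsed_usecs(&start_timeval, &current_timeval), with the result printed using %lld.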


Actually the largest part of that original C code is from me :-)
You can see that from the use of gettimeofday and the subtraction in us.