Re: PowerBasic rocks!



On Fri, 15 May 2009 05:23:53 -0700, panteltje wrote:

Even with your values I still have problems with this :
[snip program]
When I run that on the eeePC with 512MB RAM, I get these times:

eeepc-unknown:/root> ./test2
memory needed=384 MB
mem=0xa8a54008
b=0xa1041008
Time used is 13920337 us (13.9203 s). Ready

On my system*, an example output of ./time-snipped-prog is:
memory needed=384 MB
mem=0x2aaaaaae8010
b=0x2aaab9f0d010
Time used is 2891413 us (2.8914 s).
Ready

-------------------------------------------
*Some extracts from free and per-processor /proc/cpuinfo on my system:
             total       used       free
Mem:       3936312    3537024     399288
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 5200+
cpu MHz    : 1000.000
bogomips   : 2042.00
-----------------------------

The program shown below is shorter and faster than the snipped
program (shorter because of formatting, no error checks, and no
bother with incrementing pointers, which can get in the way of
compiler optimization). E.g., in repeated runs, 1.7760 s was the
least time from the snipped program, while under 1.6 seconds was
typical for the program below, whose output looks like the following:

Setup time = 0.418803 seconds = 0.041880 seconds/pass = 0.654 nanoseconds/item
i4[11M,49M]: 11000000 49000000
Runtime 1 = 1.591445 seconds = 0.159144 seconds/pass = 2.487 nanoseconds/item
i4[11M,49M]: 11010010 49010010
Runtime 2 = 1.555033 seconds = 0.155503 seconds/pass = 2.430 nanoseconds/item
i4[11M,49M]: 11020020 49020020

In the output, Runtime 1 is for a simple loop with an increment of 1,
while Runtime 2 is for the same loop unrolled with an increment of 2.
Note that unrolling by a factor of 4 or 8 rather than 2 gave similar
results, 5% to 25% faster than no unrolling. The "i4[...]" lines
show the elements of i4 at indices 11,000,000 and 49,000,000.
Here's the program:
-----------------------------
/* jiw 15 May 2009
Re: timing of adding an array of int16's to an array of int32's
Compile via: gcc time-addloops.c -O3 -Wall -o time-addloops
Copyright 2009 James Waldby. Offered without warranty
under GPL terms as at http://www.gnu.org/licenses/gpl.html
*/
#include <stdint.h>     /* int16_t, int32_t */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#define NP   10         /* number of passes over the arrays */
#define NPP  64000000   /* number of items per pass */
//================================================
/* Return wall-clock time in seconds, minus base */
double ttime(double base) {
    struct timeval tod;
    gettimeofday(&tod, NULL);
    return tod.tv_sec + tod.tv_usec/1e6 - base;
}
//================================================
/* Report elapsed time t and two sample elements of i4.
   Always divides by NP, so the per-pass and per-item figures
   for the single-pass setup come out NP times too small. */
void ttell(double t, int32_t *i4, char *item) {
    printf ("%10s = %8.6f seconds = %8.6f seconds/pass = %0.3f nanoseconds/item\n",
            item, t, t/NP, 1e9*(t/NP)/NPP);
    printf ("i4[11M,49M]: %5d %5d\n", i4[11000000], i4[49000000]);
}
//================================================
int main() {
    double t0=ttime(0);
    int i, j;
    int16_t *i2 = malloc(NPP*sizeof(int16_t));
    int32_t *i4 = malloc(NPP*sizeof(int32_t));

    /* Setup: one pass to fill both arrays */
    for (j=0; j<NPP; ++j) {
        i4[j] = j;
        i2[j] = 1001;
    }
    ttell (ttime(t0), i4, "Setup time");

    /* Runtime 1: simple loop, increment of 1 */
    t0=ttime(0);
    for (i=0; i<NP; ++i)
        for (j=0; j<NPP; ++j)
            i4[j] += i2[j];
    ttell (ttime(t0), i4, "Runtime 1");

    /* Runtime 2: loop unrolled by 2.  For a factor of 4 or 8, move
       the comment markers below and change j+=2 to match. */
    t0=ttime(0);
    for (i=0; i<NP; ++i)
        for (j=0; j<NPP; j+=2) {
            i4[j+0] += i2[j+0]; i4[j+1] += i2[j+1]; /*
            i4[j+2] += i2[j+2]; i4[j+3] += i2[j+3];
            i4[j+4] += i2[j+4]; i4[j+5] += i2[j+5];
            i4[j+6] += i2[j+6]; i4[j+7] += i2[j+7]; */
        }
    ttell (ttime(t0), i4, "Runtime 2");
    return 0;
}
-----------------------------
--
jiw