Re: Fast Image access for binarization



siddharth wrote:

hi,
is there any way to access image data efficiently to speed up the global
binarization code? our code is taking about 40 msec right now but we
need to reduce it to 5-10 msec.

our loop is like this:

for (i = 0; i < (width * height); i++)
{
    if (image[i] > threshold)
        image[i] = 1;
    else
        image[i] = 0;
}

image size is 512x512 to 4096x4096

there are different checks involved but we are trying to remove them from
the loop completely. this one check however cannot be moved out of the
loop.

On some architectures

image[i] = (image[i]>threshold);

may be faster since there is no branch in the loop. You are then a hostage to the values the system uses for true and false, but you avoid making an unpredictable branch if the threshold is near the median.
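
As a rough sketch of that branch-free form as a complete function: the name binarize_branchless and the choice of 8-bit unsigned pixels are assumptions for illustration, not something the original poster stated. In C specifically, a relational expression is defined to evaluate to 1 or 0, so the true/false caveat mostly applies to other languages.

```c
#include <stddef.h>

/* Branch-free global threshold (hypothetical helper, assumes 8-bit
   unsigned pixels): each pixel becomes 1 if above threshold, else 0. */
void binarize_branchless(unsigned char *image, size_t count,
                         unsigned char threshold)
{
    size_t i;
    for (i = 0; i < count; i++) {
        /* In C the comparison itself evaluates to the int 1 or 0. */
        image[i] = (unsigned char)(image[i] > threshold);
    }
}
```

Whether this beats the if/else version depends on the compiler, which may already turn the original loop into the same code.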

is there any platform and hardware independent way of speeding up this
code?

Not significantly, provided that the optimising compiler is half decent (not always a safe assumption). The simplest way to check is to disassemble the generated code and examine it.

It is possible that explicit use of pointers might be slightly quicker (but these days unlikely if the optimiser is any good).
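
For comparison, a pointer-walking version might look like the sketch below. The function name and the 8-bit unsigned pixel type are assumptions made for the example; a good optimiser will often emit the same machine code for this and for the indexed loop.

```c
#include <stddef.h>

/* Same threshold operation, walking the buffer with a pointer instead
   of an index (hypothetical helper, assumes 8-bit unsigned pixels). */
void binarize_ptr(unsigned char *image, size_t count,
                  unsigned char threshold)
{
    unsigned char *p = image;
    unsigned char *end = image + count;   /* one past the last pixel */

    while (p < end) {
        *p = (unsigned char)(*p > threshold);
        p++;
    }
}
```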

All other go-faster stripes come with machine dependencies of one sort or another.

any unique solutions to this problem? please help us out.

Loop unrolling might help (but on modern machines with clever branch prediction and limited instruction cache could also hinder).
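
A minimal sketch of manual unrolling by four, with a clean-up loop for the leftover pixels. The unroll factor, the function name and the 8-bit unsigned pixel type are all assumptions; only measurement on the target machine will show whether it helps or hinders.

```c
#include <stddef.h>

/* Threshold loop unrolled by 4 (hypothetical helper, assumes 8-bit
   unsigned pixels). */
void binarize_unrolled4(unsigned char *image, size_t count,
                        unsigned char threshold)
{
    size_t i = 0;
    size_t limit = count - (count % 4);   /* largest multiple of 4 <= count */

    /* Main body: four pixels per iteration, one loop test per four. */
    for (; i < limit; i += 4) {
        image[i]     = (unsigned char)(image[i]     > threshold);
        image[i + 1] = (unsigned char)(image[i + 1] > threshold);
        image[i + 2] = (unsigned char)(image[i + 2] > threshold);
        image[i + 3] = (unsigned char)(image[i + 3] > threshold);
    }

    /* Clean-up: the 0 to 3 pixels that did not fill a group of four. */
    for (; i < count; i++) {
        image[i] = (unsigned char)(image[i] > threshold);
    }
}
```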

SIMD on x86 using MMX instructions, which MSC exposes as vectorised intrinsics like _mm_cmpgt_pi8, should be a fair bit faster provided your array is correctly aligned. It takes 8 bytes in parallel and compares them in a single instruction using dedicated hardware support.
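
To make the SIMD idea concrete, here is a sketch that swaps in SSE2 (16 bytes per step) for the MMX intrinsic named above, falling back to the plain loop when SSE2 is not available. The packed greater-than compare treats bytes as signed, so for pixels assumed to be 8-bit unsigned this sketch uses a saturating subtract followed by a minimum with 1 instead. The function name is hypothetical, and unaligned loads are used so that no alignment is assumed.

```c
#include <stddef.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2_SKETCH 1
#endif

/* Threshold 16 pixels per step with SSE2 where available (hypothetical
   helper, assumes 8-bit unsigned pixels). */
void binarize_simd(unsigned char *image, size_t count,
                   unsigned char threshold)
{
    size_t i = 0;

#ifdef HAVE_SSE2_SKETCH
    const __m128i t   = _mm_set1_epi8((char)threshold);
    const __m128i one = _mm_set1_epi8(1);

    for (; i + 16 <= count; i += 16) {
        __m128i px = _mm_loadu_si128((const __m128i *)(image + i));
        /* Unsigned saturating subtract: non-zero only where px > t. */
        __m128i diff = _mm_subs_epu8(px, t);
        /* Clamp every non-zero byte down to 1. */
        __m128i out = _mm_min_epu8(diff, one);
        _mm_storeu_si128((__m128i *)(image + i), out);
    }
#endif

    /* Scalar tail, or the whole image when SSE2 is not compiled in. */
    for (; i < count; i++) {
        image[i] = (unsigned char)(image[i] > threshold);
    }
}
```

If the buffer is known to be 16-byte aligned, the aligned load and store intrinsics (_mm_load_si128 and _mm_store_si128) could be used instead.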

Regards,
Martin Brown