Re: Share Your Experience with 3DNow, SSE, SSE2 etc.
- From: Martin Brown <|||newspam|||@nezumi.demon.co.uk>
- Date: Thu, 07 Aug 2008 14:17:06 +0100
aruzinsky wrote:
On Aug 5, 5:29 am, Martin Brown <|||newspam...@xxxxxxxxxxxxxxxxxx>
wrote:
Did you establish experimentally that a loop unrolled by 4 was optimum?
I think so. Or maybe I copied it from a similar function that I
optimized.
Worth checking. Cache behaviour can be rather quirky at times.
It is tempting to suggest the usual peephole optimisations, and also to
try the shortest possible loop to see how well or badly it performs, e.g.
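$L2:
movaps xmm0, [esi]       ; aligned 16-byte load from the source
movntpd [ebx+esi], xmm0  ; non-temporal store; ebx = destination - source
add esi, 16              ; advance to the next 16 bytes
dec ecx                  ; or replace dec/jnz with: loop $L2
jnz $L2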
The registers used here are all different from yours: ebx holds what was previously (eax-edx), etc.
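For anyone who would rather run the same experiment from C, here is a minimal sketch of that shortest loop using the SSE intrinsics. The function name stream_copy is made up for illustration, and the assumptions are mine: both pointers are 16-byte aligned and the count is a multiple of 4 floats. It uses the single-precision streaming store (_mm_stream_ps) rather than movntpd, and it falls back to memcpy when SSE2 is not enabled at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: copy n_floats floats from src to dst.
 * Assumes both pointers are 16-byte aligned and n_floats % 4 == 0. */
void stream_copy(float *dst, const float *src, size_t n_floats)
{
    assert(((uintptr_t)dst % 16) == 0);
    assert(((uintptr_t)src % 16) == 0);
    assert(n_floats % 4 == 0);
#if HAVE_SSE2
    for (size_t i = 0; i < n_floats; i += 4) {
        __m128 v = _mm_load_ps(src + i);  /* aligned load, like movaps */
        _mm_stream_ps(dst + i, v);        /* non-temporal store, like movntps */
    }
    _mm_sfence();  /* order the streaming stores before later stores */
#else
    /* Portable fallback when SSE2 is unavailable at compile time. */
    memcpy(dst, src, n_floats * sizeof *dst);
#endif
}
```

Timing this against an unrolled variant on the target machine is the only way to know which wins; the compiler may also unroll the loop on its own, so check the generated assembly.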
And then try loop unrolling.
But rather than optimise blindfolded by trial and error, why not enable
the performance monitoring counters and do it properly by monitoring
cache misses and stalls?
But I still have to use trial and error to find the best code
arrangement for prefetches, because prefetches should be interleaved
with computations.
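As a rough illustration of interleaving prefetches with computation, here is a sketch in C. Everything named here is an assumption for the example: the function sum_squares_prefetch, the PREFETCH macro, and the PREFETCH_AHEAD distance of 64 floats are hypothetical, and a 64-byte cache line is assumed. __builtin_prefetch is the GCC/Clang builtin; on other compilers the macro expands to nothing.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
/* Read prefetch with no temporal locality (GCC/Clang builtin). */
#define PREFETCH(p) __builtin_prefetch((p), 0, 0)
#else
#define PREFETCH(p) ((void)0)
#endif

/* Hypothetical tuning knob: how many floats ahead to prefetch.
 * The best value has to be found by measurement on the target CPU. */
#ifndef PREFETCH_AHEAD
#define PREFETCH_AHEAD 64
#endif

/* Sum of squares, issuing one prefetch per block of 16 floats
 * (one 64-byte cache line, assumed) between the arithmetic. */
double sum_squares_prefetch(const float *x, size_t n)
{
    double acc = 0.0;
    size_t i = 0;

    assert(x != NULL || n == 0);
    for (; i + 16 <= n; i += 16) {
        /* Only prefetch addresses that lie inside the array. */
        if (i + PREFETCH_AHEAD < n)
            PREFETCH(&x[i + PREFETCH_AHEAD]);
        for (size_t j = 0; j < 16; ++j)
            acc += (double)x[i + j] * x[i + j];
    }
    for (; i < n; ++i)  /* leftover elements */
        acc += (double)x[i] * x[i];
    return acc;
}
```

Rebuilding with different -DPREFETCH_AHEAD=... values and timing each build is one way to do the trial and error a little more systematically.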
If you are serious about coding for multiple CPUs with near optimality then the parameterised generated codelets approach used by FFTW and others is probably the way to go. See Bugbear's post for details.
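A toy version of that idea, to show the shape of it only: generate several variants of a kernel from one parameterised template, time each on the machine at hand, and keep the fastest. This is not how FFTW itself is built. The names DEFINE_COPY, copy_variants, copy_factors and pick_fastest_copy are all hypothetical, the "generator" is just a C macro, and clock() is a crude timer.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef void (*copy_fn)(float *dst, const float *src, size_t n);

/* Template for a copy kernel that handles U elements per outer iteration.
 * The compiler remains free to unroll or vectorise further, so inspect
 * the generated code before trusting the timings. */
#define DEFINE_COPY(U)                                                  \
    static void copy_unroll_##U(float *dst, const float *src, size_t n) \
    {                                                                   \
        size_t i = 0;                                                   \
        for (; i + (U) <= n; i += (U))                                  \
            for (size_t k = 0; k < (U); ++k)                            \
                dst[i + k] = src[i + k];                                \
        for (; i < n; ++i) /* leftover elements */                      \
            dst[i] = src[i];                                            \
    }

DEFINE_COPY(1)
DEFINE_COPY(2)
DEFINE_COPY(4)
DEFINE_COPY(8)

const copy_fn copy_variants[] = {
    copy_unroll_1, copy_unroll_2, copy_unroll_4, copy_unroll_8
};
const int copy_factors[] = { 1, 2, 4, 8 };
const size_t copy_variant_count = sizeof copy_variants / sizeof copy_variants[0];

/* Run each variant reps times and return the index of the fastest one. */
size_t pick_fastest_copy(float *dst, const float *src, size_t n, int reps)
{
    size_t best = 0;
    clock_t best_time = 0;

    assert(reps > 0);
    for (size_t v = 0; v < copy_variant_count; ++v) {
        clock_t start = clock();
        for (int r = 0; r < reps; ++r)
            copy_variants[v](dst, src, n);
        clock_t elapsed = clock() - start;
        if (v == 0 || elapsed < best_time) {
            best_time = elapsed;
            best = v;
        }
    }
    return best;
}
```

Calling pick_fastest_copy once at start-up with a buffer size typical of the real workload, then using copy_variants[best] afterwards, gives the same binary a chance to adapt to different CPUs.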
The Ring0 driver ia32.sys is at the University of Texas site (playing up
at the moment) but I think Google cache still has a copy - including a
new class library around it.
Try at your own peril but it can be very useful for tuning critical code.
http://216.239.59.104/search?q=cache:H9fYu1GGy0EJ:iss.ices.utexas.edu...
Thank you. I take it that you no longer have easy access to a C++
compiler to do your own experiments?
More lack of time than anything else.
I would only optimise at this sort of low level as a last resort. YMMV
Regards,
Martin Brown