Re: Sin & Cos speed worries
- From: "Julian V. Noble" <jvn@xxxxxxxxxxxx>
- Date: Tue, 01 Aug 2006 09:58:20 -0400
rgootee@xxxxxxxxx wrote:
I have recently started work on a project that is VERY focused the
execution time of the algorithms. The lead programmer on said project
has said to minimize "sin,cos,and square-roots" to keep the execution
time low.
Unfortunetly some of the algorithms that my group has developed are
extremely sin and cosine intensive. To compensate, one of the algorithm
designers attempted to use "small angle approximations" or basically
just the first three entries of a Taylor series expansion.
When I tested the Taylor series code, against the "math.h sin"
function, it seemed that the "sin" function was faster! I basically
just did a loop over about 100000 random values (the same value for
each method, defined in an array) bounded between 0 - pi and calculated
the overall time.
Does anyone know how "math.h" 's "sin" function is calculated, and a
source to go along with the knowledge? From word of mouth I think it
uses a combination of a lookup table and Taylor expansion.
If its of any help, I am on a dual-G5 running OSX, and I can post the
code as well.
I would be very surprised if sin/cos and sqrt are defined as high
level C functions. Since Pentia and advanced PowerPC architectures
all include these functions in hardware, any implementations of C
for PC or Mac would almost certainly default to the (vastly faster)
hardware versions. It is very hard to beat them using e.g. a lookup
table or rational approximation in high-level C unless you do it
with inline assembler.
That said, if you go to the Intel or Motorola web sites you can
get specs for how many cycles sine or cosine take. IIRC, on late-
model Pentia it's around 500-1000. OTOH sqrt takes about the same
as fp division, much slower than fp multiplication. So it definitely
pays to avoid computing trig functions, if you can.
Jon Bentley, in "Programming Pearls" (or in "More Programming
Pearls") discusses ways to revise an algorithm to avoid computing
trig functions. They may apply to your problem.
I would also advise care in the use of fp divisions---if you are
dividing by a constant inside a loop, it is always worthwhile to
compute the inverse of the constant before the loop and then to
multiply by that within the loop, on a machine where division is
much slower than multiplication.
Finally, I would look at varous books on code optimization for
Pentium-class cpu's. There are some non-trivial tricks about
ordering the operations that are worth looking into. I also
discovered recently, when I wrote a machine-code version of a
program to do high-precision arithmetic using a 64+64 -bit
representation ("double-double"), that exactly where machine
code resides in memory can make a factor-10 difference in speed.
Good luck. If you have to resort to tricks and machine code to
achieve the necessary speed, the code will not be portable and
it will be harder to maintain. Don't forget to document it more
thoroughly than usual.
--
Julian V. Noble
Professor Emeritus of Physics
University of Virginia
.
- Follow-Ups:
- Re: Sin & Cos speed worries
- From: Ronald Bruck
- Re: Sin & Cos speed worries
- From: David N. Williams
- Re: Sin & Cos speed worries
- References:
- Sin & Cos speed worries
- From: rgootee
- Sin & Cos speed worries
- Prev by Date: Re: Problems with Franke's inverse distance weighting interpolation
- Next by Date: symmetric tridiagonal eigenproblem
- Previous by thread: Re: Sin & Cos speed worries
- Next by thread: Re: Sin & Cos speed worries
- Index(es):
Relevant Pages
|