Re: Matrix Multiplication



I would like to re-iterate my point. The text that I has written was
not about making benchmarks at all. It was designed for novices who
are interested in programming of computations with matrices. The goal
was actually to show that a naive three loop implementation is not
good in any programming language. To this end I have chosen the
dimension 1000x1000 deliberately to make sure that the matrix does not
fit the processor cache. On the other hand, in my view it is useful to
make a three loop implementation at the beginning anyway, say as a
reference point.

For those who are interested in theory I would recommend the ATLAS
paper where there is a discussion why a compiler cannot do things that
ATLAS does.

So the main point was that it is important to use a good library even
in this seemingly simple case.

http://matrixprogramming.com/MatrixMultiply/

On Jan 15, 9:31 am, Sebastian Hanigk <han...@xxxxxxxxx> wrote:
pixel <p...@xxxxxxxx> writes:
The original poster's naive three-loop implementation shootout is not
a good benchmark. For other strategies I could point you in the
direction of cache-obliviousness[1].

It's as good a benchmark as any

If you want to benchmark the matrix multiplication performance, it is
not a good benchmark because the implemented algorithm is suboptimal
from a theoretical and practical point of view.

- it dismantles your cache-obliviousness claim which, if true, should
produce a big performance gain, not a tiny one.

Careful. Cache-oblivious algorithms will not provide a performance boost
if your existing algorithm and its implementation are already quite
good. Look again at matrix multiplication: Intel's MKL routines provide
already over 90% of theoretical peak performance, one does not expect
large benefits here. The real benefit of a cache-oblivious algorithm is
its portability; instead of fiddling around with cache sizes, optimal
blocking and so on for every new processor and architecture, you
implement a cache-oblivious algorithm once and it's good on every
architecture - but obviously it will have a hard time against
hardware-optimised implementations.

Furthermore, language does influence the performance - change arrays
from static to dynamic, make matrix sizes a power of two, and
reevaluate what's "ridiculous" about the *same* algorithm grinding to
a halt.

I do not negate the influence of the choice of languages, but compared
with the right choice of algorithms it should be a small effect. There
exists a language benchmark (<http://shootout.alioth.debian.org/>), but
beware that they compare compounds of implementation, language and
run-time system against each other.

Sebastian

.



Relevant Pages

  • Re: Matrix Multiplication
    ... It's as good a benchmark as any ... If you want to benchmark the matrix multiplication performance, ... The real benefit of a cache-oblivious algorithm is ... exists a language benchmark, ...
    (sci.math.num-analysis)
  • Re: Organizing data for readability and efficiency
    ... same as choosing a good algorithm. ... benchmark to see where the bottleneckare. ... implementation in a lower level language for efficiency. ...
    (comp.lang.perl.misc)
  • Re: Status of the Kernighan Scum project
    ... analysis which I should have performed before entering the benchmark ... The Pike code is a bit over-engineered, ... If you can't translate to some other language x more or less by copying line for line, then that's a weakness rather than a strength. ... A general regular expression is powerful, but is too difficult for the average user to enter. ...
    (comp.programming)
  • Re: Intercepting the text interpreter.
    ... I'm not aware of the Perl community issuing such a joke. ... I will note that such a benchmark of zero-length source code is portable ... The most funny thing would be the messure until what number of NOPs ... If not, the language is ...
    (comp.lang.forth)
  • Re: Alternative COBOL "telco" source program
    ... Didn't we already discuss the concept of a language being Turing ... and when you've stopped changing the rules of the game in ... > Hey its not MY benchmark. ... game, I'll consider writing something in C++. ...
    (comp.lang.cobol)