Multilinear regression - techniques and performance



Hi guys!

I have been working on some multilinear regression code
lately, and was wondering what techniques might improve the
performance.

I have about 5,000,000 scalar observations with 2000 independent
variables: (Xij, Yi) for i=1..n, j=1..m, where n=5000000 and m=2000.
The matrix Xij is rather sparse.
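
To make the setup concrete, here is a minimal sketch of the problem
shape in Python, assuming NumPy and SciPy are available. It is not my
actual code: the data is synthetic, the sizes are scaled down, and the
name fit_sparse_ols is made up for illustration. It solves the normal
equations (X^T X) b = X^T y, which for m=2000 means a dense 2000 x 2000
matrix (about 32 MB in double precision), and cross-checks the result
against SciPy's iterative lsqr solver.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr


def fit_sparse_ols(X, y):
    """Least-squares fit of y ~ X b via the normal equations.

    X is a sparse n x m matrix, y a dense vector of length n.
    (Hypothetical helper, for illustration only.)
    """
    XtX = (X.T @ X).toarray()  # m x m, dense; manageable when m is ~2000
    Xty = X.T @ y              # dense vector of length m
    # lstsq still returns a solution if X^T X is rank-deficient,
    # e.g. when some column of X is entirely zero.
    b, *_ = np.linalg.lstsq(XtX, Xty, rcond=None)
    return b


# Synthetic, scaled-down stand-in for the real data (assumption).
rng = np.random.default_rng(0)
n, m = 20000, 50
X = sp.random(n, m, density=0.05, format="csr", random_state=0)
b_true = rng.standard_normal(m)
y = X @ b_true + 0.01 * rng.standard_normal(n)

b_normal = fit_sparse_ols(X, y)

# Iterative alternative that never forms X^T X explicitly.
b_lsqr = lsqr(X, y, atol=1e-12, btol=1e-12)[0]

print(np.max(np.abs(b_normal - b_true)))
print(np.max(np.abs(b_normal - b_lsqr)))
```

Forming X^T X squares the condition number of the problem, so the
lsqr route may be the safer choice if the columns are nearly collinear.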

At the moment, it takes several hours to do the regression
on a mid-spec PC, using rather primitive ad-hoc methods.
I suspect it could be done much faster, perhaps with
Monte-Carlo methods and/or resampling.

Could this kind of problem typically be solved in a few seconds
on a regular PC? Are there any free/GPL libraries or packages
out there which can do this? Which methods are likely to be
the fastest for this kind of problem?

Thanks in advance for any suggestions/links/hints!
--
renderer

(not homework!)
