Lanczos Singular Value Decomposition for sparse matrices


#1

Math.NET Numerics has singular value decomposition (SVD), but the current implementation doesn’t scale well on large sparse matrices. Also, the implemented algorithm doesn’t allow specifying a maximum number of dimensions: this feature would be useful when doing principal component analysis.
The solution is to get a library like LIBSVDC (https://tedlab.mit.edu/~dr/SVDLIBC/) and DllImport it. This seems to work reliably after some fiddling since the code is not exactly “friendly” to .NET in its current form. However, this unmanaged C code, (sadly, not thread-safe at all), is extremely fast and precise as long as you want the “top n” high order singular values.

I had some time on my hands and I translated the underlying Lanzcos SVD algorithm into pure C#, with extension methods to work on Math.NET Numerics Matrix<double> objects. It is almost as fast as LIBSVDC (discounting the format conversion time) and my limited testing validated the results with respect to that library. It is also thread- and multiprocessor safe.
It presents the factors as a derivative of the Svd<double> type, albeit without condition number, rank, determinant and L2 Norm, or the ability to solve systems: I didn’t need the functionality and my natural laziness turned it into an exercise for the reader :slight_smile:.

What I would like to do is contribute that code (one assembly containing 6 source files and a package config) to Math.NET Numerics, but being lazy I’d like to just send it to a project representative and be done with it.

That is, if anyone is interested.

Vincent Van Den Berghe
Bureau van Dijk Electronic Publishing


(Christoph Rüegg) #2

Thanks, I’d certainly be interested to have a look at it and maybe integrate into mainline. You should be able to add a zip file as attachment to a message right here in the forum.

Have you compared your implementation to our existing Intel MKL native provider?


#3

Hello Christoph,

I’d like to upload the thing, but I get the error message “new users cannot upload attachments”. I have no idea how long one has to wait before one becomes of age here.

Regarding anything comparable in the existing Intel MKL provider, I am assuming you refer to the methods SingularValueDecomposition in MklLinearAlgebraProvider.
Sadly, trying to factor a 75000x30000 sparse matrix to get the 10 largest singular values (vector S) fails miserably (it either reports “out of memory” because it expects a “dense” representation, or continues to run after several hours if you manage to have enough memory to hold it all).
It’s about 20 seconds with Lanczos, directly from the sparse matrix.

Vincent


(blvy) #4

Hi Vincent,

I am looking for a C# implemetation for Lanczos SVD which works with Math.NET SparseMatrix. Do you mind share your implementation to me? Thank you.

  • Bruce

#6

Lanczos.zip (721.7 KB)

perhaps it will work now: see attached


#7

Hello Christoph,

It looks like I’m finally old enough to post attachments, so you’ll find it as a reply to someone else in this thread.

Have fun with it!

Vincent