Finding median eigenvalue with sparse matrix in R

I am working with the SVD of a matrix $$Y_{m,n} = T_{m,m} \Sigma_{m,n} D^T_{n,n} $$
where $T$ and $D$ describe the row and the column entities of $Y$, respectively.

The truncated SVD keeps only the first $r$ singular values, which also reduces the dimensionality of the problem: $$ \hat{Y}_{m,n} = T_{m,r} S_{r,r} D^T_{r,n} $$

Instead of reading $r$ off the scree plot (for example, keeping 90% of the variance), https://arxiv.org/pdf/1305.5870.pdf sets out an (approximate) rule to pick $r$ optimally (eq. 5). However, the approximation depends on $\sigma_{med}$, the "median empirical singular value" of the matrix $\Sigma$.
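For reference, as I read eq. (5) of that paper, the rule is $\hat{\tau} = \omega(\beta)\,\sigma_{med}$ with $\beta = m/n$ (for $m \le n$) and $\omega(\beta) \approx 0.56\beta^3 - 0.95\beta^2 + 1.82\beta + 1.43$. A minimal R sketch of it follows; the function names `omega_approx` and `hard_threshold` are my own and do not come from any package.

```r
# Polynomial approximation to omega(beta), as I read eq. (5) of the cited paper.
# beta is the aspect ratio m / n with m <= n, so 0 < beta <= 1.
omega_approx <- function(beta) {
  stopifnot(beta > 0, beta <= 1)
  0.56 * beta^3 - 0.95 * beta^2 + 1.82 * beta + 1.43
}

# Hypothetical helper: truncation threshold tau = omega(beta) * sigma_med.
# sigma_med is assumed to be the median of ALL min(m, n) singular values.
hard_threshold <- function(sigma_med, m, n) {
  beta <- min(m, n) / max(m, n)
  omega_approx(beta) * sigma_med
}

# Aspect ratio of the matrix in this question
beta <- 150000 / 400000
omega_approx(beta)
```

For the dimensions here, $\beta = 0.375$ and the coefficient comes out at roughly 2.01, so once $\sigma_{med}$ is known the threshold is about twice that value.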

The problem is that $Y$ is a sparse $150000 \times 400000$ matrix, and I don't know its rank or how many nonzero singular values it has. I'd like to run the SVD on the matrix, take the diagonal matrix, and find the optimal $\tau$ (the truncation threshold below which $\sigma_{i}$ is discarded), but I cannot compute the full SVD because the problem is too large. I can think of a few options:

  1. run svds with a very large $r$ and then compute $s_{med}$ as an approximation
  2. following the second answer to "Full SVD of a large sparse matrix (where only the eigenvalues are required)", use eigs on the square 150k matrix $YY^T$
  3. switch to MATLAB or another language?

What I tried: > lambdas = eigs(Y %*% Matrix::t(Y), symmetric=TRUE, only.values=TRUE)

This gives the error: Error in eigs_real_sym(A, nrow(A), k, which, sigma, opts, mattype = "sym_dgCMatrix", : argument "k" is missing, with no default
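The error message itself says that `k`, the number of eigenvalues to compute, has to be passed explicitly. A hedged sketch of options 1 and 2 with `k` supplied, assuming the RSpectra package is the one providing `eigs` and using a small random sparse matrix as a stand-in for $Y$ (the sizes and `k = 20` are arbitrary illustration values):

```r
library(Matrix)
library(RSpectra)

set.seed(1)
# Small random sparse stand-in for Y (the real matrix is 150000 x 400000)
Y <- rsparsematrix(200, 500, density = 0.05)

k <- 20  # number of leading values requested; there is no default

# Option 1: partial SVD; nu = 0 and nv = 0 skip the singular vectors
s_top <- svds(Y, k = k, nu = 0, nv = 0)$d

# Option 2: leading eigenvalues of the Gram matrix Y Y^T, values only.
# Their square roots are the same leading singular values of Y.
G <- Y %*% t(Y)
lambda_top <- eigs_sym(G, k = k, which = "LM",
                       opts = list(retvec = FALSE))$values

# Largest discrepancy between the two routes
max(abs(sort(s_top, decreasing = TRUE) -
        sqrt(sort(lambda_top, decreasing = TRUE))))
```

Both calls return only the `k` largest values, so the median of what they return sits at or above the median over all $\min(m,n)$ singular values unless `k` covers the whole spectrum.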

Is there a shortcut that exploits linear algebra properties of the matrix and doesn't require computing the whole decomposition?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
