Python CUDA: parallelize multiple SVDs of small matrices
I've seen a similar post on Stack Overflow that tackles the problem in C++: "Parallel implementation for multiple SVDs using CUDA". I want to do exactly the same in Python. I have multiple matrices (approximately 8000, each of size 15x3), and I want to decompose each of them using the SVD. This takes years on a CPU. Is it possible to do this in Python? My computer has an NVIDIA GPU installed. I have already looked at several libraries, such as numba, pycuda, scikit-cuda, and cupy, but didn't find a way to implement my plan with them. I would be very glad for some help.
Solution 1:[1]
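CuPy gives access to cuSOLVER, including a batched SVD. In recent CuPy versions, `cupy.linalg.svd` accepts a stacked input of shape (..., M, N), so all of the matrices can be passed in a single call: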
https://docs.cupy.dev/en/stable/reference/generated/cupy.linalg.svd.html
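A minimal sketch of how that could look for the sizes in the question, assuming a CuPy version whose `cupy.linalg.svd` accepts stacked (batch, M, N) input. The helper names `batched_svd` and `to_numpy`, the random test data, and the NumPy fallback (used only so the snippet still runs on a machine without a usable GPU) are illustrative assumptions, not part of the original answer.

```python
import numpy as np

try:
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device is present
except Exception:
    xp = np  # assumption: fall back to NumPy, which also supports stacked SVDs


def batched_svd(matrices):
    """Decompose a stack of matrices of shape (batch, m, n) in one call.

    Returns u, s, vh with shapes (batch, m, k), (batch, k), (batch, k, n),
    where k = min(m, n), because full_matrices=False is used.
    """
    a = xp.asarray(matrices)  # copies the data to the GPU when xp is CuPy
    return xp.linalg.svd(a, full_matrices=False)


def to_numpy(x):
    """Bring a result back to host memory (no-op for NumPy arrays)."""
    return x.get() if hasattr(x, "get") else np.asarray(x)


# Hypothetical data matching the question: 8000 matrices of size 15x3.
rng = np.random.default_rng(0)
batch = rng.standard_normal((8000, 15, 3))

u, s, vh = batched_svd(batch)
print(u.shape, s.shape, vh.shape)
```

Stacking everything into one array and calling the SVD once avoids a Python-level loop over 8000 matrices, which is usually where most of the time goes for matrices this small. The results stay on the GPU until `to_numpy` is called, so further GPU work can be chained without copying back and forth.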
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | talonmies |