Suppose that, in my CUDA grid block, I have a matrix which I want to multiply by a vector, and that my data type is either half, single, or double precision (i
In the CUDA C++ Programming Guide, v11.7, section B.24.6, "Element Types & Matrix Sizes", there's a table of supported type combinations, in which the multiplicat