max.col with the value, not the index

If I have a matrix:

mod_xgb_softprob$pred[1:3,1:3]
             [,1]         [,2]         [,3]
[1,] 6.781361e-04 6.781361e-04 6.781422e-04
[2,] 2.022457e-07 2.022457e-07 4.051039e-07
[3,] 6.714367e-04 6.714367e-04 6.714399e-04

Generated by:

> dput(mod_xgb_softprob$pred[1:3,1:3])
structure(c(0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064, 
0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064, 
0.000678142241667956, 4.05103861567113e-07, 0.000671439862344414
), .Dim = c(3L, 3L))

With the matrix stored as mymatrix, I can convert it to a data frame and get the index of the column with the highest value:

library(dplyr)
x <- mymatrix %>% as.data.frame %>% mutate(max_prob = max.col(., ties.method = "last"))

Looks like this:

> x
            V1           V2           V3 max_prob
1 6.781361e-04 6.781361e-04 6.781422e-04        3
2 2.022457e-07 2.022457e-07 4.051039e-07        3
3 6.714367e-04 6.714367e-04 6.714399e-04        3

If I wanted max_prob to be the actual value rather than the column index, how would I do that?



Solution 1:[1]

If you don't mind base R you can use apply. For example:

> x <- matrix(rnorm(9), ncol = 3)
> apply(x, 1, max)
[1] 0.246652 1.063506 2.148525

gives the maximum of each row of x (MARGIN = 1 applies the function over rows; MARGIN = 2 would give the column maxima).
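Applied to the matrix from the question, the same idea might look like the sketch below. The names mymatrix and max_prob come from the question; row_max is just an illustrative variable name.

```r
# Rebuild the 3x3 matrix from the dput() output shown in the question.
mymatrix <- structure(c(0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
                        0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
                        0.000678142241667956, 4.05103861567113e-07, 0.000671439862344414),
                      .Dim = c(3L, 3L))

# MARGIN = 1 calls max() once per row.
row_max <- apply(mymatrix, 1, max)

# Attach the row maxima as a new column of a data frame (base R only).
x <- as.data.frame(mymatrix)
x$max_prob <- row_max
```

For this matrix the largest value of every row sits in the third column, so max_prob should match V3.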

Solution 2:[2]

Besides the apply method from @Mariane and the matrix indexing from @lmo's comment, you can also use matrixStats::rowMaxs:

matrixStats::rowMaxs(mymatrix)
# [1] 6.781422e-04 4.051039e-07 6.714399e-04

If you have a data frame, you can use do.call(pmax, ...) to calculate the parallel maxima of the input columns:

mymatrix %>% as.data.frame %>% mutate(max_val = do.call(pmax, .))

#            V1           V2           V3      max_val
#1 6.781361e-04 6.781361e-04 6.781422e-04 6.781422e-04
#2 2.022457e-07 2.022457e-07 4.051039e-07 4.051039e-07
#3 6.714367e-04 6.714367e-04 6.714399e-04 6.714399e-04
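To see what do.call(pmax, ...) is doing, here is a small base R sketch on a made-up data frame (the values in df are hypothetical, chosen only to be easy to read). A data frame is a list of columns, so do.call passes each column to pmax as a separate argument.

```r
# Hypothetical toy data frame; each column is one argument to pmax().
df <- data.frame(V1 = c(1, 9, 4),
                 V2 = c(7, 2, 8),
                 V3 = c(3, 5, 6))

# do.call() spreads the columns out as arguments ...
via_do_call <- do.call(pmax, df)

# ... which is the same as writing the call by hand.
by_hand <- pmax(df$V1, df$V2, df$V3)

# Store the row-wise maximum as a new column without dplyr.
df$max_val <- via_do_call
```

Because pmax is vectorised over whole columns, this avoids the per-row function calls that apply makes.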

Solution 3:[3]

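Another option uses max.col, seq_along and some index arithmetic. R stores a matrix column by column, so the element in row i and column j sits at linear position (j - 1) * nrow(m) + i. If m is your matrix, the following works as well: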

mc <- max.col(m, ties.method = 'last')
m[(mc - 1) * nrow(m) + seq_along(mc)]

The result:

[1] 6.781422e-04 4.051039e-07 6.714399e-04

With cbind you can then bind this result to the matrix again:

> cbind(m, m[(mc - 1) * nrow(m) + seq_along(mc)])
             [,1]         [,2]         [,3]         [,4]
[1,] 6.781361e-04 6.781361e-04 6.781422e-04 6.781422e-04
[2,] 2.022457e-07 2.022457e-07 4.051039e-07 4.051039e-07
[3,] 6.714367e-04 6.714367e-04 6.714399e-04 6.714399e-04
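The index arithmetic relies on R storing matrices in column-major order. A quick way to convince yourself is the check below, a sketch on a made-up matrix (the values are hypothetical) that compares the linear index with ordinary row/column indexing.

```r
# Hypothetical 3x3 matrix, filled column by column.
m <- matrix(c(1, 9, 4,
              7, 2, 8,
              3, 5, 6), nrow = 3)

# Column index of the maximum in each row.
mc <- max.col(m, ties.method = "last")

# Linear position of element (i, mc[i]) in column-major storage.
lin <- (mc - 1) * nrow(m) + seq_along(mc)

# Same elements, picked out one at a time with m[i, j].
by_row_col <- vapply(seq_along(mc), function(i) m[i, mc[i]], numeric(1))

stopifnot(identical(m[lin], by_row_col))
```

Both routes select the same elements, which here are the row maxima 7, 9 and 8.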

Solution 4:[4]

This is a variation on @h3rm4n's answer, but you can use a special kind of matrix subsetting as well:

> x[cbind(1:nrow(x), max.col(x))]
[1] 6.781361e-04 4.051039e-07 6.714367e-04

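Indexing with a two-column matrix like cbind(i, j) extracts, for each row of that index matrix, the element at row i and column j.

Note that max.col defaults to ties.method = "random", which treats entries within a relative tolerance of 1e-5 of the row maximum as ties. That is why rows 1 and 3 above return 6.781361e-04 and 6.714367e-04 rather than the exact maxima. Pass ties.method = "first" or "last" to get the exact row maximum.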
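Since the default ties.method = "random" of max.col uses a relative tolerance when deciding what counts as a tie, it may be safer to set the tie method explicitly when you want the exact maximum. The sketch below uses a made-up matrix with a near-tie in its first row (the values are hypothetical) and assumes the "first" method compares values exactly, as described in ?max.col.

```r
# Hypothetical 2x2 matrix: row 1 holds two values that differ by only 1e-7.
x <- matrix(c(0.5, 0.2,
              0.5000001, 0.3), nrow = 2)

# With the default "random" method, row 1 counts as a tie, so either
# column may be returned for it from one call to the next.
maybe_inexact <- max.col(x)

# An explicit tie method picks the true maximum of each row.
idx <- max.col(x, ties.method = "first")
exact <- x[cbind(seq_len(nrow(x)), idx)]

stopifnot(identical(exact, apply(x, 1, max)))
```

seq_len(nrow(x)) is used instead of 1:nrow(x) so the index is still well-formed for a matrix with zero rows.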

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3
Solution 4 Noah