max.col with the value not the index
If I have a matrix:
mod_xgb_softprob$pred[1:3,1:3]
             [,1]         [,2]         [,3]
[1,] 6.781361e-04 6.781361e-04 6.781422e-04
[2,] 2.022457e-07 2.022457e-07 4.051039e-07
[3,] 6.714367e-04 6.714367e-04 6.714399e-04
Generated by:
> dput(mod_xgb_softprob$pred[1:3,1:3])
structure(c(0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
0.000678142241667956, 4.05103861567113e-07, 0.000671439862344414
), .Dim = c(3L, 3L))
I can transform it into a data frame and get the index of the column with the highest value (mymatrix is the matrix above):
library(dplyr)
x <- mymatrix %>% as.data.frame %>% mutate(max_prob = max.col(., ties.method = "last"))
Looks like this:
> x
            V1           V2           V3 max_prob
1 6.781361e-04 6.781361e-04 6.781422e-04        3
2 2.022457e-07 2.022457e-07 4.051039e-07        3
3 6.714367e-04 6.714367e-04 6.714399e-04        3
If I wanted max_prob to be the actual value not the column index, how would I do that?
Solution 1:[1]
If you don't mind base R you can use apply. For example:
> x <- matrix(rnorm(9), ncol = 3)
> apply(x, 1, max)
[1] 0.246652 1.063506 2.148525
This gives the maximum of each row of x; the second argument, 1, tells apply to work over rows.
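Applied to the matrix from the question, the same idea might look like the sketch below. The names m, row_max and res are only illustrative; m is rebuilt from the dput output shown in the question.

```r
# Matrix from the question, reconstructed from its dput() output.
m <- structure(c(0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
                 0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
                 0.000678142241667956, 4.05103861567113e-07, 0.000671439862344414),
               .Dim = c(3L, 3L))

# MARGIN = 1 applies max() to each row; MARGIN = 2 would work column-wise.
row_max <- apply(m, 1, max)

# Attach the row maxima as an extra column (illustrative column name).
res <- cbind(m, max_prob = row_max)
print(res)
```

For wide matrices apply can be slow because it calls max once per row, which is one reason the other answers offer vectorised alternatives.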
Solution 2:[2]
Besides the apply method from @Mariane and the matrix indexing from @lmo's comment, you can also use matrixStats::rowMaxs:
matrixStats::rowMaxs(mymatrix)
# [1] 6.781422e-04 4.051039e-07 6.714399e-04
If you have a data frame, you can use do.call(pmax, ...) to calculate the parallel maxima of the input columns:
mymatrix %>% as.data.frame %>% mutate(max_val = do.call(pmax, .))
#            V1           V2           V3      max_val
#1 6.781361e-04 6.781361e-04 6.781422e-04 6.781422e-04
#2 2.022457e-07 2.022457e-07 4.051039e-07 4.051039e-07
#3 6.714367e-04 6.714367e-04 6.714399e-04 6.714399e-04
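If you would rather avoid the dplyr dependency, the do.call(pmax, ...) idea can be sketched in base R alone. This is a minimal sketch, assuming the data frame holds only numeric columns; the names m, df and max_val are illustrative.

```r
# Matrix from the question, reconstructed from its dput() output.
m <- structure(c(0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
                 0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
                 0.000678142241667956, 4.05103861567113e-07, 0.000671439862344414),
               .Dim = c(3L, 3L))

# as.data.frame() names the columns V1, V2, V3 because m has no column names.
df <- as.data.frame(m)

# A data frame is a list of columns, so do.call() passes each column to pmax()
# as a separate argument; pmax() then takes the element-wise (row-wise) maximum.
df$max_val <- do.call(pmax, df)

# If the columns may contain NA, na.rm can be passed along in the argument list:
# df$max_val <- do.call(pmax, c(df[c("V1", "V2", "V3")], na.rm = TRUE))
print(df)
```

Because pmax is vectorised over whole columns, this tends to scale better with the number of rows than a row-by-row apply.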
Solution 3:[3]
Another option uses max.col, seq_along and a bit of arithmetic. If m is your matrix, then the following works as well:
mc <- max.col(m, ties.method = 'last')
m[(mc - 1) * nrow(m) + seq_along(mc)]
The result:
[1] 6.781422e-04 4.051039e-07 6.714399e-04
With cbind you can then bind this result to the matrix again:
> cbind(m, m[(mc - 1) * nrow(m) + seq_along(mc)])
             [,1]         [,2]         [,3]         [,4]
[1,] 6.781361e-04 6.781361e-04 6.781422e-04 6.781422e-04
[2,] 2.022457e-07 2.022457e-07 4.051039e-07 4.051039e-07
[3,] 6.714367e-04 6.714367e-04 6.714399e-04 6.714399e-04
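The arithmetic works because R stores a matrix column by column, so the element in row i and column j sits at linear position (j - 1) * nrow(m) + i. A small sketch on a hypothetical toy matrix (toy, i, j and lin are illustrative names, not part of the answer) makes this easier to check by eye:

```r
# Hypothetical 3 x 4 toy matrix whose entries equal their own linear index.
toy <- matrix(1:12, nrow = 3)

# Pick one column per row: row 1 -> column 4, row 2 -> column 1, row 3 -> column 2.
i <- 1:3
j <- c(4, 1, 2)

# Column-major position of element (i, j).
lin <- (j - 1) * nrow(toy) + i

toy[lin]
# [1] 10  2  6

# Same elements, picked with explicit row/column subscripts.
c(toy[1, 4], toy[2, 1], toy[3, 2])
# [1] 10  2  6
```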
Solution 4:[4]
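This is a variation on @h3rm4n's answer, but you can use a special kind of matrix subsetting as well. Here x is the matrix, not the data frame. Set ties.method explicitly, because the default "random" treats nearly equal values as tied and may pick a column that does not hold the exact row maximum:
> x[cbind(1:nrow(x), max.col(x, ties.method = "last"))]
[1] 6.781422e-04 4.051039e-07 6.714399e-04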
Using an index like cbind(i, j) extracts the element in row i and column j for each row of the index matrix.
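To make the index matrix visible, the approach can be wrapped in a small helper and checked against apply. This is only a sketch: row_max_by_index is a hypothetical name, and it uses ties.method = "last" as in the question so that the result is deterministic.

```r
# Matrix from the question, reconstructed from its dput() output.
m <- structure(c(0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
                 0.00067813612986356, 2.02245701075299e-07, 0.000671436660923064,
                 0.000678142241667956, 4.05103861567113e-07, 0.000671439862344414),
               .Dim = c(3L, 3L))

# Hypothetical helper: value of the largest entry in each row via matrix indexing.
row_max_by_index <- function(mat, ties.method = "last") {
  # Two-column index matrix: column 1 = row number, column 2 = column of the max.
  idx <- cbind(seq_len(nrow(mat)), max.col(mat, ties.method = ties.method))
  mat[idx]
}

vals <- row_max_by_index(m)

# Cross-check against the straightforward row-wise maximum.
agrees <- all(vals == apply(m, 1, max))
print(vals)
print(agrees)
```

seq_len(nrow(mat)) is used instead of 1:nrow(mat) so that a matrix with zero rows yields an empty index rather than the sequence 1, 0.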
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | |
Solution 4 | Noah |