Turn a named vector into a symmetric matrix in R?
Similar questions have been asked, but none of them involve the additional step of splitting the vector names, so I am asking a new question.
I am trying to turn a named vector into a symmetric matrix in R. Each name in my vector is a combination of two of the matrix's row/column labels, so I need to split the names into their component parts.
For example, if my data looks like this:
v <- c(
"x1 x2" = 0.81899860,
"x1 x3" = 0.10764701,
"x2 x3" = 0.03923967,
"x1 x4" = 0.03457240,
"x2 x4" = 0.05954789,
"x3 x4" = 0.15535316,
"x1 x5" = 0.04041266,
"x2 x5" = 0.05421003,
"x3 x5" = 0.09198977,
"x4 x5" = 0.15301872
)
We can see that each name is a combination of 2 of the variables. I am trying to turn this into a symmetric matrix (with the diagonal as zeros). For clarity, my desired output would look like:
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
Any suggestions as to how I could do this?
EDIT
One of the answers pointed out that my question was too vague, so I'm making an edit to address this. I am looking for a general solution to this problem, no matter what the names in the vector are. For example, my named vector could look like this:
v <- c(
"apple banana" = 0.81899860,
"apple orange" = 0.10764701,
"banana orange" = 0.03923967,
"apple pear" = 0.03457240,
"banana pear" = 0.05954789,
"orange pear" = 0.15535316,
"apple plum" = 0.04041266,
"banana plum" = 0.05421003,
"orange plum" = 0.09198977,
"pear plum" = 0.15301872
)
Solution 1:[1]
We could split the names, expand the data to create the missing combinations (complete) and reshape to wide with pivot_wider
library(dplyr)
library(tidyr)
library(tibble)
# split each "a b" name into two columns, V1 and V2
d1 <- read.table(text = names(v), header = FALSE)
# all distinct labels, sorted
un1 <- sort(unique(unlist(d1)))
out <- d1 %>%
mutate(v = v) %>%
# add every missing (V1, V2) pair with value 0
complete(V1 = un1, V2 = un1,
fill = list(v = 0)) %>%
pivot_wider(names_from = V1, values_from = v) %>%
column_to_rownames('V2') %>%
as.matrix %>%
# mirror the filled triangle onto the other one
{. + t(.)}
-output
> out
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
Or using base R
d1 <- read.table(text = names(v))
un1 <- sort(unique(unlist(d1)))
m1 <- matrix(0, ncol = length(un1), nrow = length(un1), dimnames = list(un1, un1))
m2 <- xtabs(v ~ ., d1)
m1[row.names(m2), colnames(m2)] <- m2
m1 <- m1 + t(m1)
-output
m1
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
Using the second example
> m1
apple banana orange pear plum
apple 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
pear 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
plum 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
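Whichever construction is used, the result can be checked against the input vector. The following is only a sketch: the helper name check_against_vector is made up for illustration, and a small three-label vector stands in for the data in the question. It tests that the matrix is symmetric, has a zero diagonal, and returns the original value for every named pair in both orders.

```r
# Toy input and the matrix we expect from it (hypothetical example data)
v <- c("a b" = 1, "a c" = 2, "b c" = 3)
labs <- c("a", "b", "c")
m <- matrix(c(0, 1, 2,
              1, 0, 3,
              2, 3, 0),
            nrow = 3, byrow = TRUE, dimnames = list(labs, labs))

# Hypothetical helper: TRUE if m is a faithful symmetric expansion of v
check_against_vector <- function(m, v) {
  # one row per name, two columns: first label, second label
  pairs <- do.call(rbind, strsplit(names(v), " ", fixed = TRUE))
  isSymmetric(m) &&
    all(diag(m) == 0) &&
    all(m[pairs] == v) &&                        # look up (first, second)
    all(m[pairs[, 2:1, drop = FALSE]] == v)      # look up (second, first)
}

check_against_vector(m, v)
```

Exact comparison with == is acceptable here because the values are copied rather than computed; for matrices produced by arithmetic, all.equal would be the safer choice.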
Solution 2:[2]
1) We generate the vertices v using scan and then use a nested sapply to generate the required matrix. No packages are used.
edge2adj <- function(e) {
v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
sapply(v, function(i) sapply(v, function(j)
Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0) )))
}
# tests where v1 and v2 are the two examples in the question
edge2adj(v1)
## x1 x2 x3 x4 x5
## x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
edge2adj(v2)
## apple banana orange pear plum
## apple 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## pear 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## plum 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
2) (1) is likely preferable to this alternative due to its greater generality, but we point out that if we knew that the edges were in the order shown in the question (sorted and in upper triangular order) then we could use upper.tri like this. No packages are used.
edge2adj2 <- function(e) {
v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
m <- sapply(v, function(i) sapply(v, function(j) 0))
m[upper.tri(m)] <- e
m + t(m)
}
identical(edge2adj(v1), edge2adj2(v1))
## [1] TRUE
identical(edge2adj(v2), edge2adj2(v2))
## [1] TRUE
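Since (2) silently gives a wrong matrix when the edges are in any other order, it may be worth testing the ordering assumption first. A minimal sketch, in which the function name is_upper_tri_order is hypothetical and not part of the answer above: it rebuilds the names that upper.tri order would imply and compares them with the actual names.

```r
# Hypothetical helper: TRUE if names(e) are the sorted vertex pairs
# in the column-major order that upper.tri() fills
is_upper_tri_order <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  n <- length(v)
  # a full upper triangle has n * (n - 1) / 2 entries
  if (length(e) != n * (n - 1) / 2) return(FALSE)
  # (row, col) positions of the upper triangle, in fill order
  idx <- which(upper.tri(matrix(0, n, n)), arr.ind = TRUE)
  expected <- paste(v[idx[, "row"]], v[idx[, "col"]])
  identical(names(e), expected)
}

e <- c("a b" = 1, "a c" = 2, "b c" = 3)
is_upper_tri_order(e)       # names match upper-triangular order
is_upper_tri_order(rev(e))  # same edges, different order
```

If the check fails, edge2adj from (1) does not depend on the order of the edges and can be used instead.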
Note
v1 <- c(
"x1 x2" = 0.81899860,
"x1 x3" = 0.10764701,
"x2 x3" = 0.03923967,
"x1 x4" = 0.03457240,
"x2 x4" = 0.05954789,
"x3 x4" = 0.15535316,
"x1 x5" = 0.04041266,
"x2 x5" = 0.05421003,
"x3 x5" = 0.09198977,
"x4 x5" = 0.15301872
)
v2 <- c(
"apple banana" = 0.81899860,
"apple orange" = 0.10764701,
"banana orange" = 0.03923967,
"apple pear" = 0.03457240,
"banana pear" = 0.05954789,
"orange pear" = 0.15535316,
"apple plum" = 0.04041266,
"banana plum" = 0.05421003,
"orange plum" = 0.09198977,
"pear plum" = 0.15301872
)
Solution 3:[3]
Another base R way:
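comb = names(v)
# split each name into its two labels; inds is a 2 x length(v) character matrix
inds = sapply(comb, function(x){
c(unlist(strsplit(x = x,split = " ",fixed = TRUE)))},
simplify = TRUE)
# the same pairs with the two labels swapped, for the other triangle
inds1 = rbind(inds[2,],inds[1,])
# derive the labels from the names instead of hard-coding x1..x5
labs = sort(unique(c(inds)))
m = matrix(data = 0, nrow = length(labs), ncol = length(labs), dimnames = list(labs, labs))
# fill both triangles by indexing with (row label, column label) pairs
m[t(inds)]=v
m[t(inds1)]=v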
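The last two assignments rely on a base R feature: a matrix that has dimnames can be indexed by a two-column character matrix, where each row gives one (row name, column name) pair. A small illustration with made-up labels and values:

```r
labs <- c("a", "b", "c")
m <- matrix(0, nrow = 3, ncol = 3, dimnames = list(labs, labs))

# each row of idx addresses one cell: (row name, column name)
idx <- rbind(c("a", "b"),
             c("b", "c"))

m[idx] <- c(10, 20)          # sets m["a", "b"] and m["b", "c"]
m[idx[, 2:1]] <- c(10, 20)   # swapped columns: sets m["b", "a"] and m["c", "b"]
m
```

Because the cells are addressed by name, the order of the entries in the vector does not matter, and cells that are never addressed (including the diagonal) keep their initial value of 0.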
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | tushaR |