Turn a named vector into a symmetric matrix in R?

Similar questions have been asked, but none of them involve the additional step of splitting the vector names, so I am asking a new question.

I am trying to turn a named vector into a symmetric matrix in R. Each name in my vector is a combination of two of the matrix's row/column labels, so I need to split the names into their component parts.

For example, if my data looks like this:

v <- c(
  "x1 x2" = 0.81899860,
  "x1 x3" = 0.10764701,
  "x2 x3" = 0.03923967,
  "x1 x4" = 0.03457240,
  "x2 x4" = 0.05954789,
  "x3 x4" = 0.15535316,
  "x1 x5" = 0.04041266,
  "x2 x5" = 0.05421003,
  "x3 x5" = 0.09198977,
  "x4 x5" = 0.15301872
)

We can see that each name is a combination of 2 of the variables. I am trying to turn this into a symmetric matrix (with the diagonal as zeros). For clarity, my desired output would look like:

           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Any suggestions as to how I could do this?

EDIT

One of the answers pointed out that my question was too vague, so I'm editing it to clarify. I am looking for a general solution to this problem, no matter what the names in the vector are. For example, my named vector could look like this:

v <- c(
  "apple banana" = 0.81899860,
  "apple orange" = 0.10764701,
  "banana orange" = 0.03923967,
  "apple pear" = 0.03457240,
  "banana pear" = 0.05954789,
  "orange pear" = 0.15535316,
  "apple plum" = 0.04041266,
  "banana plum" = 0.05421003,
  "orange plum" = 0.09198977,
  "pear plum" = 0.15301872
)


Solution 1:[1]

We could split the names, expand the data to create the missing combinations (complete), and reshape to wide with pivot_wider.

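library(dplyr)
library(tidyr)
library(tibble)
# split each name into its two labels (columns V1 and V2)
d1 <- read.table(text = names(v), header = FALSE)
# all labels that occur in either position
un1 <- sort(unique(unlist(d1)))
out <- d1 %>% 
   mutate(v = v) %>% 
   # add the missing label pairs, with value 0
   complete(V1 = un1, V2 = un1, 
     fill = list(v = 0)) %>% 
   pivot_wider(names_from = V1, values_from = v) %>% 
   column_to_rownames('V2') %>% 
   as.matrix %>% 
   # only one triangle is filled so far; adding the transpose mirrors it
   {. + t(.)}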

-output

> out
           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Or using base R

d1 <- read.table(text = names(v))
un1 <- sort(unique(unlist(d1)))
m1 <- matrix(0, ncol = length(un1), nrow = length(un1), dimnames = list(un1, un1))
m2 <- xtabs(v ~ ., d1)
m1[row.names(m2), colnames(m2)] <- m2
m1 <- m1 + t(m1)

-output

> m1
           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Using the second example

> m1
            apple     banana     orange       pear       plum
apple  0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
pear   0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
plum   0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
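
A quick way to confirm that a result like m1 matches what the question asks for is to check three things: the matrix is symmetric, its diagonal is zero, and every value in the input can be read back under both orderings of its name. The sketch below reruns the base R steps from this answer on a small made-up vector (the labels a, b, c and their values are assumptions for illustration, not data from the question). Note that read.table splits on whitespace, so this assumes that no label itself contains a space.

```r
# Small illustrative input; labels and values are hypothetical.
v <- c("a b" = 1, "a c" = 2, "b c" = 3)

# Same steps as the base R approach above.
d1 <- read.table(text = names(v))
un1 <- sort(unique(unlist(d1)))
m1 <- matrix(0, ncol = length(un1), nrow = length(un1), dimnames = list(un1, un1))
m2 <- xtabs(v ~ ., d1)
m1[row.names(m2), colnames(m2)] <- m2
m1 <- m1 + t(m1)

# Check 1: symmetric (isSymmetric also requires matching row and column names).
is_sym <- isSymmetric(m1)

# Check 2: the diagonal is all zeros.
zero_diag <- all(diag(m1) == 0)

# Check 3: every input value is found under both orderings of its name.
pairs <- strsplit(names(v), " ", fixed = TRUE)
round_trip <- all(vapply(seq_along(v), function(k) {
  p <- pairs[[k]]
  m1[p[1], p[2]] == v[[k]] && m1[p[2], p[1]] == v[[k]]
}, logical(1)))

c(symmetric = is_sym, zero_diagonal = zero_diag, round_trip = round_trip)
```

The same three checks can be applied unchanged to the matrices produced by the other solutions.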

Solution 2:[2]

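1) We extract the vertex names v from the edge names using scan and then use a nested sapply to build the required matrix: for each pair (i, j) we look up the edge under "i j", then under "j i", and fall back to 0 if neither exists. No packages are used.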

edge2adj <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  sapply(v, function(i) sapply(v, function(j) 
    Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0) )))
}


# tests where v1 and v2 are the two examples in the question

edge2adj(v1)
##            x1         x2         x3         x4         x5
## x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

edge2adj(v2)
##             apple     banana     orange       pear       plum
## apple  0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## pear   0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## plum   0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
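
The lookup in edge2adj works because indexing a named vector with a name it does not contain returns NA rather than an error, so Find(Negate(is.na), ...) can simply take the first candidate that is not NA. A minimal sketch of that one step in isolation, using a made-up two-edge vector (the helper name lookup_edge and the values are assumptions for illustration):

```r
# Hypothetical edge vector with only two of the possible edges.
e <- c("x1 x2" = 0.8, "x1 x3" = 0.1)

# Try "i j", then "j i", then fall back to 0; Find returns the first
# element for which the predicate (here: "is not NA") holds.
lookup_edge <- function(i, j) {
  Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0))
}

forward  <- lookup_edge("x1", "x2")  # stored under "x1 x2"
reversed <- lookup_edge("x2", "x1")  # only found under the swapped name
diagonal <- lookup_edge("x1", "x1")  # no such edge, so the fallback 0
missing  <- lookup_edge("x2", "x3")  # edge absent from e, also 0
```

One consequence of the fallback is that an edge missing from the input becomes 0 silently instead of raising an error, which may or may not be what is wanted.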

2) (1) is likely preferable to this alternative because it is more general, but if we know that the edges are in the order shown in the question (upper triangular, column by column, with sorted vertex names) then we can fill the upper triangle directly using upper.tri. No packages are used.

edge2adj2 <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  m <- sapply(v, function(i) sapply(v, function(j) 0))
  m[upper.tri(m)] <- e
  m + t(m)
}

identical(edge2adj(v1), edge2adj2(v1))
## [1] TRUE

identical(edge2adj(v2), edge2adj2(v2))
## [1] TRUE
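
Since edge2adj2 depends entirely on the input order, it may be worth checking that order before using it. The sketch below derives the order in which upper.tri visits the cells (column by column) and compares it with the names of the input; the helper upper_tri_names and the two small test vectors are assumptions for illustration, not part of the original answer.

```r
# Names in the order in which m[upper.tri(m)] <- e fills the cells.
upper_tri_names <- function(labs) {
  n <- length(labs)
  # which() walks the logical matrix in column-major order
  idx <- which(upper.tri(matrix(0, n, n)), arr.ind = TRUE)
  paste(labs[idx[, "row"]], labs[idx[, "col"]])
}

labs <- c("x1", "x2", "x3")

# Hypothetical inputs: same edges, different order.
e_ok  <- c("x1 x2" = 1, "x1 x3" = 2, "x2 x3" = 3)
e_bad <- c("x1 x2" = 1, "x2 x3" = 3, "x1 x3" = 2)

ok_order  <- identical(names(e_ok),  upper_tri_names(labs))
bad_order <- identical(names(e_bad), upper_tri_names(labs))
```

For e_bad the comparison fails, and edge2adj2 would place the values in the wrong cells without any warning, so in that case edge2adj from (1) is the safer choice.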

Note

v1 <- c(
  "x1 x2" = 0.81899860,
  "x1 x3" = 0.10764701,
  "x2 x3" = 0.03923967,
  "x1 x4" = 0.03457240,
  "x2 x4" = 0.05954789,
  "x3 x4" = 0.15535316,
  "x1 x5" = 0.04041266,
  "x2 x5" = 0.05421003,
  "x3 x5" = 0.09198977,
  "x4 x5" = 0.15301872
)

v2 <- c(
  "apple banana" = 0.81899860,
  "apple orange" = 0.10764701,
  "banana orange" = 0.03923967,
  "apple pear" = 0.03457240,
  "banana pear" = 0.05954789,
  "orange pear" = 0.15535316,
  "apple plum" = 0.04041266,
  "banana plum" = 0.05421003,
  "orange plum" = 0.09198977,
  "pear plum" = 0.15301872
)

Solution 3:[3]

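Another base R way, indexing the matrix by row and column name:

comb = names(v)
# split each name into its two labels; inds is a 2 x length(v) character matrix
inds = sapply(comb, function(x) strsplit(x, split = " ", fixed = TRUE)[[1]])
# the same pairs with the labels swapped, for the lower triangle
inds1 = rbind(inds[2,], inds[1,])

# derive the labels from the names instead of hard-coding x1..x5
labs = sort(unique(c(inds)))
m = matrix(data = 0, nrow = length(labs), ncol = length(labs), dimnames = list(labs, labs))

# a two-column character matrix selects cells by (row name, column name)
m[t(inds)] = v
m[t(inds1)] = v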

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 tushaR