Issue with cor.test() results
I am running cor.test() to look for potential correlations between genes and bacteria in a dataframe using this code:
spearman <- cor.test(FullSet$counts.Bac, FullSet$counts.Gene, method = "spearman", alternative = "two.sided")
My dataframe is structured as follows:
Subject | name.Bac | counts.Bac | name.Gene | counts.Gene |
---|---|---|---|---|
10C | Finegoldia | -2.07 | CCL4 | 1.73 |
10C | Finegoldia | -2.07 | CKAP4 | 6.7 |
In total my dataframe has approximately 4 million rows as I am testing about 2000 genes against 33 bacteria across 24 patients.
When I run the above code I get this as the results:
Spearman's rank correlation rho
data: FullSet$counts.Bac and FullSet$counts.Gene
S = 1.1501e+19, p-value = 8.368e-09
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
-0.002845856
However, I was aiming to get a matrix of individual test results and p-values, one per bacterium-gene comparison, so that I could plot them using corrplot(). What is the best way to do this?
Solution 1:[1]
Maybe something like this?
Partition the data by name.Bac and name.Gene, and run the tests in an lapply loop. Then extract the relevant values with a sequence of sapply loops and form a results matrix with cbind.
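# Split into one data frame per bacterium-gene pair; drop = TRUE skips empty combinations
sp <- split(FullSet, list(FullSet$name.Bac, FullSet$name.Gene), drop = TRUE)
# Run one Spearman test per pair (cor.test's default method takes no data argument)
spearman_list <- lapply(sp, \(x) {
  cor.test(x$counts.Bac, x$counts.Gene, method = "spearman", alternative = "two.sided")
})
# Extract the test statistic and p-value from each result
stat <- sapply(spearman_list, `[[`, 'statistic')
pval <- sapply(spearman_list, `[[`, 'p.value')
cbind(stat, pval)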
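Since the question asks for a matrix to pass to corrplot(), the per-pair results could also be arranged as a bacteria-by-genes matrix of rho values with a matching matrix of p-values. The following is only a rough sketch under some assumptions: `toy` is a made-up stand-in for FullSet with the same column names, the names `rho_mat`, `p_mat` and `p_adj` are hypothetical, and `exact = FALSE` is chosen here so that tied values do not trigger warnings. The Benjamini-Hochberg adjustment at the end is an optional extra, not part of the original answer.

```r
# Toy data in the same long format as FullSet (all values are invented)
set.seed(1)
subjects <- paste0("S", 1:8)
bacs  <- c("BacA", "BacB")
genes <- c("Gene1", "Gene2", "Gene3")
bac_vals  <- matrix(rnorm(16), 8, 2, dimnames = list(subjects, bacs))
gene_vals <- matrix(rnorm(24), 8, 3, dimnames = list(subjects, genes))
toy <- expand.grid(Subject = subjects, name.Bac = bacs, name.Gene = genes,
                   stringsAsFactors = FALSE)
toy$counts.Bac  <- bac_vals[cbind(toy$Subject, toy$name.Bac)]
toy$counts.Gene <- gene_vals[cbind(toy$Subject, toy$name.Gene)]

# Empty result matrices: rows are bacteria, columns are genes
bac_levels  <- unique(as.character(toy$name.Bac))
gene_levels <- unique(as.character(toy$name.Gene))
rho_mat <- matrix(NA_real_, length(bac_levels), length(gene_levels),
                  dimnames = list(bac_levels, gene_levels))
p_mat <- rho_mat

# One Spearman test per bacterium-gene pair, written into the matrices
sp <- split(toy, list(toy$name.Bac, toy$name.Gene), drop = TRUE)
for (x in sp) {
  res <- cor.test(x$counts.Bac, x$counts.Gene, method = "spearman", exact = FALSE)
  b <- as.character(x$name.Bac[1])
  g <- as.character(x$name.Gene[1])
  rho_mat[b, g] <- unname(res$estimate)
  p_mat[b, g]   <- res$p.value
}

# Optional: adjust for multiple testing while keeping the matrix shape
p_adj <- p_mat
p_adj[] <- p.adjust(p_mat, method = "BH")

# With the corrplot package installed, something like this should plot it:
# corrplot::corrplot(rho_mat, is.corr = FALSE, p.mat = p_adj, sig.level = 0.05, insig = "blank")
```

Reading the pair names from each piece, rather than parsing the names that split() generates, avoids trouble when a bacterium or gene name itself contains a dot.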
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Rui Barradas |