Category "statistics"

Getting error while generating heatmap in Python: ValueError: Must pass 2-d input. shape=()

I am getting errors while generating maps of h3 moments in python for an image file called "image_test". The error apparently is for the shape of the values tha

Trying to replicate figures from Bayesian statistics without tears: A sampling-resampling perspective, but failed

I'm trying to replicate the three figures from the paper Bayesian statistics without tears: A sampling-resampling perspective, which can be fo

Calculating Mean Squared Error with Sample Mean

I was given this assignment, and I'm not sure if I understand the question correctly. We considered the sample-mean estimator for the distribution mean. Anothe

Random probability of rolling down to 1 from 1000 [closed]

If I roll a random number between 1 and 1000, then use this random number as the new range from 1 and new number for example if it rolled 50

Causal Inference where the treatment assignment is randomised

I have mostly worked with observational data where the treatment assignment was not randomized. In the past, I have used PSM and IPTW to balance and then calculate

fisher.test crashes R with *** caught segfault *** error

As the title says, fisher.test crashes R with a *** caught segfault *** error. Here is the code to reproduce the error: d<-matrix(c(1,0,5,2,1,90,0,0,0,1,0,14,0,0,0,0,0

How to plot correlation matrix/heatmap with categorical and numerical variables

I have 4 variables, of which 2 are nominal (dtype=object) and 2 are numeric (dtypes=int and float). df.head(1) OUT: OS_type|Week_day|clicks|avg_app_s

How to get multiple combinations of multiple lists in Python (Multiple n Choose K or nCr)

I have been looking on Google and Stack Overflow for a few hours and I am sure there is an answer for what this is mathematically or perhaps it is just what the

Statistical difference between linear regressions

I have a statistical question on which I am stuck: Imagine you have 5 corn fields. You know the number of corn plants in each field. You now want to c

Why a specific model is not appropriate, given data with 6 variables (they are chr variables)

I want to show why a specific model is not appropriate, given data with 6 variables (they are chr variables). The model is y= abc*(x1+x2) a and b from the data

How to test symmetry of distribution in Python?

Given data, I want to test the symmetry of their distribution. In R there is the function symmetry.test(..) https://www.rdocumentation.org/packages/lawstat/versions/3.4/topics

Issue with corr.test() results

I am running corr.test() to look at potential correlations between genes and bacteria in a dataframe using this code: spearman=cor.test(FullSet$counts.Bac, Full

P value and critical value in hypothesis testing

I need a little clarification on the p-value and critical-value approaches in hypothesis testing regarding the example below. Null Hypothesis : population mean = 80 Altern

How do I determine the likelihood of my data coming from a model distribution using Julia?

I am trying to do a statistical analysis in Julia on experimental data. I tried to create a model and use Turing to obtain distributions for the mean and standa

Can I include covariates outside of the minimally sufficient set in a causal framework that aren't in the causal pathway?

I am applying a causal method to a cohort study analysis on pollutant exposure and disease X. Based on our understanding of the disease, we believe that aging i

How to combine countpct and binomCI into the same summary statistic to be used in tableby function?

I'm using the tableby function from the arsenal package to create summary tables. For most of the statistics I need to generate, this package gives me exactly t

Difference in R and SPSS LMM output

I am working on a linear mixed model and am attempting to run one on the same data in R and SPSS. I'm using a treatment with two levels, looking at 10 different

How can I find the mode (a number) of a kde histogram in Python

I want to determine the X value that has the highest peak in the histogram. The code to print the histogram: fig=sns.displot(data=df, x='degrees', hue="TYPE", k

Generate underlying distribution from bins in Python

I found a PDF document describing the income distribution in the US in 1978. Per income range I have the percentage of the population that falls in that income

Perl: Finding mean and variance of large numbers without overflow

I am using a subroutine (stats) to calculate statistics for a list of numbers. These numbers may be big enough to lose precision if stored as normal Perl number