Category "statistics"

How to combine countpct and binomCI into the same summary statistic to be used in tableby function?

I'm using the tableby function from the arsenal package to create summary tables. For most of the statistics I need to generate, this package gives me exactly t

Difference in R and SPSS LMM output

I am working on a linear mixed model and am attempting to run it on the same data in R and SPSS. I'm using a treatment with two levels, looking at 10 different

How can I find the mode (a number) of a kde histogram in python

I want to determine the X value that has the highest peak in the histogram. The code to print the histogram: fig=sns.displot(data=df, x='degrees', hue="TYPE", k
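One way to approach this kind of question is to evaluate a kernel density estimate on a fine grid and take the x value where it is largest. The following is a minimal sketch using only the standard library, not the asker's seaborn code: the function name `kde_mode`, the fixed `bandwidth`, and the sample values are all assumptions made for illustration, and the curve seaborn draws may use a different bandwidth, so the peak can differ slightly.

```python
import math

def kde_mode(samples, bandwidth, grid_points=1000):
    """Return the grid x value where a Gaussian KDE of `samples` is highest."""
    lo = min(samples) - 3 * bandwidth
    hi = max(samples) + 3 * bandwidth
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]

    def density(x):
        # Unnormalised Gaussian kernel sum; the constant factor does not
        # change where the maximum is located.
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return max(grid, key=density)

# Hypothetical values standing in for one group of the 'degrees' column.
degrees = [10, 12, 11, 13, 12, 40, 41, 12, 11, 13]
mode_x = kde_mode(degrees, bandwidth=1.0)
print(mode_x)
```

With several groups (as with `hue` in the original plot), the same function could be called once per group.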

Generate underlying distribution from bins in python

I found a PDF document describing the income distribution in the US in 1978. Per income range I have the percentage of the population that falls in that income
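A simple way to turn binned percentages back into individual values is to pick a bin with probability equal to its share and then draw a point uniformly inside that bin. The sketch below assumes exactly that (a piecewise-uniform distribution); the function name `sample_from_bins` and the bin values are hypothetical and are not the 1978 figures from the question.

```python
import random

def sample_from_bins(bins, n, seed=None):
    """Draw n values: pick a bin by its share, then a uniform point inside it."""
    rng = random.Random(seed)
    weights = [share for _, _, share in bins]
    chosen = rng.choices(bins, weights=weights, k=n)
    return [rng.uniform(low, high) for low, high, _ in chosen]

# Hypothetical (low, high, percent) rows, for illustration only.
bins = [(0, 5000, 20.0), (5000, 15000, 50.0), (15000, 50000, 30.0)]
incomes = sample_from_bins(bins, n=10000, seed=0)
print(len(incomes), min(incomes), max(incomes))
```

Uniform sampling within a bin is the crudest choice; a smoother assumption (for example fitting a parametric distribution to the bin shares) would give a more realistic shape, especially in the open-ended top bin.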

Perl: Finding mean and variance of large numbers without overflow

I am using a subroutine (stats) to calculate statistics for a list of numbers. These numbers may be big enough to lose precision if stored as normal Perl number
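A common technique for this problem is Welford's one-pass update, which never forms the sum of squares and so avoids the largest intermediate values. The sketch below is written in Python rather than Perl and is not the asker's `stats` subroutine; it uses `fractions.Fraction` as a stand-in for an arbitrary-precision number type, and the name `running_stats` is hypothetical.

```python
from fractions import Fraction

def running_stats(values):
    """Return (mean, sample variance) using Welford's one-pass update."""
    count = 0
    mean = Fraction(0)
    m2 = Fraction(0)  # running sum of squared deviations from the current mean
    for x in values:
        x = Fraction(x)
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count < 2:
        raise ValueError("need at least two values for a sample variance")
    return mean, m2 / (count - 1)

mean, variance = running_stats([10**30 + 1, 10**30 + 2, 10**30 + 3])
print(mean, variance)
```

The same update works with floating-point numbers; exact rational arithmetic is used here only so that very large inputs keep every digit.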

How to perform a Levene's test using scipy

I've been trying to use scipy.stats.levene with no success. I have a numpy matrix with shape (2128, 45100). Each row is a sample and belongs to one of 3 cluste

How to run cor.test() on two different dataframes

I would like to run cor.test() on two separate dataframes but I am unsure how to proceed. I have two example dataframes with identical columns (patients) but di

returning cov and std from sklearn gaussian process?

I can return the covariance or the standard deviation from a GP using sklearn, like: y, cov = gp.predict(Xpredict,return_cov=True) y, std = gp.predict(Xpredict,

Strange statistics in Google Play Developer Console

Today I noticed strange statistics in my Google Play Developer Console for one of my applications. It is about Final installs on active devices: 17 July - th

How can I apply fisher.test in R to a large matrix, and extract p-values to a new matrix?

I have a large matrix (12 rows, 53 columns) with counts of how many times genes in my clusters "A", "B", "C", etc. overlap with clusters created by someone else

Error with SALib library for sensitivity analysis in Python

I am trying to perform sensitivity analysis using Sobol's method. I always get an error which I cannot solve. The code and the result are below. The input vari

Python : How to interpret the result of logistic regression by sm.Logit

When I run a logistic regression by sm.Logit (in the statsmodels library), part of the result is like this: Pseudo R-squ.: 0.4335 Log-Likeliho

Replacing missing values in R

I need help in replacing missing values in the following dummy file. The following rule needs to be followed when replacing a missing value. If the value is the

BlueSky Statistics Hanging

When I try to start BlueSky Statistics, sometimes the application hangs and the "Starting BlueSky Statistics" box remains on the screen. I see the app open in

R: Panel Data: calculating mean and median of variables based on date / dummy variable

So I am analysing fund panel data. I estimated a fixed effect model with double clustered error terms along the identification (ISIN) and (Date). Each fund has

Which value in the statsmodels summary does the error bar size in the plot relate to?

With the following code, I get a plot of how the regression was fitted to my data. The plot also shows vertical (error?) bars. To which number in the sum

How to get median and quartiles/percentiles of an array in JavaScript (or PHP)?

This question has been turned into a Q&A because I struggled to find the answer and think it can be useful for others. I have a JavaScript array of value
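The usual approach is to sort the array and interpolate linearly between the two closest ranks. The sketch below shows that logic in Python rather than JavaScript or PHP, so it is an illustration of the method and not the answer from the original Q&A; the function name `percentile` and the sample data are hypothetical. Other quantile conventions exist and give slightly different quartiles on small arrays.

```python
import math

def percentile(values, q):
    """Percentile q (0-100) with linear interpolation between closest ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100  # fractional index into the sorted list
    lower = math.floor(pos)
    upper = math.ceil(pos)
    if lower == upper:
        return ordered[lower]
    weight = pos - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight

data = [1, 2, 3, 4, 5, 6, 7, 8]
q1, median, q3 = (percentile(data, q) for q in (25, 50, 75))
print(q1, median, q3)
```

The logic ports directly to other languages: sort a copy of the array, compute the fractional position, and blend the two neighbouring elements.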

What exactly is a 'flow' in nfdump? Can I get TCP sessions with nfdump?

I need to create some statistics from packets on my network interface, but I'm concerned only with my TCP sessions. I thought I could do that with nfdump and nfs

Random document in ElasticSearch

Is there a way to get a truly random sample from an elasticsearch index? i.e. a query that retrieves any document from the index with probability 1/N (where N i

How to properly remove redundant components for Scikit-Learn's DPGMM?

I am using scikit-learn to implement the Dirichlet Process Gaussian Mixture Model: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/mixture/dp