Category "statistics"

How to perform a Levene's test using scipy

I've been trying to use scipy.stats.levene with no success. I have a numpy matrix with shape (2128, 45100). Each row is a sample and belongs to one of 3 cluste
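For reference, `scipy.stats.levene` takes one 1-D array per group, so a 2-D matrix usually has to be split by cluster and tested one column at a time. The following is a minimal sketch, not the asker's code: the small random matrix, the `labels` array, and the helper name `levene_per_feature` are all made up for illustration.

```python
import numpy as np
from scipy.stats import levene

# Hypothetical stand-in data: 60 samples (rows), 5 features (columns),
# three clusters of 20 rows each.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
labels = np.repeat([0, 1, 2], 20)
X[labels == 2, 0] *= 5.0  # inflate the variance of feature 0 in cluster 2


def levene_per_feature(X, labels):
    """Run Levene's test across clusters separately for each column of X."""
    groups = [X[labels == k] for k in np.unique(labels)]
    pvals = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        # levene expects each group as its own 1-D positional argument
        _, pvals[j] = levene(*(g[:, j] for g in groups))
    return pvals


pvals = levene_per_feature(X, labels)
```

With tens of thousands of columns, the resulting p-values would normally need a multiple-testing correction before interpretation.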

How to run cor.test() on two different dataframes

I would like to run cor.test() on two separate dataframes but I am unsure how to proceed. I have two example dataframes with identical columns (patients) but di

Returning cov and std from sklearn Gaussian process?

I can return the covariance or the standard deviation from a GP using sklearn, like: y, cov = gp.predict(Xpredict,return_cov=True) y, std = gp.predict(Xpredict,

Strange statistics in Google Play Developer Console

Today I noticed strange statistics in my Google Play Developer Console for one of my applications. It concerns Final installs on active devices: 17 July - th

How can I apply fisher.test in R to a large matrix, and extract p-values to a new matrix?

I have a large matrix (12 rows, 53 columns) with counts of how many times genes in my clusters "A", "B", "C", etc. overlap with clusters created by someone else

Error with SALib library for sensitivity analysis in Python

I am trying to perform sensitivity analysis using Sobol's method. I always get an error which I cannot solve. The code and the result are below. The input vari

Python: How to interpret the result of logistic regression by sm.Logit

When I run a logistic regression by sm.Logit (in the statsmodels library), part of the result is like this: Pseudo R-squ.: 0.4335 Log-Likeliho

Replacing missing values in R

I need help in replacing missing values in the following dummy file. The following rule needs to be followed when replacing a missing value. If the value is the

BlueSky Statistics Hanging

When I try to start BlueSky Statistics, sometimes the application hangs and the "Starting BlueSky Statistics" box remains on the screen. I see the app open in

R: Panel Data: calculating mean and median of variables based on date / dummy variable

I am analysing fund panel data. I estimated a fixed-effects model with double-clustered error terms along the identifier (ISIN) and the date (Date). Each fund has

Which value in the statsmodels summary does the error bar size in the plot relate to?

With the following code, I get a plot of how the regression was fitted to my data. The plot also shows vertical (error?) bars. To which number in the sum

How to get median and quartiles/percentiles of an array in JavaScript (or PHP)?

This question has been turned into a Q&A because I struggled to find the answer and think it can be useful for others. I have a JavaScript array of value
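The question targets JavaScript or PHP. As a language-neutral reference for checking a hand-written implementation, the same quantities can be computed with Python's standard `statistics` module. This is only a sketch: the sample values are made up, and the `"inclusive"` method (linear interpolation with the minimum and maximum treated as the 0th and 100th percentiles) is one of several quartile conventions, so other tools may give slightly different numbers.

```python
import statistics

# Made-up sample values; statistics.quantiles sorts the data internally.
values = [5, 1, 4, 2, 3]

median = statistics.median(values)

# n=4 returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# n=100 returns 99 cut points; index 89 is the 90th percentile.
percentiles = statistics.quantiles(values, n=100, method="inclusive")
p90 = percentiles[89]
```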

What exactly is a 'flow' in nfdump? Can I get TCP sessions with nfdump?

I need to create some statistics from packets on my network interface, but I'm concerned only with my TCP sessions. I thought I could do that with nfdump and nfs

Random document in ElasticSearch

Is there a way to get a truly random sample from an Elasticsearch index? i.e. a query that retrieves any document from the index with probability 1/N (where N i

How to properly remove redundant components for Scikit-Learn's DPGMM?

I am using scikit-learn to implement the Dirichlet Process Gaussian Mixture Model: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/mixture/dp

How to use `Dirichlet Process Gaussian Mixture Model` in Scikit-learn? (n_components?)

My understanding of "an infinite mixture model with the Dirichlet Process as a prior distribution on the number of clusters" is that the number of clusters is d

MCAR Little's test in Python

How can I execute Little's Test, to find MCAR in Python? I have looked at the R package for the same test, but I want to do it in Python. Is there an alternate

Does a plug-in selector bivariate kernel density estimator with weights exist for python?

I am trying to calculate the kernel density estimate for a set of weighted bivariate data points. I am currently using KDEpy.FFTKDE. However, this does not prov

How to show the y-axis of seaborn displot as percentage

I'm using seaborn.displot to display a distribution of scores for a group of participants. Is it possible to have the y axis show an actual percentage (example

Equivalent C# Function For Excel's Norm.S.Inv Function

I plan on finding the benchmark Z's of some data in C#. For this I need the Norm.S.Inv function from Excel. I am not able to find any sort of implementation for
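Norm.S.Inv is the inverse of the standard normal cumulative distribution function. The question asks for C#; as a reference to validate any port against, the same quantity is available in Python's standard library through `statistics.NormalDist.inv_cdf`. The wrapper name `norm_s_inv` below is hypothetical and only mirrors the Excel function name.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def norm_s_inv(p):
    """Return z such that P(Z <= z) = p for a standard normal Z.

    p must lie strictly between 0 and 1; inv_cdf raises
    statistics.StatisticsError otherwise.
    """
    return _STANDARD_NORMAL.inv_cdf(p)


z_975 = norm_s_inv(0.975)  # roughly 1.96, the familiar two-sided 95% value
z_half = norm_s_inv(0.5)   # the median of the standard normal
```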