Category "quantile"

Bucketing with QuantileDiscretizer using the groupBy function in PySpark

I have a large dataset like so:

    | SEQ_ID|RESULT|
    +-------+------+
    |3462099|239.52|
    |3462099|239.66|
    |3462099|239.63|
    |3462099|239.64|
    |3462099|239.57|
    |3462099|…
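For context, QuantileDiscretizer fits its bucket boundaries over the whole column and has no per-group mode, so per-SEQ_ID bucketing usually falls back on a window function such as ntile. A minimal sketch of both approaches, using the sample data above (the bucket column name and the 4-bucket choice are illustrative):

```python
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F
from pyspark.ml.feature import QuantileDiscretizer

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(3462099, 239.52), (3462099, 239.66), (3462099, 239.63),
     (3462099, 239.64), (3462099, 239.57)],
    ["SEQ_ID", "RESULT"],
)

# Global quartiles: boundaries are estimated from the entire RESULT column.
disc = QuantileDiscretizer(numBuckets=4, inputCol="RESULT", outputCol="RESULT_BIN")
global_bins = disc.fit(df).transform(df)

# Per-group quartiles: ntile(4) ranks within each SEQ_ID partition instead.
w = Window.partitionBy("SEQ_ID").orderBy("RESULT")
per_group_bins = df.withColumn("RESULT_BIN", F.ntile(4).over(w) - 1)
```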

Python qcut: At precision = 1, my first bin has an obnoxiously long decimal value for the left boundary

The following code creates quartile columns with bins:

    for (a, c) in zip(colnames, cols):
        cstats[c] = pd.qcut(cstats[a], 4, precision=1)

I understand tha…
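For what it's worth, qcut's precision parameter only controls how the interval labels are rounded for display; the computed edges keep their full precision, and qcut also nudges the lowest edge down slightly so the minimum value is included, which is where the long decimal comes from. A sketch of one way to get cleanly rounded edges, by retrieving them with retbins=True and re-cutting (the Series here is illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.random.default_rng(0).normal(100, 10, 1000))

# retbins=True also returns the raw quantile edges.
_, edges = pd.qcut(s, 4, retbins=True)

# Round the edges, then widen the outer ones so no value falls outside.
edges = np.round(edges, 1)
edges[0] -= 0.1
edges[-1] += 0.1
binned = pd.cut(s, edges, include_lowest=True)
```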

pandas: using qcut on a series with fewer values than quantiles

I have thousands of series (rows of a DataFrame) that I need to apply qcut on. Periodically there will be a series (row) that has fewer values than the desired…
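One workaround is to guard each row before calling qcut: drop NaNs, skip rows without enough distinct values, and pass duplicates="drop" so collapsed quantile edges are merged instead of raising. A minimal sketch (the DataFrame and the safe_qcut helper are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.0, 1.0, 1.0, 2.0, np.nan, np.nan],  # fewer distinct values than bins
])

def safe_qcut(row, q=4):
    vals = row.dropna()
    if vals.nunique() < 2:
        # Not enough distinct values to form even one pair of bin edges.
        return pd.Series(np.nan, index=row.index)
    # duplicates="drop" merges repeated quantile edges instead of raising.
    codes = pd.qcut(vals, q, labels=False, duplicates="drop")
    return pd.Series(codes, index=vals.index).reindex(row.index)

labels = df.apply(safe_qcut, axis=1)
```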

Performance of a Prometheus query for the quantile of pod memory usage

I'd like to get the 0.95 quantile of my pods' memory usage over the last x time. However, this query starts to take too long if I use a 'big' (7d / 10d) range. T…
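The usual tool here is quantile_over_time, which evaluates the quantile per series over the whole range and therefore gets slow on 7d/10d windows; precomputing a shorter-window quantile with a recording rule is the common mitigation. A sketch of the direct query issued through the Prometheus HTTP API (the endpoint, metric name, and label selector are assumptions):

```python
import requests

PROM_URL = "http://localhost:9090/api/v1/query"  # assumed Prometheus endpoint

# 0.95 quantile of each pod's memory usage over the last 7 days.
# On large ranges, consider a recording rule over e.g. [1h] and query that.
query = ('quantile_over_time(0.95, '
         'container_memory_usage_bytes{namespace="default"}[7d])')

resp = requests.get(PROM_URL, params={"query": query}, timeout=60)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    print(series["metric"].get("pod"), series["value"][1])
```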