I am trying to merge large dataframes using dask.dataframe.multi.merge_asof, but I am running into issues with accumulating unmanaged memory on the cluster.
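A minimal sketch of the merge itself, assuming two hypothetical time-indexed frames (the columns a and b are made up); dd.merge_asof wants both sides sorted on the join key, supplied here as a sorted index with known divisions:

```python
import pandas as pd
import dask.dataframe as dd

# Hypothetical data; both frames are sorted on the timestamp index
left = pd.DataFrame(
    {"a": range(1_000)},
    index=pd.date_range("2024-01-01", periods=1_000, freq="s"),
)
right = pd.DataFrame(
    {"b": range(100)},
    index=pd.date_range("2024-01-01", periods=100, freq="10s"),
)

dleft = dd.from_pandas(left, npartitions=4)
dright = dd.from_pandas(right, npartitions=2)

# Each left row is matched with the most recent right row at or before it
merged = dd.merge_asof(dleft, dright, left_index=True, right_index=True)
print(merged.compute().head())
```

The unmanaged-memory build-up is a separate, cluster-side concern (worker memory settings, partition sizes) rather than something the merge call itself controls.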
I have an xarray.Dataset with two 1D variables, sun_azimuth and sun_elevation, with multiple timesteps along the time dimension.
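A minimal reconstruction of such a Dataset, with made-up sample values standing in for the real data:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Arbitrary sample values; the real data would come from elsewhere
time = pd.date_range("2024-01-01", periods=6, freq="h")
ds = xr.Dataset(
    {
        "sun_azimuth": ("time", np.linspace(90.0, 270.0, 6)),
        "sun_elevation": ("time", np.linspace(5.0, 45.0, 6)),
    },
    coords={"time": time},
)
print(ds)
```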
I am building a custom graph for one operation with Dask. I am familiar with how to pass arguments to a function in a Dask graph and have read up on the docs.
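For reference, here is a small hand-built graph: arguments go directly into the task tuple and can be either literal values or the keys of other tasks (the key names are just placeholders):

```python
import dask

def add(x, y):
    return x + y

# Keys map to task tuples: (callable, arg1, arg2, ...).
# Arguments may be plain values or the keys of other tasks.
graph = {
    "a": 1,
    "b": 2,
    "sum-ab": (add, "a", "b"),     # add(1, 2)
    "total": (add, "sum-ab", 10),  # add(sum-ab, 10)
}

print(dask.get(graph, "total"))  # 13
```

dask.get runs the graph with the synchronous scheduler, which is handy for checking a custom graph before handing it to a cluster.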
I'm working on some data aggregation across a dask dataframe. The data is natively stored as Parquet, but I can read and manipulate it through dask.dataframe.
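A minimal sketch of the kind of aggregation involved, using a small in-memory frame in place of the real Parquet data (the column names key and value are assumptions):

```python
import pandas as pd
import dask.dataframe as dd

# Small in-memory stand-in for the Parquet-backed data
pdf = pd.DataFrame({"key": ["a", "b", "a", "b", "a"], "value": [1, 2, 3, 4, 5]})
df = dd.from_pandas(pdf, npartitions=2)
# In practice: df = dd.read_parquet("path/to/data.parquet", columns=["key", "value"])

# Lazy split-apply-combine aggregation, materialised with .compute()
agg = df.groupby("key")["value"].agg(["sum", "mean", "count"])
print(agg.compute())
```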
I am creating a dask dataframe from a pandas dataframe using the from_pandas() function, and I want to select two columns from the dask dataframe using square-bracket notation.
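A minimal sketch (the column names are made up): as in pandas, selecting several columns takes a list inside the brackets.

```python
import pandas as pd
import dask.dataframe as dd

pdf = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
ddf = dd.from_pandas(pdf, npartitions=2)

# Pass a list of column names to get a two-column dask dataframe back
subset = ddf[["a", "b"]]
print(subset.compute())
```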
The progress bar works beautifully when used with the multiprocessing backend but doesn't seem to work at all when using a distributed scheduler as the backend.
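dask.diagnostics.ProgressBar only hooks into the local schedulers (threads, processes, synchronous); with a distributed Client the usual substitute is distributed's own progress function, or the dashboard. A minimal sketch:

```python
import dask.array as da
from dask.distributed import Client, progress

client = Client()  # local distributed cluster for demonstration

x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
total = x.sum()

fut = client.compute(total)  # submit the graph to the cluster
progress(fut)                # text progress bar tied to distributed futures
print(fut.result())
```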
I have a dask cluster with n workers and want the workers to run queries against the database. But the database is only capable of handling m queries in parallel, where m < n.
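One way to cap the concurrency is distributed's Semaphore, which only lets m tasks hold a lease at a time; the query function, the limit of 4, and the fake result below are placeholders:

```python
from dask.distributed import Client, Semaphore

def run_query(q, sem):
    # Acquire one of the m leases before touching the database
    with sem:
        return f"result for query {q}"  # placeholder for the real DB call

client = Client()
sem = Semaphore(max_leases=4, name="database")  # m = 4 concurrent queries

futures = client.map(run_query, range(20), sem=sem)
print(client.gather(futures)[:5])
```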
I have an image stack stored in an XArray DataArray with dimensions time, x, y, on which I'd like to apply a custom function along the time axis of each pixel.
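One common pattern for this is xarray.apply_ufunc with time as a core dimension; the per-pixel function below is a placeholder that reduces each time series to a scalar:

```python
import numpy as np
import xarray as xr

# Hypothetical image stack with dims (time, x, y)
stack = xr.DataArray(np.random.rand(10, 4, 5), dims=("time", "x", "y"))

def per_pixel(ts):
    # ts is a 1D numpy array: one pixel's values over time (placeholder logic)
    return ts.max() - ts.min()

result = xr.apply_ufunc(
    per_pixel,
    stack,
    input_core_dims=[["time"]],  # consume the time axis per pixel
    vectorize=True,              # loop the scalar function over x and y
    # for a dask-backed stack, also pass dask="parallelized" and output_dtypes=[float]
)
print(result)  # dims: (x, y)
```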
I'm running into a problem using xarray together with SLURMCluster from dask_jobqueue. I'm using pandas_plink to load some data into an xarray object, then filtering it.
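A minimal sketch of the cluster side, assuming dask_jobqueue is installed; the resource numbers are placeholders for the real SLURM site, and the pandas_plink call is left commented because the actual file path is unknown:

```python
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# Resource requests are placeholders; adjust to the actual SLURM site
cluster = SLURMCluster(cores=8, memory="32GB", walltime="01:00:00")
cluster.scale(jobs=4)      # ask SLURM for 4 worker jobs
client = Client(cluster)   # subsequent dask/xarray work runs on those workers

# pandas_plink returns dask-backed xarray data, so the filtering and downstream
# math are scheduled on the SLURM workers through this client:
# from pandas_plink import read_plink1_bin
# G = read_plink1_bin("genotype.bed")  # hypothetical path
```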