Unable to call Dask Array Log10 on native datatypes
I'm aggregating data across a dask dataframe. The data is stored natively as parquet, but I can manipulate it up to the following lines. I am power-summing the log values stored in each row into a single vector and then converting the result back to a log value. It is the final step that I am having issues with:
slice_dataframe = source_dataframe[filter_idx]
# Linearize and sum. The line below works when calling vals.compute().
# slice_dataframe['data'].values returns a dask array.
vals = da.sum(10**(slice_dataframe['data'].values/10),axis=0)
# Cast back to log spacing. This does not work.
log_values = 10*da.log10(vals)
I get the following error:
''' TypeError: Parameters of such types are not supported by log10 '''
Any ideas?
Solution 1:[1]
After some reading in the documentation, I found that the error is a data-type problem.
The error originates in the line:
vals = da.sum(10**(slice_dataframe['data'].values/10),axis=0)
The result has the generic dtype object. Explicitly casting the value to float64 before calling log10 resolves the error posted above.
Fixed:
vals = da.sum(10**(slice_dataframe['data'].values/10),axis=0).astype('float64')
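For anyone who wants to try the cast on toy data, here is a minimal sketch. It assumes the column really does arrive with dtype object. The helper name power_sum_db and the sample values are made up for illustration. Note that it casts before the sum rather than after it, a small variation on the line above so that every later step runs on float64.

```python
import numpy as np
import dask.array as da


def power_sum_db(levels_db):
    """Power-sum dB values along axis 0 and return the result in dB.

    Hypothetical helper: casts to float64 first so that the later
    ufuncs never see an object-dtype array.
    """
    levels = levels_db.astype("float64")
    linear = 10 ** (levels / 10)  # dB -> linear power
    total = da.sum(linear, axis=0)  # sum the rows into one vector
    return 10 * da.log10(total)  # linear power -> dB


# Toy stand-in for slice_dataframe['data'].values (assumed object dtype).
raw = np.array([[10.0, 20.0], [10.0, 30.0]], dtype=object)
levels_db = da.from_array(raw, chunks=(1, 2))

result = power_sum_db(levels_db).compute()
print(levels_db.dtype, result.dtype)
print(result)
```

Checking .dtype on the dask array before calling log10 is a quick way to confirm whether this is the issue in your own pipeline.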
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | fnavyblu |