Analyzing unevenly spaced time series
I have been tasked with analyzing the input flow into a water tank in relation to a number of weather parameters. More specifically, I have to investigate any effect these variables might have on the input flow.
That said, I don't know which method(s) to apply; so far I have only thought of Pearson's correlation coefficient. Even for that, the sampling rates differ: the weather conditions are measured every 3 hours, while the input flow is measured every 5 minutes. Should I average the flow over each 3-hour window, discard the flow samples that do not match a weather timestamp, or would you suggest something else?

    weather = [(1.21, 0), (1.08, 0.5), (1.04, 1), (1.02, 1.5)]
    input_flow = [(120, 0), (124, 1)]

The above is a representation of the data: in each pair, the first element is the value of the parameter and the second is the time in seconds.
Solution 1:[1]
One way to achieve this:

```python
import numpy as np

a = np.arange(100).reshape(-1, 1)  # high-rate series, 100 samples
b = np.arange(10).reshape(-1, 1)   # low-rate series, 10 samples

# Expand "b" so it has as many rows as "a".
# Integer division is required: np.repeat rejects a float repeat count.
expansion_factor = a.shape[0] // b.shape[0]
b_expanded = np.repeat(b, expansion_factor, axis=0)

# Combine the two series column-wise into one array.
c = np.concatenate((a, b_expanded), axis=1)

# Inspect the result: each row pairs a value of "a" with the matching value of "b".
print(c)
```
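The repetition above works when the longer series is an exact multiple of the shorter one and both start at the same time. When the samples carry explicit timestamps, as in the `(value, time)` pairs from the question, the same forward-fill idea can be sketched with `np.searchsorted`. This is only an illustration: the names `idx`, `flow_on_weather_grid`, and `aligned` are made up here, and the code assumes both series are sorted by time and that no weather sample precedes the first flow sample.

```python
import numpy as np

# Sample data copied from the question: (value, time) pairs.
weather = [(1.21, 0), (1.08, 0.5), (1.04, 1), (1.02, 1.5)]
input_flow = [(120, 0), (124, 1)]

# Split each list into a value array and a time array.
weather_val, weather_t = np.array(weather, dtype=float).T
flow_val, flow_t = np.array(input_flow, dtype=float).T

# For each weather timestamp, find the index of the latest flow sample
# taken at or before it (both time arrays must be sorted ascending).
idx = np.searchsorted(flow_t, weather_t, side="right") - 1

# Assumption: no weather sample precedes the first flow sample;
# otherwise idx would contain -1 and silently wrap around.
assert (idx >= 0).all()

# Carry the last known flow value forward onto the weather timestamps.
flow_on_weather_grid = flow_val[idx]

# Columns: time, weather value, flow value.
aligned = np.column_stack((weather_t, weather_val, flow_on_weather_grid))
print(aligned)
```

Forward-filling keeps every sample of the faster series but reuses each value of the slower one several times, so the repeated rows are not independent observations.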
Another option is to use sparse matrices.
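The averaging option raised in the question goes the opposite way: instead of stretching the slow series, it shrinks the fast one. A minimal sketch follows, assuming the 5-minute flow readings line up exactly with the 3-hour weather windows, so that each window holds 36 flow readings. The data is synthetic and generated only for illustration, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for illustration only (not from the question):
# one weather reading every 3 hours, one flow reading every 5 minutes.
samples_per_window = (3 * 60) // 5   # 36 flow readings per weather reading
n_windows = 8                        # one day of 3-hour windows

weather = rng.normal(size=n_windows)
# The flow is built to follow the weather plus a little noise.
noise = rng.normal(scale=0.1, size=n_windows * samples_per_window)
flow = np.repeat(weather, samples_per_window) + noise

# Average the flow within each 3-hour window so both series
# end up with the same length.
flow_avg = flow.reshape(n_windows, samples_per_window).mean(axis=1)

# Pearson correlation coefficient between the weather variable
# and the window-averaged flow.
r = np.corrcoef(weather, flow_avg)[0, 1]
print(flow_avg.shape, r)
```

Averaging leaves one observation per weather reading, which avoids counting the same weather value many times when computing the correlation.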
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |