Analyzing unevenly spaced time series

I have been tasked with analyzing the input flow into a water tank in relation to a number of weather parameters. More specifically, I have to investigate any possible effect these variables might have on the input flow. I don't know which method(s) to apply; so far I can only think of Pearson's correlation coefficient. Even for that, the sampling rates differ: the weather conditions are measured every 3 hours, while the input flow is measured every 5 minutes. Should I average the flow over each 3-hour window, discard the flow readings that do not fall on a weather timestamp, or would you suggest something else?

weather = [ (1.21,0) , (1.08, 0.5), (1.04, 1), (1.02, 1.5)]
input_flow = [ (120,0), (124,1)]

This is a representation of the data, where the first element of each pair is the value of the parameter and the second is the time in seconds.



Solution 1:[1]

One way to achieve this is to repeat each value of the shorter series so that both series have the same length:

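import numpy as np

# "a" stands in for the densely sampled series, "b" for the sparsely sampled one.
a = np.arange(100).reshape(-1, 1)
b = np.arange(10).reshape(-1, 1)

# How do we expand "b" so that it has as many rows as "a"?
# np.repeat needs an integer count, so use integer division.
# This assumes len(a) is an exact multiple of len(b).
expansion_factor = a.shape[0] // b.shape[0]
b_expanded = np.repeat(b, expansion_factor, axis=0)

# How can we combine "a" and "b" into a single array?
c = np.concatenate((a, b_expanded), axis=1)

# Could this be what we want to achieve?
print(c)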
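Going the other way, the question's first option (averaging the dense series down to the sparse one) can be sketched with a reshape. This is only a minimal sketch on synthetic data, assuming the dense series holds exactly 36 samples per sparse sample (5-minute readings against 3-hourly ones). The names `samples_per_block`, `weather`, `flow` and `flow_mean` are illustrative and not part of the original answer.

```python
import numpy as np

# Assumption: 36 flow samples (5 min each) per weather sample (3 h each).
samples_per_block = 36
n_weather = 8

# Synthetic stand-in data, not real measurements.
rng = np.random.default_rng(0)
weather = rng.normal(size=n_weather)
noise = rng.normal(scale=0.1, size=n_weather * samples_per_block)
flow = np.repeat(weather, samples_per_block) + noise

# Average each block of 36 flow samples down to one value per weather reading.
flow_mean = flow.reshape(-1, samples_per_block).mean(axis=1)

# Pearson correlation between the two equal-length series.
r = np.corrcoef(weather, flow_mean)[0, 1]
print(flow_mean.shape, r)
```

Averaging keeps one row per weather reading, so the correlation is not inflated by repeated copies of the same weather value, at the cost of discarding the variation within each 3-hour window.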

Another option is to use sparse matrices.
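When the timestamps do not line up in a fixed ratio, the (value, time) pairs from the question can be aligned on time directly. The sketch below shows two possibilities under stated assumptions: keeping only readings whose timestamps occur in both series, and linearly interpolating one series onto the other's timestamps with `np.interp`. Interpolation is not something the original answer proposes, and whether it is appropriate depends on how smoothly the measured quantity varies.

```python
import numpy as np

# (value, time) pairs copied from the question.
weather = [(1.21, 0), (1.08, 0.5), (1.04, 1), (1.02, 1.5)]
input_flow = [(120, 0), (124, 1)]

# Split each list into a value array and a time array.
w_val, w_t = np.array(weather, dtype=float).T
f_val, f_t = np.array(input_flow, dtype=float).T

# Option 1: keep only the weather readings whose timestamp
# also occurs in the flow series (exact match on time).
mask = np.isin(w_t, f_t)
matched_weather = w_val[mask]

# Option 2: linearly interpolate the flow onto the weather timestamps.
# Outside the flow's time range, np.interp holds the nearest end value.
flow_on_weather_t = np.interp(w_t, f_t, f_val)

print(matched_weather)
print(flow_on_weather_t)
```

Either result has one flow value per retained weather reading, so the two arrays can be passed to `np.corrcoef` as equal-length series.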

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
