'Fitting steps of different amplitude to data using Matlab
I want to fit steps to the data in blue shown in the attached image to get the estimate of their amplitudes as shown in red (I drew those lines using Powerpoint).
How to go about it using Matlab? I tried using the derivative, and was marginally successful. Does Matlab have a function that can be utilized?
Solution 1:[1]
You can do this with a moving mean approach.
Let's start with some random data, where in your case you already have x
and y
:
nFlats = 10; wFlats = 50; % Number of flat regions and number of points per flat
rng(3); % Fix random number generation for the demo
x = (1:nFlats*wFlats).'; % x axis
y1 = repelem(rand(nFlats,1)*3,wFlats); % Flat steps
y2 = (rand(size(x))-0.5)/4; % Random noise
y = smooth(y1 + y2); % Make it more like OP's data
figure(1); clf; hold on;
legend( 'show' );
plot( x, y, '.', 'displayname', 'Points' );
plot( x, y1, '-r', 'linewidth', 1, 'displayname', 'Original steps' );
Now we can use the movmean
function to calculate a forwards and backwards moving mean.
I'm using 1/5 of the expected width of the flat regions as a weighting here. Your mileage may vary on this depending on your data, especially if you don't have an expected flat width in advance, so it may need some tuning.
mBk = movmean( y, [wFlats/5,0] ); % backwards moving mean
mFw = movmean( y, [0,wFlats/5] ); % forwards moving mean
plot( x, mBk, 'linewidth', 1, 'displayname', 'Backwards moving mean' );
plot( x, mFw, 'linewidth', 1, 'displayname', 'Forwards moving mean' );
Now comes the algorithm... From the above plot, we can see that either
- The backwards and forwards means agree within some tolerance, this is definitely a "flat" region.
- The moving means have a delta, which happens during the step transition. We can handle this by
- Starting each "flat" region from the midpoint between the end of the previous region within tolerance and the start of the next. This extends the start of each region back a little bit.
- Filling in any points at the end which haven't been accounted for, since these will be the flats which need extending to meet the start of the next region.
See the code comments for details, final result at the bottom
yEst = NaN(size(x)); % The final result
flatStart = 1; % Start index of current flat
flatEnd = 1; % End index of current flat
isFlat = false; % Track if current point is in a flat
thresh = 0.1; % Threshold to characterise "flatness", depends on data scale
for ii = 1:numel(x)
if (abs(mBk(ii) - mFw(ii)) < thresh) % Check tolerance
if ~isFlat
% Previous point wasn't flat, so extend the start point back to
% the transition mid-point and set isFlat to true
flatStart = floor((flatEnd+ii)/2);
isFlat = true;
end
flatEnd = ii; % Increment end of flat to at least current point
elseif (ii == numel(x))
flatEnd = ii; % Edge case: end point is also end of last flat
else
isFlat = false; % Outside tolerance, not flat
end
% Update the current region to be the mean over the whole flat
yEst( flatStart:flatEnd ) = mean(y(flatStart:flatEnd));
end
% Extend the end-points to meet the next flat
yEst = fillmissing( yEst, 'previous' );
% Plot
plot( x, yEst, 'k', 'linewidth', 1, 'displayname', 'Estimated steps' );
The only bit I've glossed over is the extension using fillmissing
. Technically the mean value used for each flat does not account for these points - you could have a final loop over the unique flat regions to calculate your mean values if needed to put a bow on this.
It seems to scale pretty well:
Depending on how noisy your data is relative to the steps, you may also want to add in a condition on the minimum step width, which could be done by not allowing a change in flatStart
unless flatEnd-flatStart > minWidth
, i.e. the tolerance condition becomes
if (abs(mBk(ii) - mFw(ii)) < thresh) || (flatEnd-flatStart) < minWidth
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |