I have recently been giving gslender a hand with a small bit of coding and, in particular, we were looking into alternative ways of doing median filtering. I had doubts about how useful median filters would be in reality, but they pose an interesting programming problem.
When I mentioned my doubts about median filters, we got into a discussion which overlapped with this thread. I had thought that I must have explained my reasoning here fairly well, because gslender included my tpsDOT changes in one of his patches. But he is still pretty doubtful about it and I thought that I should perhaps give it one more go to consolidate the reasoning.
Firstly, I must emphasise that this approach is purely for the dot variables. This is not to say that smoothing the MAP, CLT, etc. is unimportant, it's just that getting a good estimate of the rate of change of a value is not well served by smoothing that value first. IOW, smoothed MAP and mapDOT should be independently derived from the raw MAP samples.
OK, so I have dusted off the R statistical computer package and defined a simple test case along with five DOT smoothing approaches.
The test case:
A value, say TPS is running at a level reading of 0.2, ten samples in it ramps up at a constant 0.2 units per sample until it reaches 4.0 where it stops changing. To this ideal, I have added some gaussian noise with a sigma of 0.05 units to get our simulated raw values. Here's what it looks like:
raw.png
The algorithms:
- Raw: The rate of change is simply the change in y value.
- Median: The median of the most recent 9 values is used in each slot.
- Lag: A MS style lag of 50% is applied.
- Moving Anchor: the algorithm I described in this thread. A movement of more than 0.5 units is considered significant. More than 4 samples without significant movement means we've stopped changing.
- LSQ: My guess on what James and Ken might have in mind for this. The last five values have a line of best fit fitted. The slope of this line is used as the rate of change of the current value.
Code and results:
Here is the R code for these tests:
Code: Select all
#
# Implementation of MS lag averaging
#
lagavg<-function(a, lag=0.7)
{
rv<-a;
for (i in 2:length(a))
{
rv[i] = rv[i - 1]+(a[i] - rv[i - 1])*lag
}
return(data.frame(x<-1:length(rv), y<-rv))
}
#
# Implementation of the robs moving anchor algorithm
#
movanchor<-function(a, thresh=0.3, persist=4)
{
x<-1;
y<-a[1];
for (i in 2:length(a))
{
if (abs(y[length(y)] - a[i]) > thresh)
{
# Significant movement
x<-append(x,i);
y<-append(y, a[i]);
}
else if (i - x[length(x)] > persist)
{
# Reset the anchor
x<-append(x,i);
y<-append(y, a[i])
}
}
return(data.frame(x<-x, y<-y));
}
#
# Return the dot value for the data.frame provided
#
dotvalue<-function(f)
{
return(c(0, diff(f$y)/diff(f$x)));
}
#
# Median filter
#
median<-function(a, win=3)
{
#
# We want the results of runmed preceded by this many dummies
# since median(x) is to substitute for the last value in the
# window, not the centre.
#
h<-trunc(win/2)
x<-1:length(a);
y<-c(rep(a[1], h),(runmed(a,win))[1:(length(a)-h)])
return(data.frame(x<-x, y<-y));
}
#
# Least squares line of best fit
#
lsq<-function(a, win=5)
{
rv<-a;
#
# Go a window at a time getting the lsq win samples at a time
#
for (i in win:length(a))
{
q<-(lsfit(0:(win-1), a[(i-win+1):i]))$coefficients[2];
rv[i] = rv[i-1] + q;
}
return(data.frame(x=1:length(a), y=rv));
}
#
# Plot the values graph and its corresponding dot graph.
#
tplot<-function(ideal, orig, filt, col='black', title='')
{
#
# Plot the fixed data
#
yl<-range(c(ideal$y, filt$y))
plot(ideal$x, ideal$y, type='l', col='grey40', main=title,
xlab='x', ylab='y', ylim=yl)
points(orig$x, orig$y, col='orange')
lines(filt$x, filt$y, type='b', col=col)
#
# Plot the dot data
#
yl<-range(c(ideal$dot, filt$dot))
plot(ideal$x, ideal$dot, type='l', col='grey40', ylim=yl,
main=paste(title, "dot"), xlab='x', ylab='ydot')
lines(filt$x, filt$dot, type='b', col=col)
}
set.seed(1000)
pdf('noise.pdf', paper='a4')
#
# Firstly, the template waveform. Flat for 5, ramp for 20, flat for 5.
#
ideal<-data.frame(x=1:40, y=c(rep(2,10),2*(1:20),rep(40,10))/10)
ideal$dot<-dotvalue(ideal)
#
# Add Gaussian noise -- 30 samples, mean 0, sd 0.05
#
raw<-data.frame(x=1:40, y=ideal$y+rnorm(40,0,.05))
raw$dot<-dotvalue(raw)
#
# Median(9) of those points
#
med<-median(raw$y,9)
med$dot<-dotvalue(med)
#
# Lag average
#
lag<-lagavg(raw$y,0.7)
lag$dot<-dotvalue(lag)
#
# Moving anchor filter
#
manch<-movanchor(raw$y,0.5)
manch$dot<-dotvalue(manch)
#
# Moving window least-squares fit
#
lsqfit<-lsq(raw$y, 5)
lsqfit$dot<-dotvalue(lsqfit)
#
# Now plot the results
#
m<-par(mfrow=c(2,1))
tplot(ideal, raw, raw, 'red', 'Raw')
#plot(raw, type='b', col='orange', main='raw vs median(9)')
tplot(ideal, raw, med, col='red', title='Median')
#plot(raw, type='b', col='orange', main='raw vs lag(0.5)')
tplot(ideal, raw, lag, col='red', title='lag 50%')
#plot(raw, type='b', col='orange', main='raw vs movanchor(0.5)')
tplot(ideal, raw, manch, col='red', title='Moving Anchor')
tplot(ideal, raw, lsqfit, col='red', title='Moving Window LSQ')
#legend(0,9, c('raw','median(9)', 'lag 0.5', 'movanch 0.5'),
#fill=c('black','red','blue','orange'))
par(m)
dev.off()
And here is the document it generates:
noise.pdf
I did this as a PDF so you will be able to zoom in in your reader to get the full detail.
Just a quick explanation of the graphs. Each has a sensor values graph with a grey "ideal" line, an orange plot of the raw values and a red plot of the filtered values. There is also a sensor dot graph which shows the "ideal" dot line in grey, with the dot values returned by the current algorithm in red. This is the more important graph for dot purposes. The closer to the height and the start point, the better.
Raw gives something quite like what we're used to seeing. It gets the start and end of the acceleration event beautifully, but everything else is a jumble of noise.
Median gives us a similar result. It has greatly reduced the jitter on level running, but the accel event is as noisy as ever, just delayed by 5 samples. Woeful. I was a bit surprised to begin with, but it's really no surprise. When the samples are increasing, albeit roughly, the middle value of the last 9 will generally be 5 samples ago.
The 50% lag values are just like the raw samples with appreciable smoothing and a very slight delay.
The moving anchor has an appreciable delay, but it gets the levels pretty much spot on.
The moving window LSQ introduces a significant delay too, and eliminates most of the noise. Its general profile is very like the moving anchor approach.
So where does this get us? Nowhere much at the moment. I'd maintain that the moving anchor approach is the best of the bunch, but I'm happy to add more scenarios and/or algorithms (within reason) to the comparison, if you're interested.
Or you can run it yourself. R is free, and it's fairly easy to get started with (though pretty intricate when you get down to the details).
Have fun,
Rob.