Remove outlier from Historical variables  [Solved]

Easy to use, 100% Lua-based event scripting framework.

elmortero
Posts: 247
Joined: Sunday 29 November 2015 20:46
Target OS: Raspberry Pi / ODroid
Domoticz version: 3.9639
Location: Spain
Contact:

Remove outlier from Historical variables

Post by elmortero »

Hi,

I noticed strange behaviour from my "is my xxx finished" scripts: I was receiving notifications that the dishwasher was done when I hadn't even started the machine. I know my house is not yet smart enough to decide that on its own :D
Some quick investigation showed that this was caused by strange peaks in the current readings (800 watts while not in use), one single spike.
This caused the virtual switch I use for later triggering to be switched on, and a minute later off again. In turn the script sent a notification.

(Non-)options:
  1. Finding a way to avoid these spikes --> I deem this near impossible.
  2. The same way as detecting the 'in use' status: on the first value above 'idle' usage, start a counter (10 minutes in my case) and deduct 1 on every run of the script until it reaches 0. I don't like this one, because the 'in use' status would be delayed by 10 minutes, and then by 10 minutes again at the end.
  3. Applying hysteresis like in the heating scripts. This is the method I chose, but it posed another issue from the start:
    I decided on taking 3 readings and using their average. But with a spike of 800 watts and 2 zeros, the average is still about 267 watts, way above the threshold.
So what I wanted to do is remove the highest value from the historical data before taking the average. If there is a spike in the last 3 readings, it would be the outlier that gets removed.
From the dzVents documentation I knew how to find the highest value, but not how to exclude it.
This is what I came up with so far. It does the trick, but am I overlooking some function?

Code:

local WaReadings = domoticz.data.WaReadings
local clean_avg = (WaReadings.sum(1,3) - WaReadings.max(1,3)) / 2
In other words: get the readings, take the sum of all three, subtract the max value among them, and divide by 2 (the number of remaining values).
From there on, use that average value for triggering.
I just started applying it, so there is no really reliable data yet, but all seems fine. This assumes that the spikes do not last for more than 61 seconds.
Your insights and input would be really appreciated.
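
The arithmetic behind the one-liner above can be checked outside Domoticz. The following is only a plain-Lua sketch: an ordinary table stands in for the dzVents history (index 1 is the newest reading), and `plainAverage` and `trimmedAverage` are hypothetical helper names, not dzVents functions. It mirrors `(WaReadings.sum(1,3) - WaReadings.max(1,3)) / 2` but works for any history of at least two values.

```lua
-- Plain-Lua stand-in for a dzVents history; index 1 is the newest reading.
-- plainAverage and trimmedAverage are hypothetical helpers, not dzVents API.

local function plainAverage(readings)
    local sum = 0
    for _, value in ipairs(readings) do
        sum = sum + value
    end
    return sum / #readings
end

local function trimmedAverage(readings)
    -- With fewer than two readings there is nothing to trim.
    if #readings < 2 then
        return readings[1]
    end
    local sum, max = 0, -math.huge
    for _, value in ipairs(readings) do
        sum = sum + value
        if value > max then
            max = value
        end
    end
    -- Drop the single highest value and average what is left.
    return (sum - max) / (#readings - 1)
end

local spike = { 0, 800, 0 }      -- one 800 W spike between two idle readings
print(plainAverage(spike))       -- roughly 266.7 W, above a typical idle threshold
print(trimmedAverage(spike))     -- 0 W, the spike is gone
```

Note that the highest value is always dropped, spike or not, so during a real cycle the result is the average of the two lower readings.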
waaren
Posts: 6028
Joined: Tuesday 03 January 2017 14:18
Target OS: Linux
Domoticz version: Beta
Location: Netherlands
Contact:

Re: Remove outlier from Historical variables

Post by waaren »

elmortero wrote: Thursday 21 May 2020 21:31 In other words: get the readings, take the sum of all three, subtract the max value among them, and divide by 2 (the number of remaining values).
From there on, use that average value for triggering.
I just started applying it, so there is no really reliable data yet, but all seems fine. This assumes that the spikes do not last for more than 61 seconds.
Nice catch!

local clean_avg = (WaReadings.sum(1,3) - WaReadings.max(1,3)) / 2 -- Looks like a workable solution to prevent these false positives.

Would this line also work for you?
local clean_usage = WaReadings.min(1,2) -- takes the lowest value of the last two readings = ignoring the spike
Debian buster, bullseye on RPI-4, Intel NUC.
dz Beta, Z-Wave, RFLink, RFXtrx433e, P1, Youless, Hue, Yeelight, Xiaomi, MQTT
==>> dzVents wiki

Re: Remove outlier from Historical variables

Post by elmortero »

waaren wrote: Friday 22 May 2020 10:19
Would this line also work for you?
local clean_usage = WaReadings.min(1,2) -- takes the lowest value of the last two readings = ignoring the spike
Hey,
Thing is, we don't know if the spike is the 1st reading, do we?
That is if I understand correctly that the data is indexed by time.
Now, if there is a way to first sort the values we would be able to know.

Anyway, it seems that my way gets the job done.
It doesn't matter if there is a spike or not, the top value gets removed, so even if it is a real value, we get a reliable average.
Actually, it was quite fun investigating it :D

Re: Remove outlier from Historical variables

Post by waaren »

elmortero wrote: Friday 22 May 2020 10:35 Thing is, we don't know if the spike is the 1st reading, do we?
If it works don't touch it :)
But just to help me understand..

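I cannot think of a scenario where the spike would not be the newest (= 1st reading) at least once. On that run the second value is still a normal reading; on the next run the spike is the second value, but the newest one is normal again. So by taking the minimum of the last two values you will ignore the spike either way.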
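
That reasoning can be walked through with a small simulation. This is a plain-Lua sketch, not a dzVents script: the sample stream is made up, the history is an ordinary table kept newest-first with at most 3 items, and `minOfLastTwo` is a hypothetical stand-in for `WaReadings.min(1,2)`.

```lua
-- Hypothetical stand-in for WaReadings.min(1,2): lowest of the two newest readings.
local function minOfLastTwo(history)
    return math.min(history[1], history[2])
end

local stream  = { 0, 0, 800, 0, 0 }  -- made-up samples with one single spike
local history = { 0, 0 }             -- assumed idle before the stream starts
local results = {}

for _, reading in ipairs(stream) do
    table.insert(history, 1, reading) -- newest reading goes to index 1
    history[4] = nil                  -- keep at most 3 items
    results[#results + 1] = minOfLastTwo(history)
end

-- The spike is at index 1 on one run and at index 2 on the next,
-- yet the minimum of the two newest values stays at 0 throughout.
print(table.concat(results, " "))
```

A spike that lasts two consecutive samples would pass through this filter, which matches the assumption earlier in the thread that spikes do not outlast one reading interval.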

Re: Remove outlier from Historical variables  [Solved]

Post by elmortero »

You are right!
I failed to see the reasoning at first, because the spike is not always the newest reading. But I realised that it does not matter, as your method only takes the lower value.
The only thing I can find against it now is that in the worst case, as I only use a history of 3 values, we would be taking the 'average' of just 1 value.
Thanks for your insight!
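
One way to picture that last caveat is to compare both approaches on a single low sample in the middle of a running cycle. This is a plain-Lua sketch under assumptions: the 50 W threshold and the 1800 W readings are made up for illustration, and both helpers are hypothetical stand-ins for the dzVents expressions discussed in this thread.

```lua
local THRESHOLD = 50 -- watts; assumed value, not taken from the thread

-- Stand-in for (WaReadings.sum(1,3) - WaReadings.max(1,3)) / 2
local function trimmedAverage(h)
    return (h[1] + h[2] + h[3] - math.max(h[1], h[2], h[3])) / 2
end

-- Stand-in for WaReadings.min(1,2)
local function minOfLastTwo(h)
    return math.min(h[1], h[2])
end

-- Newest-first history: the machine is running and reports one low sample.
local dip = { 0, 1800, 1800 }

print(trimmedAverage(dip) > THRESHOLD) -- true: still counted as 'in use'
print(minOfLastTwo(dip) > THRESHOLD)   -- false: drops to idle right away
```

So the minimum of two values filters upward spikes well but reacts to any single low reading, while the trimmed average of three needs two low readings before it falls below the threshold.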