Lots of ZWave reliability issues

galinette
Posts: 68
Joined: Monday 11 December 2017 22:57
Target OS: Raspberry Pi / ODroid
Domoticz version:
Contact:

Lots of ZWave reliability issues

Post by galinette »

Hi,

I have a bunch of Qubino Z-Wave switches at home for controlling the electrical heaters, with a dzVents PID control script.

It worked flawlessly last winter. Now I have switched the system on again, and quite often (at least once a day) a heater gets stuck in the "On" state and the temperature rises a lot...

The Z-Wave switch output stays on, while it shows as off in Domoticz. Since my script only toggles the device if the desired state is different from the current state in Domoticz, it never sends another "Off" command as the temperature rises. Manually switching on and off solves the issue, until the next one.
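To make the failure mode concrete, here is a minimal sketch in Python (not the actual dzVents script). The `Heater` class, `toggle_on_change` and `always_resend` are hypothetical names for illustration, and the model assumes that Domoticz updates its own state even when the radio command is lost, which is the behaviour described in this thread.

```python
from dataclasses import dataclass


@dataclass
class Heater:
    """Hypothetical model: what Domoticz believes vs. what the relay really does."""
    believed_on: bool = False  # state shown in Domoticz
    actual_on: bool = False    # real relay output

    def send(self, on: bool, delivered: bool = True) -> None:
        # Assumption: the believed state changes even if the command is lost.
        self.believed_on = on
        if delivered:
            self.actual_on = on


def toggle_on_change(h: Heater, want_on: bool, delivered: bool = True) -> None:
    """Send a command only when the desired state differs from the believed one."""
    if h.believed_on != want_on:
        h.send(want_on, delivered)


def always_resend(h: Heater, want_on: bool, delivered: bool = True) -> None:
    """Send the desired state on every cycle, whatever Domoticz believes."""
    h.send(want_on, delivered)
```

With `toggle_on_change`, one lost "Off" command leaves the relay on forever, because the believed state already matches the desired one. Re-sending the desired state on each cycle (or every few cycles) recovers on the next delivered command, at the cost of more radio traffic.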

In the ZWave control panel logs, I get a bunch of:
2019-11-19 09:51:04.623 Error, Node021, ERROR: ZW_SEND_DATA could not be delivered to Z-Wave stack
2019-11-19 09:51:04.623 Always,
2019-11-19 09:51:04.623 Always, Dumping queued log messages
2019-11-19 09:51:04.623 Always,
2019-11-19 09:51:04.623 Always,
2019-11-19 09:51:04.624 Always, End of queued log message dump
2019-11-19 09:51:04.624 Always,
I have tried soft resetting the dongle (Aeotec Z-Wave Gen5), resetting the Raspberry Pi, and resetting the Z-Wave switches (by powering them off for 1 minute), but the issue comes back.

The Domoticz version is the stable one, and it was updated in May. Could this be the cause?
lost
Posts: 666
Joined: Thursday 10 November 2016 9:30
Target OS: Raspberry Pi / ODroid
Domoticz version:
Contact:

Re: Lots of ZWave reliability issues

Post by lost »

galinette wrote: Tuesday 19 November 2019 12:04 I have a bunch of qubino Z-Wave switches at home for controlling the electrical heaters, with a dzVents PID control script.
(...)
I have tried soft resetting the dongle (Aeotec Z-Wave Gen5), resetting the raspberry pi, and resetting the Zwave switches (by powering them off for 1 minute), and the issue comes back.

The domoticz version is the stable one, and it was updated in may. Could this be the cause'?
You could try healing the network, as this may be a network topology issue. I'm not sure a soft reset does this.

On my side, I manage my heaters with Qubino "pilot wire" modules. There is no external regulation there (it just sets some offsets to the heaters' own setpoints), but I also used to miss commands in the past.

IMO, the reason was that my schedules were changing the settings of many devices (I have 8 of them) at exactly the same time, which may be too much for Z-Wave bandwidth.
=> I changed a few schedule lines to spread the settings over 2 or 3 minutes.

Since then, no more issues.
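As an illustration of spreading the commands, here is a small Python sketch that computes one send offset per device over a given window. The function name `stagger_offsets` and the device names are made up for the example; in Domoticz the same effect is obtained by shifting timer lines or delaying commands in a script.

```python
def stagger_offsets(devices, window_s):
    """Spread one command per device evenly over window_s seconds.

    Returns a dict mapping each device name to its delay in seconds.
    """
    if not devices:
        return {}
    step = window_s / len(devices)
    return {name: round(i * step, 1) for i, name in enumerate(devices)}


# Hypothetical example: 8 heaters spread over 2 minutes, one command every 15 s.
offsets = stagger_offsets([f"heater{i}" for i in range(8)], 120)
```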

In my case, as regulation is still managed by the devices themselves, at least I was not in an "open loop" situation with overheating. But this was sometimes harming my money-saving heating strategy.

If this does not fix the issue, try to figure out whether the device XML config file changed with the update: sometimes, device firmware changes trigger modifications there that may harm older devices. If you don't have the old files anymore, use the Domoticz GitHub to trace the history/dates (from the last/current stable) of the file for your device type:

https://github.com/domoticz/domoticz/tr ... fig/qubino

Avoid trying the latest files updated for the beta that merged OZW 1.6, as the current stable (like the previous ones) uses 1.4: IMO, you may run into compatibility issues...

Post by galinette »

Hi,

I have tried the "heal network" already, without improvement. I also plugged the controller in through a USB extension cable, in order to avoid interference with nearby Wi-Fi equipment.

I have checked the network topology: everything is connected, but some devices only have 2 connections out of 20 devices, and not to the closest ones. Strangely, the topology map also never changes at all after healing/resetting.

I will try staggering the commands, although this will make my script slightly more complex. At the moment it runs every minute, and typically switches 2-3 heaters at a time (it's a somewhat complex sigma-delta, round-robin PID algorithm designed for good temperature stability & minimum peak power usage).

What is strange is that every web search on the topic leads to Qubino devices... They seem pretty unreliable.

Post by galinette »

I just changed my script so that it only toggles one switch at a time, at 10-second intervals.

It seems somewhat better, but I still get errors:
2019-11-19 17:52:10.219 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 22 (0x16)
2019-11-19 17:52:20.217 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 8 (0x08)
2019-11-19 17:52:30.211 OpenZWave: Domoticz has send a Switch command!, Level: 60, NodeID: 21 (0x15)
2019-11-19 17:52:40.241 OpenZWave: Domoticz has send a Switch command!, Level: 60, NodeID: 20 (0x14)
2019-11-19 17:52:50.248 OpenZWave: Domoticz has send a Switch command!, Level: 60, NodeID: 16 (0x10)
2019-11-19 17:53:00.235 OpenZWave: Domoticz has send a Switch command!, Level: 60, NodeID: 19 (0x13)
2019-11-19 17:54:00.275 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 21 (0x15)
2019-11-19 17:54:10.223 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 20 (0x14)
2019-11-19 17:54:13.552 Status: OpenZWave: Received timeout notification from HomeID: 3487780330, NodeID: 20 (0x14)
2019-11-19 17:54:20.217 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 16 (0x10)
2019-11-19 17:54:30.246 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 19 (0x13)
2019-11-19 17:54:40.240 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 23 (0x17)
As the logs show, Z-Wave gave up sending the command to node 20 just 3 seconds after sending it.

Is this normal behavior?
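For anyone who wants to measure this on their own log, here is a rough Python sketch that pairs each timeout notification with the last Switch command sent to the same node and reports the gap. It only assumes the line format shown in the excerpt above; the function name `timeout_gaps` is made up, and the sketch says nothing about which side actually timed out.

```python
import re
from datetime import datetime

# Three lines copied from the log excerpt above.
LOG = """\
2019-11-19 17:54:00.275 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 21 (0x15)
2019-11-19 17:54:10.223 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 20 (0x14)
2019-11-19 17:54:13.552 Status: OpenZWave: Received timeout notification from HomeID: 3487780330, NodeID: 20 (0x14)
"""

# Timestamp, free text, then the decimal node ID.
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (.*?)NodeID: (\d+)")


def timeout_gaps(log_text):
    """Return (node_id, seconds since the last command to that node) per timeout."""
    last_sent = {}
    gaps = []
    for line in log_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        node = int(m.group(3))
        if "Switch command" in m.group(2):
            last_sent[node] = when
        elif "timeout notification" in m.group(2) and node in last_sent:
            gap = (when - last_sent[node]).total_seconds()
            gaps.append((node, round(gap, 3)))
    return gaps
```

On the three lines above, the timeout for node 20 arrives about 3.3 seconds after the command.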

Here is my network topology around node 20. It seems well connected, but not directly to the controller, which is weird since the controller and heater node 20 are in the same room (kitchen), whereas other nodes located on the upper floor and on the other side of the house work better.
Screenshot from 2019-11-19 18-03-12.png
Geitje
Posts: 170
Joined: Monday 22 January 2018 21:52
Target OS: Raspberry Pi / ODroid
Domoticz version:
Contact:

Re: Lots of ZWave reliability issues

Post by Geitje »

I recently solved reliability issues in my Z-Wave network (the farthest devices were not reachable) by enabling polling on one device. I just used common sense to enable polling on the device that should be the only route to the farthest device (so not the farthest device itself, but the one just before it in the main network).
The downside of polling is battery use; however, I still don't find it draining much faster than the others. I read that you should only enable polling when you have problems.
Domoticz beta, on Raspberry Pi 3B, Raspian Buster
Zwave, Zigate, RFlink etc.

Post by lost »

galinette wrote: Tuesday 19 November 2019 18:01 As logs show, zwave gave up sending the command to node 20 just 3 seconds after sending it.

Is this a normal behavior??

Here is my network topology with node 20. It seems well connected, but not directly to the controller. Which is weird, since the controller and heater node 20 are in the same room (kitchen). Whereas other nodes located on upper floor and at the other side of home work better.
For the 3s timeout, this reminds me of this comment (in that case a timeout was suspected but was not the issue, because the command was immediately discarded):
https://www.domoticz.com/forum/viewtopi ... 43#p227843

So this is expected behavior.

It is strange that the device cannot reach the controller directly, but this may happen (fading). In the past, devices that were included close to the controller rather than at their final location (before the remote inclusion feature) also had issues, but in that case it was the direct route that was no longer available after inclusion close to the controller, so the opposite situation...

Anyway, having hops may add latency, even if 3s looks large. I sometimes also experience timeouts in my setup, but it still works: reaching the timeout triggers a retry...

I also note that the earlier messages were not timeouts but more likely related to the Z-Wave command stack: I still think this stack was not able to deliver commands at all because of too many messages at the same time.

So first check whether this does the job; if not, maybe post a message on the OZW GitHub to get more advice from there: some configuration knobs may help, or there may be tips to get a network heal to really update the current mesh.

I have many Qubino devices, and only one is really buggy: the smart meter that monitors my power consumption. It is not buggy on the measurement side (even if additive counters on the parameter side were not a good idea, so only current consumption values are easy to read) but for relay control. After an "on" switch, an auto-off happens ~2/3s later... then another "on" a few seconds after the first remains stable. According to Qubino, this is because the device reports back its changed status with too much delay for OZW. I had to write a device script to do the check and reproduce the manual switch sequence that leads to a stable state. A bit painful, this one, I must admit.

On top of the "pilot wire" modules, I only have a single relay module from them, the ZMNHND1, which works flawlessly (even for the I2 independent input, at least with my firmware, which is often flagged as buggy by users).

Post by lost »

Geitje wrote: Tuesday 19 November 2019 18:42 Downside of polling is battery use, however, I still not find it draining much faster then the others. I read you should only enable polling when you have problems.
Polling may make things worse most of the time, IMO. It is now only needed for very specific stuff or buggy devices.

Also, polling a battery device that is almost always sleeping will not work: only the device waking up (because of a change to report: movement, temperature change...; or a periodic wake-up, usually once or twice a day, to get any parameter changes from the controller) allows some communication. The controller is not able to initiate communication with a sleeping device.

Such a node should not even be able to relay for others. If it claims to be able to do so, it's a bug!

Post by Geitje »

You are right, my mistake. I did another change back then (nightly network heal), maybe that did the trick for me?

lost wrote: Wednesday 20 November 2019 9:45 ...... Such node should not even be able to relay others. If they claim being able to do so, it's a bug!

Post by galinette »

Hi,

Thanks for the replies. I will contact the OZW github team for more advice.

When you say:
lost wrote: Wednesday 20 November 2019 9:45 Such node should not even be able to relay others. If they claim being able to do so, it's a bug!
Were you mentioning my devices or the post from someone else?

And also, apart from my Z-Wave radio issues, when a command is dropped, the state in Domoticz becomes wrong. For instance, if I switch on the device and it fails with the log errors mentioned in my first post, the device stays off, but the Domoticz state is On.

Isn't this a bug in Domoticz or OpenZWave? When a command fails, shouldn't the state stay at its previous value?

Thanks

Post by Geitje »

I think this was in reply to my post, not yours.
galinette wrote: Wednesday 04 December 2019 10:31 When you say:
lost wrote: Wednesday 20 November 2019 9:45 Such node should not even be able to relay others. If they claim being able to do so, it's a bug!
Were you mentioning my devices or the post from someone else?

Post by lost »

galinette wrote: Wednesday 04 December 2019 10:31 Were you mentioning my devices or the post from someone else?
(...)
When a command fails, shouldn't the state stay to its previous value?
The comment on battery modules/relay ability was for Geitje.

Concerning command failure vs. Domoticz status, this may depend on the failure: for timeouts, the sent command may have made its way to the device, we don't know. But if the command was not sent at all, the state is certainly wrong. Maybe errors are just ignored there, behaving like other radio protocols that do not provide device feedback.
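The distinction between these cases could be sketched as follows in Python. This is only an illustration of the reasoning, not how Domoticz or OpenZWave actually handle it; `Outcome` and `displayed_state` are hypothetical names, and the choice to show the requested state after a timeout is an assumption.

```python
from enum import Enum


class Outcome(Enum):
    ACKED = "acked"        # the device confirmed the command
    TIMEOUT = "timeout"    # no confirmation: the command may or may not have arrived
    NOT_SENT = "not_sent"  # the command never left the controller


def displayed_state(previous, requested, outcome):
    """Return (state_to_show, is_certain) after a switch command."""
    if outcome is Outcome.ACKED:
        return requested, True
    if outcome is Outcome.NOT_SENT:
        # Nothing reached the device, so the previous state still holds.
        return previous, True
    # Timeout: assumption for this sketch, show the requested state
    # but flag it as uncertain.
    return requested, False
```

A control script could then re-send the command whenever `is_certain` is false, instead of trusting the displayed state.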

Anyway, if a Z-Wave device repeatedly fails, the switches attached to the faulty device will go red so we can notice.
rapscallion42
Posts: 2
Joined: Tuesday 03 March 2020 10:11
Target OS: Raspberry Pi / ODroid
Domoticz version: 4.11763
Location: Netherlands
Contact:

Re: Lots of ZWave reliability issues

Post by rapscallion42 »

I am seeing similar issues where the Domoticz state and the remote device state are different, which in my case causes issues with the heating system; see https://www.domoticz.com/forum/viewtopi ... =6&t=31559 for details. I would be interested in your view...
rrozema
Posts: 470
Joined: Thursday 26 October 2017 13:37
Target OS: Raspberry Pi / ODroid
Domoticz version: beta
Location: Delft
Contact:

Re: Lots of ZWave reliability issues

Post by rrozema »

I know this is an old post, but there is something fundamentally wrong in the reasoning, and understanding it correctly can point you in the right direction. The timeout you see reported in the log is not OpenZWave giving up waiting for the device; it is the device reporting that it gave up waiting for something it was expecting from the controller. Some devices are more sensitive to this, depending on their implementation, and it is triggered when communication is not completely reliable, most of the time due to the distance between the device and the controller. It is best resolved by adding a mains-powered device in between the controller and the device, and then healing the network (a network heal can take hours, so be very patient). Most battery-powered devices do not actively route traffic for other devices, so placing a battery-powered device in between will most likely not help.
galinette wrote: Tuesday 19 November 2019 18:01 (...)
2019-11-19 17:54:10.223 OpenZWave: Domoticz has send a Switch command!, Level: 0, NodeID: 20 (0x14)
2019-11-19 17:54:13.552 Status: OpenZWave: Received timeout notification from HomeID: 3487780330, NodeID: 20 (0x14)
(...)
As logs show, zwave gave up sending the command to node 20 just 3 seconds after sending it.