
Hardware Domoticz Watchdog

Posted: Tuesday 02 January 2024 19:43
by Geofrancis
I have had Domoticz crashing in weird ways over the years: sometimes it crashes, sometimes the Pi crashes, sometimes it's running but not responding.

I have tried all the software watchdog solutions and they helped, but there were still scenarios that could stop it from running. My first solution was a WiFi-controlled USB relay connected to the Pi's power so I could switch it off and on remotely using the Smart Life app, but this still required me to notice that something was wrong, usually when I couldn't log in.

So this got me thinking about how I could use an ESP32 chip to reboot the Pi if it stopped responding. To implement it I knew I needed some kind of command-and-response system. The first idea was to just have the ESP32 ping the Pi and reboot it if it stopped responding, but the Pi could reply to pings without Domoticz being operational.

So the next idea was to make a switch in Domoticz, change it and then check for the change; if it's not in the position it's supposed to be in, then it reboots. This worked, but there wasn't a direct connection from the Pi to the ESP32, so any network issue that blocked the response would cause it to reboot the Pi constantly.

The next idea was to wire it directly to the Pi GPIO, so that it won't be affected by network issues. I used the Domoticz guide https://www.domoticz.com/wiki/GPIO to set one pin as an input and another pin as an output. I used GPIO 17 and 27 for in and out.

I then made a basic Blockly script that sets the output pin high if the input pin is high.

I then made a script using the ESPEasy rule set that changes a switch and checks that it has moved. If it has, it resets and tries again in 10 minutes; if not, it reboots the Pi.

You could easily use any Arduino board to do this. I just used an ESP32 and ESPEasy because I had them there, but it doesn't need network access to work as it's just checking pin states.

pin 5 is connected to the PI4 Global_En pin to reset
pin 7 is connected to gpio 17 on the pi
pin 6 is connected to gpio 27 on the pi

Code:

On System#Boot do            //when the ESP boots: pulse output low, reset line high
  gpio,7,0
  gpio,5,1
  timerSet,1,600             //wait 10 minutes after boot before the first check
endon

On Rules#Timer=1 do          //send pulse to the Pi
  gpio,7,1
  gpio,8,0                   //turn on ESP32 LED
  timerSet,2,10              //wait 10 seconds for the reply
endon

On Rules#Timer=2 do
  //receivepulsepin: the switch input task, presumably the one reading pin 6
  if [receivepulsepin#State]=1   //response is correct
    gpio,5,1
    gpio,7,0
    gpio,8,1                 //turn off ESP32 LED
    timerSet,1,600           //check again in 10 minutes
  else                       //response is incorrect
    gpio,5,0                 //pull Global_En low to reset the Pi
    gpio,7,0
    timerSet,3,5             //hold reset for 5 seconds
  endif                      //was missing: closes the if before endon
endon

On Rules#Timer=3 do          //release reset and restart the counter
  gpio,7,0
  gpio,5,1
  timerSet,1,600             //wait 10 minutes
endon
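
For anyone who wants to follow the logic without flashing an ESP, here is a rough Python model of the three timers as a small state machine. It is only a sketch of how I read the rules: the class name `Watchdog`, the helper `pi_reply` and the `resets` counter are made up for illustration, and the LED on pin 8 is left out.

```python
from dataclasses import dataclass

CHECK_INTERVAL_S = 600   # timer 1: seconds between checks (from the rules)
REPLY_WAIT_S = 10        # timer 2: seconds to wait for the Pi's reply
RESET_HOLD_S = 5         # timer 3: seconds the reset line is held low


@dataclass
class Watchdog:
    """Hypothetical model of the ESP32 side: two outputs and one pending timer."""
    pulse_out: int = 0              # pin 7, wired to a Pi GPIO input
    reset_line: int = 1             # pin 5, wired to Global_En (low = reset)
    timer: int = 1                  # number of the timer that fires next
    delay: int = CHECK_INTERVAL_S   # seconds until that timer fires
    resets: int = 0                 # illustration only: how often the Pi was reset

    def fire(self, reply_pin: int) -> None:
        """Run the rule for the pending timer; reply_pin is the level seen on pin 6."""
        if self.timer == 1:                 # send the pulse and wait for an answer
            self.pulse_out = 1
            self.timer, self.delay = 2, REPLY_WAIT_S
        elif self.timer == 2:               # evaluate the answer
            self.pulse_out = 0
            if reply_pin == 1:              # the Pi echoed the pulse
                self.reset_line = 1
                self.timer, self.delay = 1, CHECK_INTERVAL_S
            else:                           # no echo: pull reset low
                self.reset_line = 0
                self.resets += 1
                self.timer, self.delay = 3, RESET_HOLD_S
        else:                               # timer 3: release reset, start over
            self.pulse_out = 0
            self.reset_line = 1
            self.timer, self.delay = 1, CHECK_INTERVAL_S


def pi_reply(pulse_out: int, healthy: bool) -> int:
    """Stand-in for the Blockly script: echo the input pin while Domoticz is alive."""
    return pulse_out if healthy else 0
```

Calling `fire()` once per expired timer walks through the same sequence as the rules: pulse, wait 10 seconds, then either go back to the 10 minute wait or hold reset low for 5 seconds.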
 
 

Re: Hardware Domoticz Watchdog

Posted: Wednesday 03 January 2024 8:43
by gizmocuz
Or you can install Domoticz in a docker (compose) environment and it will restart if needed automatically.

Another option is to switch to a systemd startup script to let it auto restart on failure
https://ma.ttias.be/auto-restart-crashe ... e-systemd/

Or you can use Monit to restart the service

If you are using Python scripts, the reason was probably a memory leak that is now hopefully solved in the latest stable/beta release.

Re: Hardware Domoticz Watchdog

Posted: Wednesday 03 January 2024 8:57
by Geofrancis
I have done all of that, this is not my first day. Domoticz has had many different problems over the years, so this is not for some specific problem. I have tried all the things you have listed.

You're assuming that the Pi can still run the script and that the hardware has not crashed due to power issues, a bad SD card, cosmic rays or some other random event, and that Domoticz is responding if the process is running. Things like a faulty sensor overloading it with requests, or a bad script stuck in a loop, will stop it from responding while the process is still running.

Monit only checks if the Domoticz process is running; it can't check that it's actually responding to requests.

A systemd startup script cannot check if it's responding to requests.

When running it in docker, the process can be restarted if it crashes but it still has no idea if it's responding to requests.

My method checks if it's responding within a reasonable time, not just that the process is running since running != functioning...
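
To show the difference between "running" and "responding" in software terms, here is a minimal Python sketch of a check that only passes when an HTTP request is answered within a time limit. The URL, the port and the function name `is_responding` are assumptions for illustration (a local install on port 8080 queried through `json.htm`); adjust them to your own setup. It still runs on the Pi or over the network, so it does not replace the hardware check.

```python
import http.client
import urllib.request

# Assumed address: Domoticz on the same machine, port 8080.
DOMOTICZ_URL = "http://127.0.0.1:8080/json.htm?type=command&param=getversion"


def is_responding(url=DOMOTICZ_URL, timeout=10.0, fetch=urllib.request.urlopen):
    """Return True only if the server answers with HTTP 200 within `timeout` seconds.

    `fetch` is injectable so the logic can be tried without a live server.
    """
    try:
        with fetch(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts and HTTP error statuses all end up here:
        # the process may exist, but it is not answering.
        return False
```

A process monitor would report "running" in every one of the failure cases above; this check reports a failure whenever the answer is missing or late.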

Re: Hardware Domoticz Watchdog

Posted: Friday 05 January 2024 7:47
by Denny64
You could use reverse logic. Use cron to launch a script every x minutes that activates a virtual button in Dmz, on which you set a Telegram notification that reaches your smartphone directly. Then, with a fantastic application called Automate, manage the notification as you want.
In this way you can check Dmz and Rpi at the same time.

Re: Hardware Domoticz Watchdog

Posted: Friday 05 January 2024 12:25
by Geofrancis
Denny64 wrote: Friday 05 January 2024 7:47 You could use reverse logic. Use cron to launch a script every x minutes that activates a virtual button in Dmz, on which you set a Telegram notification that reaches your smartphone directly. Then, with a fantastic application called Automate, manage the notification as you want.
In this way you can check Dmz and Rpi at the same time.
That won't work if the Pi has crashed, and it can't restart the Pi without more hardware. At that point you might as well just use that hardware to monitor it.

Re: Hardware Domoticz Watchdog

Posted: Friday 05 January 2024 12:35
by solarboy
I think this is a great idea, software solutions often have problems, this is one more layer of protection for our homes.

Re: Hardware Domoticz Watchdog

Posted: Friday 05 January 2024 19:10
by Denny64
@ Geofrancis.

Yes, it's possible for the hardware to crash, but since I started using Dmz (2017), the rare times it crashed were just a software issue (for me, obviously). In case of a hardware fault, you can use a cheap Sonoff switch on the Rpi power supply.

@solarboy

In this way, the few times that Dmz was blocked, I was notified and was able to restart it with simple commands via ssh.

Re: Hardware Domoticz Watchdog

Posted: Friday 05 January 2024 20:32
by Geofrancis
@Denny64. The best approach would be a combination of hardware and software watchdogs. You don't want to reset the pi unless you really have to, so having a software check running every few minutes with hardware check reduced to every 30 minutes would make sure it's had lots of time for any software watchdog to work before it will reset the pi.
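
As a rough illustration of that layering, here is a small Python sketch that picks the gentlest action for a given length of silence. The function name `next_action`, the action labels and the 5 minute software interval are assumptions for the example; only the 30 minute hardware interval comes from the post above.

```python
def next_action(minutes_unresponsive, soft_every=5, hard_after=30):
    """Return the action to take after Domoticz has been silent for N minutes.

    soft_every: assumed interval of the software watchdog, in minutes.
    hard_after: minutes before the hardware watchdog resets the Pi.
    """
    if minutes_unresponsive <= 0:
        return "none"                 # everything is answering
    if minutes_unresponsive >= hard_after:
        return "hardware_reset"       # last resort: pull the reset line
    if minutes_unresponsive % soft_every == 0:
        return "restart_service"      # software watchdog gets its turn
    return "wait"                     # between software checks


# Software restart attempts that fit in before the hardware reset:
attempts = [m for m in range(1, 30) if next_action(m) == "restart_service"]
```

With these example numbers the software watchdog gets five attempts (at 5, 10, 15, 20 and 25 minutes) before the hardware reset at 30 minutes.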

Re: Hardware Domoticz Watchdog

Posted: Friday 05 January 2024 21:29
by solarboy
Geofrancis wrote: Friday 05 January 2024 20:32 @Denny64. The best approach would be a combination of hardware and software watchdogs.
Absolutely this. I use this software to control lots of hardware: pumps, shutters, skylights, gates, irrigation. It cannot fail ever. This is why I have 2 pi's, 2 Aeotec Sticks, 2 USB sticks, 2 ssd drives, 2 UPS, 2 RFX sticks. Yet still it can fail and it is always software. Even with multiple backups I often have database corruption, memory leaks, changes of syntax ("apt" & "apt-get", "compose-up" & "compose up" anyone?) and hundreds of other niggles that come from working with FOSS. A last resort hardware solution like this should always be welcomed unless you find software to be infallible...

Re: Hardware Domoticz Watchdog

Posted: Friday 05 January 2024 21:31
by solarboy
I'm kind of shocked that this has been jumped on as something "bad" to be honest.

Re: Hardware Domoticz Watchdog

Posted: Thursday 11 January 2024 10:54
by Geofrancis
solarboy wrote: Friday 05 January 2024 21:29
Geofrancis wrote: Friday 05 January 2024 20:32 @Denny64. The best approach would be a combination of hardware and software watchdogs.
This is why I have 2 pi's, 2 Aeotec Sticks, 2 USB sticks, 2 ssd drives, 2 UPS, 2 RFX sticks.
How are you operating redundant Pis? Are you mirroring them? I have some spare Pi boards and the idea of having a hot spare controller sitting there is appealing. The hardware monitor could switch on a second Pi if the first one fails to respond after a set time. The first can be switched off by holding its reset pin low and the second switched on by releasing its reset.

Re: Hardware Domoticz Watchdog

Posted: Thursday 11 January 2024 11:02
by solarboy
I use a USB ethernet adaptor to preserve my fixed IP allocated by the router to my MAC. I found software programmable MAC unreliable.

I have a spare pi3b+ board and an SSD onto which I mirror the whole root partition monthly (using "SDcardcopier" via another USB pen, as having the same type of SSD and the same UID sent my pi into a panic).

I also have a spare Aeotec zwave stick which I clone with the Zensys tools and also zwave-js-ui.

I also have a spare zwave stick (Slaesh's).

I keep it all away from the main pi in a metal box :D

I have tested it and I can get a cloned system up and running reasonably fast.

Re: Hardware Domoticz Watchdog

Posted: Thursday 11 January 2024 11:03
by solarboy
Geofrancis wrote: Thursday 11 January 2024 10:54
solarboy wrote: Friday 05 January 2024 21:29
Geofrancis wrote: Friday 05 January 2024 20:32 @Denny64. The best approach would be a combination of hardware and software watchdogs.
This is why I have 2 pi's, 2 Aeotec Sticks, 2 USB sticks, 2 ssd drives, 2 UPS, 2 RFX sticks.
How are you operating redundant Pis? Are you mirroring them? I have some spare Pi boards and the idea of having a hot spare controller sitting there is appealing. The hardware monitor could switch on a second Pi if the first one fails to respond after a set time. The first can be switched off by holding its reset pin low and the second switched on by releasing its reset.
I like this idea, maybe it would be possible to automate it all with a 3rd controller as watchdog/controller.

Re: Hardware Domoticz Watchdog

Posted: Sunday 14 January 2024 11:35
by Denny64
To automatically reboot my system in the event of a hang, I used a Shelly One switch (with Espeasy firmware) on the RPI's power supply. If the Shelly does not receive the control message from the DMZ (generated by cron every 15 minutes), the Shelly turns off power to the RPI and, after 10 seconds, turns it back on. In this way it is possible to have an autonomous restart of the RPI both in the event of hardware and software failure.
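
The timing logic of such a heartbeat switch can be sketched in a few lines of Python. This is not the actual rule set running on the Shelly, just a model of the idea: the class name `HeartbeatSwitch` and the 60 second grace period are assumptions, while the 15 minute period and the 10 second off time are taken from the description above.

```python
from dataclasses import dataclass

HEARTBEAT_PERIOD_S = 15 * 60   # cron sends a control message every 15 minutes
POWER_OFF_S = 10               # power stays off for 10 seconds
GRACE_S = 60                   # assumed slack for a late message


@dataclass
class HeartbeatSwitch:
    last_seen: float = 0.0     # time of the last control message, in seconds

    def heartbeat(self, now: float) -> None:
        """Record a control message from the Rpi."""
        self.last_seen = now

    def overdue(self, now: float) -> bool:
        """True once a full period plus the grace time has passed in silence."""
        return now - self.last_seen > HEARTBEAT_PERIOD_S + GRACE_S

    def power_cycle_plan(self, now: float):
        """Return (off_at, on_at) if a power cycle is due, else None."""
        if not self.overdue(now):
            return None
        # Restart the clock from the moment power returns, so the Rpi gets
        # a full period to boot and send its next message.
        self.last_seen = now + POWER_OFF_S
        return (now, now + POWER_OFF_S)
```

Because the switch only reacts to missing messages, it covers both a hung Domoticz and a hung Rpi without knowing which one failed.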

Re: Hardware Domoticz Watchdog

Posted: Sunday 14 January 2024 11:41
by Kedi
Wouldn't it be better to investigate and solve the cause of the hang?
I had a Raspberry which was up for more than 1000 days without any problem and Domoticz which was up for more than 900 days without ever going down.

Re: Hardware Domoticz Watchdog

Posted: Sunday 14 January 2024 12:07
by Denny64
@ Kedi

Yes, I agree with you, but an extra check doesn't hurt, especially if it's simple and cheap.

Re: Hardware Domoticz Watchdog

Posted: Sunday 14 January 2024 17:31
by Geofrancis
The problem is, the Domoticz troubleshooting guide consists of essentially:

installed to an SSD.
look at the crash log: there isn't one.
database check: mine is fine.
disable hardware until it stops: I don't have any plugins installed, they are all virtual.
issue with a script: I have 29 Blockly scripts running, and as far as I know there is no way to check if one of them is causing the issue other than turning them off one by one. So if I turn each one off for a few days, I'll figure out which one it is in a few months' time... if that is even the issue.

Re: Hardware Domoticz Watchdog

Posted: Sunday 14 January 2024 19:08
by Denny64
Have you tried replacing the Rpi power supply?

Re: Hardware Domoticz Watchdog

Posted: Sunday 14 January 2024 19:28
by Geofrancis
3 different power supplies and a UPS. Logs don't indicate any power issues.

Re: Hardware Domoticz Watchdog

Posted: Sunday 14 January 2024 20:35
by Denny64
When nothing seems to resolve it... Have you tried swapping the Rpi with another one?