Hardware Domoticz Watchdog (topic is solved)

Moderator: leecollings

Geofrancis
Posts: 10
Joined: Tuesday 02 January 2024 19:28
Target OS: -
Domoticz version:
Contact:

Hardware Domoticz Watchdog

Post by Geofrancis »

I have had Domoticz crashing in weird ways over the years: sometimes Domoticz crashes, sometimes the Pi crashes, and sometimes it's running but not responding.

I have tried all the software watchdog solutions and they helped, but there were still scenarios that could stop it from running. My first solution was a WiFi-controlled USB relay connected to the Pi's power so I could switch it off and on remotely using the Smart Life app, but this still required me to notice that something was wrong, usually when I couldn't log in.

So this got me thinking about how I could use an ESP32 chip to reboot the Pi if it stopped responding. To implement it I knew I needed some kind of command-and-response system. The first idea was to just use an ESP32 to ping the Pi and reboot it if it stops responding, but the Pi could reply to pings without Domoticz being operational.

So the next idea was to make a switch in Domoticz, change it and then check for the change; if it's not in the position it's supposed to be in, the ESP32 reboots the Pi. This worked, but there wasn't a direct connection from the Pi to the ESP32, so any network issue would block the response and cause it to reboot the Pi constantly.

The next idea was to wire it directly to the Pi GPIO; this way it won't be affected by network issues. I used the Domoticz guide https://www.domoticz.com/wiki/GPIO to set one pin as an input and another pin as an output. I used GPIO 17 and 27 for in and out.

I then made a basic Blockly script that sets the output pin high if the input pin is high.

I then made a script using the ESPEasy rule set that changes a switch and checks that it has moved. If it has, it resets and tries again in 10 minutes; if not, it reboots the Pi.

You could easily use any Arduino board to do this; I just used an ESP32 and ESPEasy because I had them to hand. It doesn't need network access to work, as it's just checking pin states.

Pin 5 is connected to the Pi 4 Global_En pin to reset it
Pin 7 is connected to GPIO 17 on the Pi
Pin 6 is connected to GPIO 27 on the Pi

Code: Select all

On System#Boot do            // when the ESP boots: pulse output low, reset line high
  gpio,7,0
  gpio,5,1
  timerSet,1,600             // wait 10 minutes after boot before the first check
endon

On Rules#Timer=1 do          // send pulse to the Pi
  gpio,7,1
  gpio,8,0                   // turn on ESP32 LED
  timerSet,2,10              // wait 10 seconds for the reply
endon

On Rules#Timer=2 do
  if [receivepulsepin#State]=1   // response is correct: the Pi echoed the pulse
    gpio,5,1
    gpio,7,0
    gpio,8,1                 // turn off ESP32 LED
    timerSet,1,600           // check again in 10 minutes
  else                       // response incorrect
    gpio,5,0                 // pull reset low to reset the Pi
    gpio,7,0
    timerSet,3,5             // hold reset for 5 seconds
  endif
endon

On Rules#Timer=3 do          // release reset and restart the counter
  gpio,7,0
  gpio,5,1
  timerSet,1,600             // wait 10 minutes before the next check
endon
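
For anyone porting this to a plain Arduino or another controller, the rule set above boils down to a small state machine. What follows is only a rough Python sketch of that logic, not ESPEasy code: `Watchdog`, `tick`, `run` and the `pi_echoes` callback are hypothetical names made up for illustration, the LED handling is left out, and the timings simply copy the values used in the rules.

```python
# Sketch of the rule set's logic as a state machine (illustrative names only).
CHECK_INTERVAL = 600   # seconds between checks (timerSet,1,600)
REPLY_WINDOW = 10      # seconds to wait for the Pi's echo (timerSet,2,10)
RESET_HOLD = 5         # seconds to hold the reset line low (timerSet,3,5)


class Watchdog:
    def __init__(self, pi_echoes):
        # pi_echoes: callable returning True if the Pi mirrors the pulse pin
        self.pi_echoes = pi_echoes
        self.pulse = 0          # output to the Pi (gpio 7)
        self.reset_line = 1     # 1 = Pi running, 0 = Pi held in reset (gpio 5)
        self.state = "idle"
        self.timer = CHECK_INTERVAL
        self.resets = 0

    def tick(self, seconds=1):
        """Advance time and act when the current timer expires."""
        self.timer -= seconds
        if self.timer > 0:
            return
        if self.state == "idle":            # Timer 1: send the pulse
            self.pulse = 1
            self.state, self.timer = "waiting", REPLY_WINDOW
        elif self.state == "waiting":       # Timer 2: check the reply
            self.pulse = 0
            if self.pi_echoes():
                self.state, self.timer = "idle", CHECK_INTERVAL
            else:
                self.reset_line = 0         # pull reset low
                self.resets += 1
                self.state, self.timer = "resetting", RESET_HOLD
        elif self.state == "resetting":     # Timer 3: release reset
            self.reset_line = 1
            self.state, self.timer = "idle", CHECK_INTERVAL


def run(watchdog, seconds):
    """Simulate the given number of seconds and return the reset count so far."""
    for _ in range(seconds):
        watchdog.tick()
    return watchdog.resets
```

Passing `lambda: False` as `pi_echoes` simulates a hung Pi: the first reset happens 610 seconds after boot and the reset line is released 5 seconds later.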
 
 
Attachments
Screenshot 2024-01-02 185041.png
92f1e8e2fcc888726ff7838dad725dd94f43438d_2_690x301.jpeg
Screenshot 2024-01-02 184057.png
Last edited by Geofrancis on Wednesday 03 January 2024 9:30, edited 1 time in total.
gizmocuz
Posts: 2350
Joined: Thursday 11 July 2013 18:59
Target OS: Raspberry Pi / ODroid
Domoticz version: beta
Location: Top of the world
Contact:

Re: Hardware Domoticz Watchdog

Post by gizmocuz »

Or you can install Domoticz in a Docker (Compose) environment and it will restart automatically if needed.

Another option is to switch to a systemd startup script to let it auto restart on failure
https://ma.ttias.be/auto-restart-crashe ... e-systemd/

Or you can use Monit to restart the service

If you are using Python scripts, the reason was probably a memory leak that is now hopefully solved in the latest stable/beta release.
Quality outlives Quantity!

Re: Hardware Domoticz Watchdog

Post by Geofrancis »

I have done all of that; this is not my first day. Domoticz has had many different problems over the years, so this is not for some specific problem. I have tried all the things you have listed.

You're assuming that the Pi can still run the script, that the hardware has not crashed due to power issues, a bad SD card, cosmic rays or some other random event, and that Domoticz is responding just because the process is running. Things like it getting overloaded with requests from a faulty sensor, or a bad script stuck in a loop, will stop it from responding while the process is still running.

Monit only checks whether the Domoticz process is running; it can't check that it's actually responding to requests.

A systemd startup script cannot check whether it's responding to requests.

When running it in docker, the process can be restarted if it crashes but it still has no idea if it's responding to requests.

My method checks if it's responding within a reasonable time, not just that the process is running since running != functioning...
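
To make the "running != functioning" point concrete, here is a minimal Python sketch of a check that only passes when a real request is answered in time. It is an illustration rather than part of my setup: the address, the port and the `getversion` query are assumptions about a typical local install, and `http_get` and `domoticz_responds` are hypothetical helper names.

```python
import http.client
import json
import urllib.request

# Assumed address of a local Domoticz instance; adjust to your own setup.
DOMOTICZ_URL = "http://127.0.0.1:8080/json.htm?type=command&param=getversion"


def http_get(url, timeout):
    """Fetch the raw reply body, raising on connection errors or timeouts."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def domoticz_responds(url=DOMOTICZ_URL, timeout=10, fetch=http_get):
    """Return True only if a request is answered in time with status OK."""
    try:
        reply = json.loads(fetch(url, timeout))
    except (OSError, ValueError, http.client.HTTPException):
        # Refused connection, timeout or a garbled reply: not functioning,
        # even if the process itself is still listed as running.
        return False
    return isinstance(reply, dict) and reply.get("status") == "OK"
```

The `fetch` parameter is only there so the check can be tried without a live server, for example `domoticz_responds(fetch=lambda url, timeout: b'{"status": "OK"}')`.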
Denny64
Posts: 53
Joined: Friday 03 February 2017 11:34
Target OS: Raspberry Pi / ODroid
Domoticz version: 2023.1
Location: Italy
Contact:

Re: Hardware Domoticz Watchdog

Post by Denny64 »

You could use reverse logic. Use cron to launch a script every x minutes that activates a virtual button in Domoticz, on which you set a Telegram notification that reaches your smartphone directly. Then, with a fantastic application called Automate, handle the notification however you want.
In this way you can check Domoticz and the RPi at the same time.

Re: Hardware Domoticz Watchdog

Post by Geofrancis »

Denny64 wrote: Friday 05 January 2024 7:47 You could use reverse logic. Use cron to launch a script every x minutes that activates a virtual button in Domoticz, on which you set a Telegram notification that reaches your smartphone directly. Then, with a fantastic application called Automate, handle the notification however you want.
In this way you can check Domoticz and the RPi at the same time.
That won't work if the Pi has crashed, and it can't restart the Pi without more hardware. At that point you might as well just use that hardware to monitor it.
Last edited by Geofrancis on Friday 05 January 2024 15:17, edited 5 times in total.
solarboy
Posts: 300
Joined: Thursday 01 November 2018 19:47
Target OS: Raspberry Pi / ODroid
Domoticz version: 2024.6
Location: Portugal
Contact:

Re: Hardware Domoticz Watchdog

Post by solarboy »

I think this is a great idea. Software solutions often have problems; this is one more layer of protection for our homes.
Intel NUC with Ubuntu Server VM (Proxmox),mosquitto(docker),RFXtrx433E,zwavejsUI (docker),Zigbee2mqtt(docker),SMA Hub (docker),Harmony Hub plugin, Kodi plugin,Homebridge(docker)+Google Home,APC UPS,SMA Modbus,Mitsubishi MQTT, Broadlink,Dombus

Re: Hardware Domoticz Watchdog

Post by Denny64 »

@ Geofrancis.

Yes, it's possible for the hardware to crash, but since I started using Domoticz (2017), the rare times it crashed were just software issues (for me, obviously). In case of a hardware fault, you can use a cheap Sonoff switch on the RPi power supply.

@solarboy

In this way, the few times that Domoticz has hung, I was notified and was able to restart it with simple commands via SSH.

Re: Hardware Domoticz Watchdog

Post by Geofrancis »

@Denny64. The best approach would be a combination of hardware and software watchdogs. You don't want to reset the Pi unless you really have to, so running a software check every few minutes, with the hardware check reduced to every 30 minutes, would give any software watchdog plenty of time to work before the Pi is reset.
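
As a rough illustration of that layering, the Python sketch below picks the mildest action for a given outage length. The 5-minute software interval is an assumed value for "every few minutes", and `action_for` and `software_attempts_before_reset` are hypothetical names, not anything that exists in Domoticz or ESPEasy.

```python
SOFTWARE_CHECK_MIN = 5     # assumed interval for the software watchdog
HARDWARE_CHECK_MIN = 30    # hardware check interval suggested above


def action_for(minutes_unresponsive):
    """Pick the mildest action that fits how long Domoticz has been unresponsive."""
    if minutes_unresponsive < SOFTWARE_CHECK_MIN:
        return "wait"
    if minutes_unresponsive < HARDWARE_CHECK_MIN:
        return "restart service"    # job of the software watchdog
    return "reset pi"               # last resort, job of the hardware watchdog


def software_attempts_before_reset():
    """Number of software checks that fit inside one hardware interval."""
    return HARDWARE_CHECK_MIN // SOFTWARE_CHECK_MIN
```

With these assumed values the software watchdog gets six attempts before the hardware one steps in.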

Re: Hardware Domoticz Watchdog

Post by solarboy »

Geofrancis wrote: Friday 05 January 2024 20:32 @Denny64. The best approach would be a combination of hardware and software watchdogs.
Absolutely this. I use this software to control lots of hardware: pumps, shutters, skylights, gates, irrigation. It cannot fail, ever. This is why I have 2 Pis, 2 Aeotec sticks, 2 USB sticks, 2 SSD drives, 2 UPSes and 2 RFX sticks. Yet it can still fail, and it is always software. Even with multiple backups I often have database corruption, memory leaks, changes of syntax ("apt" & "apt-get", "compose-up" & "compose up", anyone?) and hundreds of other niggles that come from working with FOSS. A last-resort hardware solution like this should always be welcomed, unless you find software to be infallible...

Re: Hardware Domoticz Watchdog

Post by solarboy »

I'm kind of shocked that this has been jumped on as something "bad" to be honest.

Re: Hardware Domoticz Watchdog

Post by Geofrancis »

solarboy wrote: Friday 05 January 2024 21:29
Geofrancis wrote: Friday 05 January 2024 20:32 @Denny64. The best approach would be a combination of hardware and software watchdogs.
This is why I have 2 pi's 2 Aeotec Sticks, 2 USB sticks , 2 ssd drives, 2 UPS, 2 RFX sticks.
How are you operating redundant Pis? Are you mirroring them? I have some spare Pi boards, and the idea of having a hot spare controller sitting ready is appealing. The hardware monitor could switch on a second Pi if the first one fails to respond after a set time: the first can be switched off by holding its reset pin low, and the second switched on by releasing its reset.

Re: Hardware Domoticz Watchdog

Post by solarboy »

I use a USB ethernet adaptor to preserve my fixed IP allocated by the router to my MAC. I found software programmable MAC unreliable.

I have a spare Pi 3B+ board and an SSD, onto which I mirror the whole root partition monthly (using "SDcardcopier" via another USB pen, as having the same type of SSD and the same UID sent my Pi into a panic).

I also have a spare Aeotec zwave stick which I clone with the Zensys tools and also zwave-js-ui.

I also have a spare zwave stick (Slaesh's).

I keep it all away from the main pi in a metal box :D

I have tested it and I can get a cloned system up and running reasonably fast.

Re: Hardware Domoticz Watchdog

Post by solarboy »

Geofrancis wrote: Thursday 11 January 2024 10:54
solarboy wrote: Friday 05 January 2024 21:29
Geofrancis wrote: Friday 05 January 2024 20:32 @Denny64. The best approach would be a combination of hardware and software watchdogs.
This is why I have 2 pi's 2 Aeotec Sticks, 2 USB sticks , 2 ssd drives, 2 UPS, 2 RFX sticks.
How are you operating redundant pi's? Are you mirroring them? I have some spare pi boards and the idea of having a hot spare controller sitting is appealing. the hardware monitor could switch on a second pi if the first one fails to respond after a set time. it can just be switched off by holding the reset pin low and a second one switched on by releasing the reset.
I like this idea. Maybe it would be possible to automate it all with a third controller acting as watchdog/controller.

Re: Hardware Domoticz Watchdog

Post by Denny64 »

To automatically reboot my system in the event of a hang, I used a Shelly One switch (with ESPEasy firmware) on the RPi's power supply. If the Shelly does not receive the control message from Domoticz (generated by cron every 15 minutes), it turns off power to the RPi and, after 10 seconds, turns it back on. In this way the RPi restarts autonomously after both hardware and software failures.
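
The logic on the relay side is a simple dead-man's switch. Here is a minimal Python sketch of it, not the actual ESPEasy rules: `HeartbeatRelay` and its methods are hypothetical names, and the 60-second grace period for a late message is an assumption added for illustration.

```python
HEARTBEAT_INTERVAL = 15 * 60   # cron sends the control message every 15 minutes
POWER_OFF_TIME = 10            # seconds the relay stays off


class HeartbeatRelay:
    def __init__(self, now=0, grace=60):
        # grace: assumed extra allowance for a late message
        self.grace = grace
        self.deadline = now + HEARTBEAT_INTERVAL + grace
        self.power_on = True
        self.power_back_at = None

    def heartbeat(self, now):
        """Called whenever the control message arrives; pushes the deadline out."""
        self.deadline = now + HEARTBEAT_INTERVAL + self.grace

    def poll(self, now):
        """Called regularly on the relay's own clock; returns the relay state."""
        if self.power_on and now >= self.deadline:
            self.power_on = False                      # cut power to the RPi
            self.power_back_at = now + POWER_OFF_TIME
        elif not self.power_on and now >= self.power_back_at:
            self.power_on = True                       # restore power
            self.deadline = now + HEARTBEAT_INTERVAL + self.grace
        return self.power_on
```

Because the deadline is rearmed when power comes back, the RPi gets a full interval to boot and send its first message before it can be power-cycled again.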
Kedi
Posts: 536
Joined: Monday 20 March 2023 14:41
Target OS: Raspberry Pi / ODroid
Domoticz version:
Location: Somewhere in NL
Contact:

Re: Hardware Domoticz Watchdog

Post by Kedi »

Wouldn't it be better to investigate and solve the cause of the hang?
I had a Raspberry Pi which was up for more than 1000 days without any problem, and Domoticz which was up for more than 900 days without ever going down.
Logic will get you from A to B. Imagination will take you everywhere.

Re: Hardware Domoticz Watchdog

Post by Denny64 »

@ Kedi

Yes, I agree with you, but an extra check doesn't hurt, especially if it's simple and cheap.

Re: Hardware Domoticz Watchdog

Post by Geofrancis »

The problem is that the Domoticz troubleshooting guide essentially consists of:

Install to an SSD: done.
Look at the crash log: there isn't one.
Check the database: mine is fine.
Disable hardware until it stops: I don't have any plugins installed, they are all virtual.
Issue with a script: I have 29 Blockly scripts running, and as far as I know there is no way to check whether one of them is causing the issue other than turning them off one by one. So if I turn each one off for a few days, I'll figure out which one it is in a few months' time... if that is even the issue.

Re: Hardware Domoticz Watchdog

Post by Denny64 »

Have you tried replacing the Rpi power supply?

Re: Hardware Domoticz Watchdog

Post by Geofrancis »

3 different power supplies and a UPS. Logs don't indicate any power issues.

Re: Hardware Domoticz Watchdog

Post by Denny64 »

When nothing seems to resolve it... Have you tried swapping the RPi for another one?