How to monitor if Domoticz is still functioning correctly?

Hesmink
Posts: 168
Joined: Monday 22 June 2015 10:48
Target OS: Raspberry Pi / ODroid
Domoticz version:
Location: The Netherlands
Contact:

How to monitor if Domoticz is still functioning correctly?

Post by Hesmink »

Hello All,

Probably answered somewhere, but searching the forum for 'monitor' and 'Domoticz' doesn't really help.

Today I noticed Domoticz was down (crashed because my hs110 smart plug did not function correctly).
I do have a slave Domoticz instance that monitors my energy usage, so I'm thinking of also using that second instance to monitor the primary one (and vice versa).

Has anything already been built for that?
Is there an 'alive' query I can send to a Domoticz server that indicates that it is still functioning correctly?
Maxx
Posts: 58
Joined: Saturday 27 January 2018 20:59
Target OS: Raspberry Pi / ODroid
Domoticz version: Beta
Contact:

Re: How to monitor if Domoticz is still functioning correctly?

Post by Maxx »

I did something like this some time ago. There are probably better ways to do it, but this works for me.

Adapt it to your system: use your own IP addresses and create the required devices ('HeartcheckMirror' on the secondary, 'Heart' on the main).

Put this script on your secondary system:

Code: Select all

return {
    on = {
        devices = {},
        timer = {'every minute'},
    },
    -- persistent data; State: 0 = send heartbeat, 1 = wait for echo, 2 = report failure
    data = {
        heartbeatCounter = { initial = 0 },
        State            = { initial = 0 },
        timeOut          = { initial = 0 },
    },

    execute = function(dz, triggeredItem)
        local HeartcheckOut = dz.devices('HeartcheckMirror')
        if dz.data.State == 0 then
            dz.log('Increase heartbeat and send to main')
            dz.data.heartbeatCounter = dz.data.heartbeatCounter + 1
            local url = 'http://192.168.1.231:8080/json.htm?type=command&param=udevice&idx=5247&nvalue=0&svalue=' .. dz.data.heartbeatCounter
            dz.log(url)
            dz.openURL({
                url = url,
                method = 'GET',
                callback = 'dataRetrieved'
            })
            dz.data.timeOut = 0
            dz.data.State = 1
        elseif dz.data.State == 1 then
            dz.log('Wait for heartbeat from main')
            dz.data.timeOut = dz.data.timeOut + 1
            dz.log(HeartcheckOut.counter)
            dz.log(dz.data.heartbeatCounter)
            dz.log('Timeout timer : ' .. dz.data.timeOut)
            if HeartcheckOut.counter == dz.data.heartbeatCounter then
                -- main echoed the current counter: it is alive
                dz.data.State = 0
            elseif dz.data.timeOut > 10 then
                -- no echo after more than 10 runs (minutes)
                dz.data.State = 2
            end
        elseif dz.data.State == 2 then
            dz.log('No response from main')
            local text = "No response domoticz main"
            -- arguments: subject, message, priority, sound, extra, subsystem
            dz.notify('Heartbeat', text, dz.PRIORITY_HIGH, nil, nil, dz.NSS_TELEGRAM)
            dz.data.timeOut = 0
            dz.data.State = 0
        end
    end
}

And this in your main system:

Code: Select all

return {
    on = {
        devices = {},
        timer = {'every minute'},
    },

--    logging =   {level                   = domoticz.LOG_DEBUG,
--                 marker                  = "Heart"},

    execute = function(dz, triggeredItem)
        -- echo the counter received from the secondary back to its mirror device
        local HeartbeatIn = dz.devices('Heart')
        dz.log(HeartbeatIn.counter)
        dz.openURL({
            url = 'http://192.168.1.232:8080/json.htm?type=command&param=udevice&idx=33&nvalue=0&svalue=' .. HeartbeatIn.counter,
            method = 'GET',
            callback = 'dataRetrieved'
        })
    end
}
waltervl
Posts: 5148
Joined: Monday 28 January 2019 18:48
Target OS: Linux
Domoticz version: 2024.7
Location: NL
Contact:

Re: How to monitor if Domoticz is still functioning correctly?

Post by waltervl »

For standard monitoring on the primary device (but also on the secondary) you can follow the instructions in the wiki using the application monit:
https://www.domoticz.com/wiki/Monitoring_domoticz

They use the json command url:

Code: Select all

http://127.0.0.1:8080/json.htm?type=command&param=getversion
The response should contain '"status" : "OK"'.
You can do the same with a dzVents script if you just want to check the primary from the slave.
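As a rough illustration of the check described above, here is a minimal shell sketch that fetches the getversion URL and looks for "status" : "OK" in the reply. The function names is_status_ok and domoticz_alive are made up for this example, the timeouts are arbitrary, and it assumes the JSON API is reachable without authentication. It is not the monit configuration from the wiki.

```shell
#!/bin/bash
# Sketch only: function names and timeouts are assumptions, not part of Domoticz.

# is_status_ok JSON -> exit code 0 when the reply reports status OK.
# Whitespace around the colon is tolerated, since formatting may differ.
is_status_ok() {
    printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"OK"'
}

# domoticz_alive URL -> fetch URL and check the reply.
# --max-time keeps a hung web server from blocking the check.
domoticz_alive() {
    local reply
    reply=$(curl -s --connect-timeout 2 --max-time 5 "$1") || return 1
    is_status_ok "$reply"
}
```

From the slave you could then run something like: domoticz_alive "http://<primary-ip>:8080/json.htm?type=command&param=getversion" || echo "primary not responding", with <primary-ip> replaced by your own address.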
Domoticz running on Udoo X86 (on Ubuntu)
Devices/plugins: ZigbeeforDomoticz (with Xiaomi, Ikea, Tuya devices), Nefit Easy, Midea Airco, Omnik Solar, Goodwe Solar
Antori91
Posts: 136
Joined: Sunday 12 February 2017 17:12
Target OS: NAS (Synology & others)
Domoticz version: 4.10717
Location: France
Contact:

Re: How to monitor if Domoticz is still functioning correctly?

Post by Antori91 »

Hello,
I use my own Node.js code to monitor both the main and backup Domoticz servers (and also MQTT). If the main or backup Domoticz is detected as down, it tries to restart Domoticz. The code also synchronizes chosen devices between the main and backup Domoticz databases using MQTT: https://github.com/Antori91/Home_Automa ... Cluster.js
Domoticz High Availability Cluster: Synology Dz V4.10693 (Main) - Raspberry Dz V4.10717 (Backup) - Scripts Node.js
Alarm server: Raspberry - motionEye - iot_ALARM-SVR Node.js
Sensors/Actuators: ESP8266-Arduino
https://github.com/Antori91/Home_Automation
Hesmink
Posts: 168
Joined: Monday 22 June 2015 10:48
Target OS: Raspberry Pi / ODroid
Domoticz version:
Location: The Netherlands
Contact:

Re: How to monitor if Domoticz is still functioning correctly?

Post by Hesmink »

Thanks all, I opted for using a modified version of the simple bash shell script from the Wiki.
I modified it to only send a notification once, and use Pushover instead of Telegram.
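For anyone wanting the same "notify only once" behaviour, a minimal sketch of one way to do it is below. This is not Hesmink's actual script: the flag file path, the function names should_notify and send_pushover, and the XXXX token and user values are all placeholders for illustration.

```shell
#!/bin/bash
# Sketch only: notify on a state change, so a long outage yields one message.
# STATE_FILE is a hypothetical path; the file exists while an outage has been reported.
STATE_FILE="${STATE_FILE:-/tmp/domoticz-down.flag}"

# should_notify STATUS -> exit 0 only on the first failed check after an "OK".
should_notify() {
    if [ "$1" = "OK" ]; then
        rm -f "$STATE_FILE"      # recovered: re-arm the notification
        return 1
    fi
    if [ -e "$STATE_FILE" ]; then
        return 1                 # this outage was already reported
    fi
    touch "$STATE_FILE"          # remember that we reported it
    return 0
}

# send_pushover MESSAGE -> post MESSAGE to Pushover; token and user are placeholders.
send_pushover() {
    curl -s --form-string "token=XXXX" --form-string "user=XXXX" \
         --form-string "message=$1" https://api.pushover.net/1/messages.json
}
```

In a cron-driven check this would be used as: should_notify "$STATUS" && send_pushover "Domoticz is down", where STATUS holds the status field from the JSON reply (empty when there was no reply).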
erem
Posts: 230
Joined: Tuesday 27 March 2018 12:11
Target OS: Raspberry Pi / ODroid
Domoticz version: 2021.1
Location: Amsterdam/netherlands
Contact:

Re: How to monitor if Domoticz is still functioning correctly?

Post by erem »

Here is my script to check if Domoticz is active, and restart it if not.

Code: Select all

#!/bin/bash
#****************************************************************************
# program    : check-domoticz-active.sh
# programmer : RM
# date       : 2020-02-21
#
# install in cron with crontab -e to run every 5 minutes
#
#  m h dom mon dow   command
#  */5 * * * *       /home/pi/domoticz/scripts/check-domoticz-active.sh >/dev/null 2>&1
#
#
# prerequisites
# jq installed (apt-get install jq)
# make sure the cron entry (example above) has the full path to the script!!
#
# revision
# 0.0.1      2020-02-21   initial release
#
#
#****************************************************************************

i=0

while [ $i -lt 3 ]
do
	# check if domoticz responds to a json query
	DOMOTICZ=$(curl -s --connect-timeout 2 --max-time 5 "http://127.0.0.1:8080/json.htm?type=devices&rid=1")
	STATUS=$(echo "$DOMOTICZ" | jq -r '.status')
	if [ "$STATUS" == "OK" ] ; then
		echo "Domoticz responded"
		break			# all ok
	else
		i=$(( $i + 1 ))
		echo "Domoticz did not respond on try $i "
	fi
	sleep 5
done

# if we do not have an OK, domoticz did not respond, stop and start it
if [ "$STATUS" != "OK" ] ; then
	echo "Stopping domoticz"
	sudo systemctl stop domoticz.service
	sleep 10
	echo "Starting domoticz"
	NOW=$(date +"%Y-%m-%d %H:%M:%S")
	sudo systemctl start domoticz.service
	Exitcode=$? 	# save systemctl exit code
	if [ $Exitcode != 0 ] ; then
		echo "$NOW - sudo systemctl start domoticz.service failed with exit code $Exitcode"
	else
		echo "$NOW - sudo systemctl start domoticz.service completed with exit code $Exitcode"
	fi
fi

# end of source
Regards,

Rob
twoenter
Posts: 76
Joined: Sunday 17 February 2019 15:01
Target OS: NAS (Synology & others)
Domoticz version: 4.10
Contact:

Re: How to monitor if Domoticz is still functioning correctly?

Post by twoenter »

I use monit. I wrote a complete setup tutorial about it for Synology NAS:
https://www.twoenter.nl/blog/domoticz/d ... met-monit/
The blog is in Dutch, but you can translate it with Google if you need to ;-)
jannl
Posts: 625
Joined: Thursday 02 October 2014 6:36
Target OS: Raspberry Pi / ODroid
Domoticz version: 2022.2
Location: Geleen
Contact:

Re: How to monitor if Domoticz is still functioning correctly?

Post by jannl »

This is my script. It also checks whether I (or someone else) am logged on; in that case I do not want Domoticz to be restarted, because it is most likely down intentionally.

Code: Select all

#!/bin/bash
# check domoticz
# -m 20 limits the duration of the request to 20 seconds
WHOCOUNT=$( who | wc -l )
if [[ "${WHOCOUNT}" -ne 0 ]]
then
    echo "Someone is logged on. No further actions" >&2
    exit
fi

# extract the value of the "status" field from the JSON reply
status=$(curl -m 20 -s -i -H "Accept: application/json" "http://192.168.1.2:8080/json.htm?type=devices&rid=1" | grep "status" | awk -F: '{print $2}' | sed 's/,//' | sed 's/"//g')
if [ -n "$status" ]
then
        echo "Domoticz is already running"
else
        sudo /etc/init.d/domoticz.sh stop
        sleep 30
        sudo /etc/init.d/domoticz.sh start
        # report the restart via Pushover (token and user are placeholders)
        status=$(curl -s --form-string "token=XXXX" --form-string "user=XXXX" --form-string "message=Domoticz has been restarted" https://api.pushover.net/1/messages.json)
fi

lost
Posts: 616
Joined: Thursday 10 November 2016 9:30
Target OS: Raspberry Pi / ODroid
Domoticz version:
Contact:

Re: How to monitor if Domoticz is still functioning correctly?

Post by lost »

erem wrote: Sunday 11 April 2021 16:45 Here is my script to check if Domoticz is active, and restart it if not.
Hello,

Be aware that the Domoticz service may still be up, with the event system still running (schedules and scripts still executing), while the web server side is down, so no user interaction is possible.

In fact, this is what happened to me most often in the past!

So checking that the service is up may not always do the job. You can send a JSON query as in the other suggestions, or use httping (once installed) in a simple cron-triggered script.

On top of that, when the HTTP server side is down, stopping and restarting the service may not work: in some situations the HTTP port could not be bound at restart because it was still 'in use'. I then had to restart the whole system...

In the end, since Domoticz being down for whatever reason always means the HTTP server is down, that is all I check. If it is down, I retry a minute later. If it is still down, I first restart the service and wait one minute for the HTTP server to come up. If it is still down after that, I do a full reboot, after saving the last Domoticz log lines for post-mortem debugging if needed.

The /root/checkDomoticz.sh file:

Code: Select all

#!/bin/bash
# Check (from a crontab) that domoticz is up, and restart the whole Pi if needed...

domoticzUrl=localhost:8080
BN=`basename $0`

WHOCOUNT=$(who | wc -l)
if [ ${WHOCOUNT} -ne 0 ]
then
  logger $BN: Someone is logged on, no check.
  exit 0
fi

httping -c 5 -i 0 -t 1 --ts -v -Wsqg $domoticzUrl
STATUS=$?
if [ ${STATUS} -ne 0 ]
then
  logger $BN: Domoticz httping-ed KO, retry after 1mn... 
  sleep 1m
  # Retry once
  httping -c 5 -i 0 -t 1 --ts -v -Wsqg $domoticzUrl
  STATUS=$?
  if [ ${STATUS} -ne 0 ]
  then
    logger $BN: Still KO. Get last logs and try service restart then wait...
    tail -n 20 /tmp/domoticz.txt | logger
    service domoticz restart
    sleep 1m
    logger $BN: Check after service restart...
    httping -c 5 -i 0 -t 1 --ts -v -Wsqg $domoticzUrl
    STATUS=$?
    if [ ${STATUS} -ne 0 ]
    then
      logger $BN: Still KO. Get last logs and REBOOT...
      tail -n 20 /tmp/domoticz.txt | logger
      /sbin/shutdown --no-wall -r now
      STATUS=$?
    else
      logger $BN: Service restart OK !
    fi
  else
    logger $BN: Retry OK !
  fi
else
  logger $BN: Domoticz ALIVE !
fi

wait
logger $BN: DONE, status= ${STATUS} !!!
exit ${STATUS}
This is called from a root cron job every 30 minutes; here is the crontab line:

Code: Select all

0,30  *  *   *   *     /root/checkDomoticz.sh > /dev/null 2>&1
So far this has never failed, even though I have fewer issues than in the past (I am still using v4.10717 with a few web interface/JS fixes, by far the most stable version I have had so far).

Just don't forget to stop the cron job or rename the script when intentionally stopping the Domoticz service: I once wrecked a Raspbian version update in the middle of the process, which left a non-bootable system and required a full reinstall!

EDIT: I should add an apt lock check to this script, as I may forget about this when Debian 11 is out...
Added the logged-in user check suggested above by jannl; that's much better!
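The apt lock check mentioned in the EDIT is not part of the script above. A minimal sketch of what it might look like follows, assuming that looking for running apt, apt-get or dpkg processes is a good enough stand-in for inspecting the lock files themselves. The function name package_manager_busy is made up for this example, and pgrep (from procps) is assumed to be installed.

```shell
#!/bin/bash
# Sketch only: skip the watchdog while a package operation is in progress.

# package_manager_busy -> exit 0 when apt, apt-get or dpkg is running.
# pgrep -x matches the exact process name only.
package_manager_busy() {
    local name
    for name in apt apt-get dpkg; do
        if pgrep -x "$name" >/dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}
```

It could be called near the top of checkDomoticz.sh, next to the logged-in user check: if package_manager_busy; then logger "$BN: package update running, no check."; exit 0; fi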