Page 1 of 1

Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 28 October 2018 12:04
by BertB
I am running the latest beta on a RPI 3B (now V4.10110)
It has a USB RFLink gateway and a ZWave stick.

For some time now, I have problems with domoticz at reboot, lets say after an update. Also, domoticz stops quite often (once in two weeks on average)
Hence I started to log things.

I caught the following lines in the domoticz.txt log file.

Code: Select all

2018-10-27 15:36:09.377  Status: RFLink Detected, Version: 1.1 Revision: 48 Build: 4
2018-10-27 15:36:09.383  Status: MQTT: Subscribed
2018-10-27 15:36:36.200  MQTT: Topic: domoticz/in, Message: {"idx":152,"RSSI":5,"nvalue":0,"svalue":"17.90;50.20;1"}
2018-10-27 15:37:30.534  Error: WebServer:8080 thread seems to have ended unexpectedly (last update 68.000000 seconds ago)
2018-10-27 15:37:30.535  Error: Domoticz(3600) is exiting due to watchdog triggered...
2018-10-27 15:37:35.043  Error: Stack frame for all threads:

2018-10-27 15:37:35.043  Error: > [New LWP 3604]
2018-10-27 15:37:35.044  Error: > [New LWP 3605]
2018-10-27 15:37:35.044  Error: > [New LWP 3606]
2018-10-27 15:37:35.044  Error: > [New LWP 3607]
2018-10-27 15:37:35.044  Error: > [New LWP 3608]
2018-10-27 15:37:35.044  Error: > [New LWP 3609]
2018-10-27 15:37:35.044  Error: > [New LWP 3610]
2018-10-27 15:37:35.044  Error: > [New LWP 3611]
2018-10-27 15:37:35.044  Error: > [New LWP 3612]
2018-10-27 15:37:35.044  Error: > [New LWP 3614]
2018-10-27 15:37:35.044  Error: > [New LWP 3615]
2018-10-27 15:37:35.044  Error: > [New LWP 3616]
2018-10-27 15:37:35.044  Error: > [New LWP 3617]
2018-10-27 15:37:35.044  Error: > [New LWP 3618]
2018-10-27 15:37:35.044  Error: > [New LWP 3621]
2018-10-27 15:37:35.044  Error: > [New LWP 3622]
2018-10-27 15:37:35.044  Error: > [New LWP 3623]
2018-10-27 15:37:35.044  Error: > [New LWP 3624]
2018-10-27 15:37:35.044  Error: > [New LWP 3625]
2018-10-27 15:37:35.044  Error: > [New LWP 3626]
2018-10-27 15:37:35.044  Error: > [New LWP 3627]
2018-10-27 15:37:35.044  Error: > [New LWP 3628]
2018-10-27 15:37:35.044  Error: > [New LWP 3629]
2018-10-27 15:37:35.044  Error: > [New LWP 3630]
2018-10-27 15:37:35.044  Error: > [New LWP 3631]
2018-10-27 15:37:35.044  Error: > [New LWP 3632]
2018-10-27 15:37:35.044  Error: > [New LWP 3633]
2018-10-27 15:37:35.044  Error: > [New LWP 3634]
2018-10-27 15:37:35.044  Error: > [Thread debugging using libthread_db enabled]
2018-10-27 15:37:35.044  Error: > Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
2018-10-27 15:37:35.045  Error: > 0x76e30d10 in nanosleep () at ../sysdeps/unix/syscall-template.S:84
2018-10-27 15:37:35.045  Error: > 84    ../sysdeps/unix/syscall-template.S: No such file or directory.
2018-10-27 15:37:35.045  Error: >   Id   Target Id         Frame
2018-10-27 15:37:35.045  Error: > * 1    Thread 0x7632f000 (LWP 3600) "domoticz" 0x76e30d10 in nanosleep () at ../sysdeps/unix/syscall-templat
e.S:84
2018-10-27 15:37:35.045  Error: >   2    Thread 0x75f42430 (LWP 3604) "SQLHelper" syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
2018-10-27 15:37:35.045  Error: >   3    Thread 0x7573a430 (LWP 3605) "PluginMgr" syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
2018-10-27 15:37:35.045  Error: >   4    Thread 0x74c23430 (LWP 3606) "domoticz" 0x76c7f294 in epoll_wait () at ../sysdeps/unix/syscall-templa
te.S:84
2018-10-27 15:37:35.045  Error: >   5    Thread 0x74423430 (LWP 3607) "domoticz" 0x76c7f294 in epoll_wait () at ../sysdeps/unix/syscall-templa
te.S:84
etc etc

After reboot, it sometimes works fine, but I have to do a cold boot more often

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 04 November 2018 12:03
by BertB
Does nobody recognize the problem?

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Wednesday 09 January 2019 8:42
by haweidner
Hi,

i'm googling around to get some hints to solve my issue with this:

2019-01-08 23:47:19.558 Error: mainworker seems to have ended or hung unexpectedly (last update 301.000000 seconds ago)
2019-01-08 23:47:19.588 Error: Domoticz(13598) is exiting due to watchdog triggered...
2019-01-08 23:47:19.646 Error: Failed to start gdb, will use backtrace() for printing stack frame

This occurs after every start of domoticz, so my domoticz is not usable any more.

I started a discussion over there: https://www.domoticz.com/forum/viewtopi ... =6&t=26443

Also found this "Domoticz crashing with no reason" on https://www.domoticz.com/forum/viewtopic.php?t=25870

Maybe there is some dependencies. But not sure ... i'm not really a guy with programming experience ...

Regards
Harald

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 06 October 2019 8:43
by lost
Hello,

Got the same log yesterday, after stopping my PI3 (shutdown -h now) to make a SD image (I do so after every upgrade, after a while when everything looks OK... and upgraded to last stable about 1 month ago):

Code: Select all

Restart here:
2019-10-05 11:02:29.174  Status: Domoticz V4.10717 (c)2012-2019 GizMoCuz
2019-10-05 11:02:29.187  Status: Build Hash: b38b49e5, Date: 2019-05-09 13:04:08

... all inits OK, z-wave mainly... up to here ():

2019-10-05 11:02:39.674  Status: EventSystem: Script event triggered: /home/domo/domoticz/scripts/lua/script_device_alarmPanel.lua
2019-10-05 11:38:47.116  Status: OpenZWave: Received timeout notification from HomeID: 3664304520, NodeID: 9 (0x09)
2019-10-05 11:38:47.327  Error: mainworker seems to have ended or hung unexpectedly (last update 2179.000000 seconds ago)
2019-10-05 11:38:47.328  Error: Domoticz(701) is exiting due to watchdog triggered...
2019-10-05 11:38:48.913  Status: User: Admin initiated a switch command (77/PresenceGlobal/On)

2019-10-05 11:38:55.411  (GlobalPresence) Lighting 1 (PresenceGlobal)
2019-10-05 11:38:55.435  Status: LUA: Security : DISARM (Presence)
2019-10-05 11:38:55.440  (Z-Stick) Light/Switch (Alarm Type)
2019-10-05 11:38:55.455  Status: EventSystem: Script event triggered: /home/domo/domoticz/scripts/lua/script_device_alarmPanel.lua
2019-10-05 11:38:55.463  (Z-Stick) Light/Switch (Alarm Level)
2019-10-05 11:38:55.464  (Z-Stick) General/Alarm (Alarm Type: Burglar 7 (0x07))
2019-10-05 11:38:55.491  (Z-Stick) Light/Switch (MvtSalon)
2019-10-05 11:38:55.516  (Z-Stick) Light/Switch (Sensor)
2019-10-05 11:38:55.529  (Domoticz Internal) Security (SecPanel)
2019-10-05 11:38:55.536  Error: Stack frame for all threads:

2019-10-05 11:38:55.536  Error: > [New LWP 702]
2019-10-05 11:38:55.536  Error: > [New LWP 705]
2019-10-05 11:38:55.536  Error: > [New LWP 706]
2019-10-05 11:38:55.537  Error: > [New LWP 711]
2019-10-05 11:38:55.537  Error: > [New LWP 712]

... Lot of New LWP errors...

2019-10-05 11:38:55.538  Error: > [Thread debugging using libthread_db enabled]
2019-10-05 11:38:55.538  Error: > Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
2019-10-05 11:38:55.538  Error: > 0x76f00d60 in nanosleep () at ../sysdeps/unix/syscall-template.S:84
2019-10-05 11:38:55.538  Error: > 84	../sysdeps/unix/syscall-template.S: Aucun fichier ou dossier de ce type.
2019-10-05 11:38:55.538  Error: >   Id   Target Id         Frame 
2019-10-05 11:38:55.538  Error: > * 1    Thread 0x763fee30 (LWP 701) "domoticz" 0x76f00d60 in nanosleep () at ../sysdeps/unix/syscall-template.S:84
2019-10-05 11:38:55.538  Error: >   2    Thread 0x76011430 (LWP 702) "Watchdog" 0x76f0151c in __waitpid (pid=1012, stat_loc=0x76010598, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29

...
=> 2179s???

All time stopped for image???

Regards.

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 06 October 2019 9:19
by waaren
lost wrote: Sunday 06 October 2019 8:43 Got the same log yesterday, after stopping my PI3 (shutdown -h now)
2019-10-05 11:38:47.327 Error: mainworker seems to have ended or hung unexpectedly (last update 2179.000000 seconds ago)
It's time offset related. Domoticz starts before the time is properly set and see a time from the past (kept by fake-hwclock). When the time is adjusted to the time of an NTP server domoticz is not aware what happened and assume it was dead for xxxx seconds. Watchdog barks and stops domoticz.

Several ways to deal with this (delay domoticz start after shutdown / boot, install hardware clock, ntp-wait, etc..)

I chose ntp-wait method:

Code: Select all

sudo apt install ntp
Modified /etc/init.d/domoticz.sh to add ntp-wait in the do_start() function.

Code: Select all

#
# Function that starts the daemon/service
#
do_start()
{
        #wait for network time to complete
        ntp-wait
        if [[ $? -ne 0 ]];then
            echo ntp-wait did return with an error condition
            exit 1
        fi


Re: Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 06 October 2019 18:48
by lost
waaren wrote: Sunday 06 October 2019 9:19
lost wrote: Sunday 06 October 2019 8:43 Got the same log yesterday, after stopping my PI3 (shutdown -h now)
2019-10-05 11:38:47.327 Error: mainworker seems to have ended or hung unexpectedly (last update 2179.000000 seconds ago)
It's time offset related. Domoticz starts before the time is properly set and see a time from the past (kept by fake-hwclock). When the time is adjusted to the time of an NTP server domoticz is not aware what happened and assume it was dead for xxxx seconds. Watchdog barks and stops domoticz.

Several ways to deal with this (delay domoticz start after shutdown / boot, install hardware clock, ntp-wait, etc..)

I chose ntp-wait method:

Code: Select all

sudo apt install ntp
Modified /etc/init.d/domoticz.sh to add ntp-wait in the do_start() function.

Code: Select all

#
# Function that starts the daemon/service
#
do_start()
{
        #wait for network time to complete
        ntp-wait
        if [[ $? -ne 0 ]];then
            echo ntp-wait did return with an error condition
            exit 1
        fi

OK, makes sense. NTP is already installed but if Domoticz starts before...
Thanks a lot, I'll modify my startup script!

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Monday 07 October 2019 10:48
by Jan Jansen
waaren wrote: Sunday 06 October 2019 9:19
I chose ntp-wait method
I want to prevent further problems. Does it have to look like this?

Code: Select all

# Function that starts the daemon/service
#
do_start()
{
        #wait for network time to complete
        ntp-wait
        if [[ $? -ne 0 ]];then
            echo ntp-wait did return with an error condition
            exit 1
        fi

        # Return
        #   0 if daemon has been started
        #   1 if daemon was already running
        #   2 if daemon could not be started
        start-stop-daemon --chuid $USERNAME --start --quiet --pidfile $PIDFILE --exec $DAEMON --test > /dev/null \
                || return 1
        start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON -- \
                $DAEMON_ARGS \
                || return 2
}
Thanks in advance!

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Monday 07 October 2019 11:49
by waaren
Jan Jansen wrote: Monday 07 October 2019 10:48 I want to prevent further problems. Does it have to look like this?
Yes, but before you do that check if ntp is installed on your system by inspecting the results of commands:

Code: Select all

sudo ntptime | grep returns
sudo ntp-wait 2> /dev/null && echo 'ntp_wait result is OK' || echo 'ntp_wait result is NOT ok'


Re: Error at reboot Webserver ends and watchdog trigger

Posted: Thursday 07 November 2019 17:12
by HJSK
Hi, thanks I had also same problem, but (re)install NTP did the job.

Domoticz V4.10717 (c)2012-2019 GizMoCuz
Build Hash: b38b49e5, Date: 2019-05-09 13:04:08
Raspberry Pi B3 with raspbian Buster

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 01 March 2020 19:50
by roblom
waaren wrote: Sunday 06 October 2019 9:19
lost wrote: Sunday 06 October 2019 8:43 Got the same log yesterday, after stopping my PI3 (shutdown -h now)
2019-10-05 11:38:47.327 Error: mainworker seems to have ended or hung unexpectedly (last update 2179.000000 seconds ago)
It's time offset related. Domoticz starts before the time is properly set and see a time from the past (kept by fake-hwclock). When the time is adjusted to the time of an NTP server domoticz is not aware what happened and assume it was dead for xxxx seconds. Watchdog barks and stops domoticz.

Several ways to deal with this (delay domoticz start after shutdown / boot, install hardware clock, ntp-wait, etc..)

I chose ntp-wait method:

Code: Select all

sudo apt install ntp
Modified /etc/init.d/domoticz.sh to add ntp-wait in the do_start() function.

Code: Select all

#
# Function that starts the daemon/service
#
do_start()
{
        #wait for network time to complete
        ntp-wait
        if [[ $? -ne 0 ]];then
            echo ntp-wait did return with an error condition
            exit 1
        fi

When I try to install the ntp i get the folowing:

Code: Select all

pi@P1-Logger:~ $ sudo apt install ntp
Pakketlijsten worden ingelezen... Klaar
Boom van vereisten wordt opgebouwd
De statusinformatie wordt gelezen... Klaar
ntp is reeds de nieuwste versie (1:4.2.8p12+dfsg-4).
0 opgewaardeerd, 0 nieuw geïnstalleerd, 0 te verwijderen en 0 niet opgewaardeerd.
2 niet volledig geïnstalleerd of verwijderd.
Na deze bewerking zal er 0 B extra schijfruimte gebruikt worden.
Wilt u doorgaan? [J/n] j
Instellen van owserver (3.2p3+dfsg1-2) ...
Job for owserver.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status owserver.service" and "journalctl -xe" for details.
invoke-rc.d: initscript owserver, action "start" failed.
● owserver.service - Backend server for 1-wire control
   Loaded: loaded (/lib/systemd/system/owserver.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/owserver.service.d
           └─override.conf
   Active: activating (auto-restart) (Result: protocol) since Sun 2020-03-01 19:38:47 CET; 477ms ago
     Docs: man:owserver(1)
  Process: 11238 ExecStart=/usr/bin/owserver -c /etc/owfs.conf (code=exited, status=0/SUCCESS)
 Main PID: 11238 (code=exited, status=0/SUCCESS)

mrt 01 19:38:47 P1-Logger systemd[1]: owserver.service: Service RestartSec=100ms expired, scheduling restart.
mrt 01 19:38:48 P1-Logger systemd[1]: owserver.service: Scheduled restart job, restart counter is at 1.
mrt 01 19:38:48 P1-Logger systemd[1]: Stopped Backend server for 1-wire control.
dpkg: fout bij verwerken van pakket owserver (--configure):
 subproces van pakket owserver werd script post-installation geïnstalleerd gaf de foutwaarde 1 terug
dpkg: vereistenproblemen verhinderen de configuratie van owfs:
 owfs is afhankelijk van owserver (>= 3.2p3+dfsg1-2); maar:
  Pakket owserver is nog niet geconfigureerd.

dpkg: fout bij verwerken van pakket owfs (--configure):
 vereistenproblemen - blijft ongeconfigureerd
Fouten gevonden tijdens verwerken van:
 owserver
 owfs
E: Sub-process /usr/bin/dpkg returned an error code (1)
"systemctl status owserver.service" gives

Code: Select all

pi@P1-Logger:~ $ systemctl status owserver.service
● owserver.service - Backend server for 1-wire control
   Loaded: loaded (/lib/systemd/system/owserver.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/owserver.service.d
           └─override.conf
   Active: failed (Result: exit-code) since Sun 2020-03-01 19:46:57 CET; 2min 35s ago
     Docs: man:owserver(1)
  Process: 13016 ExecStart=/usr/bin/owserver -c /etc/owfs.conf --foreground (code=exited, status=1/FAILURE)
 Main PID: 13016 (code=exited, status=1/FAILURE)

mrt 01 19:46:57 P1-Logger systemd[1]: owserver.service: Service RestartSec=100ms expired, scheduling restart.
mrt 01 19:46:57 P1-Logger systemd[1]: owserver.service: Scheduled restart job, restart counter is at 5.
mrt 01 19:46:57 P1-Logger systemd[1]: Stopped Backend server for 1-wire control.
mrt 01 19:46:57 P1-Logger systemd[1]: owserver.service: Start request repeated too quickly.
mrt 01 19:46:57 P1-Logger systemd[1]: owserver.service: Failed with result 'exit-code'.
mrt 01 19:46:57 P1-Logger systemd[1]: Failed to start Backend server for 1-wire control.
Any idea's?

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 01 March 2020 20:05
by waaren
roblom wrote: Sunday 01 March 2020 19:50

Code: Select all

mrt 01 19:46:57 P1-Logger systemd[1]: owserver.service: Failed with result 'exit-code'.
Any idea's?
I don't think owserver (one wire) is related to NTP.

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 01 March 2020 21:46
by roblom
Sounds unlogical to me too but why do I get the errors when I try to install ntp?
Both problems (1wire and the crash after a restart) are since the mandatory update to buster.

Re: Error at reboot Webserver ends and watchdog trigger

Posted: Sunday 01 March 2020 22:39
by waaren
roblom wrote: Sunday 01 March 2020 21:46 Sounds unlogical to me too but why do I get the errors when I try to install ntp?
Both problems (1wire and the crash after a restart) are since the mandatory update to buster.
If you read the messages you will see that NTP is already installed. No further action is required for that one.
You see the other messages because the packet manager (apt) you use while trying to install NTP is informing you on the incomplete state of all packages (where applicable) you will see the same messages if you install any other package using apt.