High Availability Domoticz cluster?


Moderators: leecollings, remb0

niceandeasy
Posts: 102
Joined: Thursday 28 January 2016 22:25
Target OS: Raspberry Pi / ODroid
Domoticz version: 3.8153
Location: NL

High Availability Domoticz cluster?

Post by niceandeasy »

Okay, the idea is not new:
pepijn wrote:
CopyCatz wrote: Hmm when my conrad antenna comes in I probably will be able to control pepijn's devices hehe :twisted:
Cool, let's build a high availability Domoticz cluster :)
lol

Reading the stories about failing SD cards and everything else that can go wrong on a Raspberry Pi, I'd like to ask: any progress on the cluster?

It's not that I really need it, but it would be awesome to have: two Raspberries, both running Domoticz, each complete with RF transceivers and all. One active, one hot-spare. And I could imagine that with such a feature, Domoticz could be deployed at sites that really need high uptime.
Config and script file directories can be replicated by scripts. The database could be synced by scripts too, but that might not be necessary, since both Domoticz instances will receive the same RF signals. So a (partial) sync is only needed when one instance has been offline for a while and, of course, for the initial sync that duplicates the active instance onto the hot-spare. (Insider knowledge of the database is needed for that.)
In normal operation, both Domoticz instances have their own IP address. When the active one fails, the hot-spare needs to activate itself, take over the IP address of the failed one, and possibly shut down the OS of the failed one if only Domoticz has crashed there. Even this can be scripted. The hot-spare needs to monitor the active instance, which can also be scripted.
If GPIO is in use... well, most inputs can be connected to both Domoticz RPis, and outputs need some buffering electronics. That can also be done.
And there will probably be some considerations when writing Lua code to run on clustered Domoticz systems; these need to be well documented.
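
The monitoring and takeover logic described above could be sketched roughly like this. It is only a minimal illustration, not an existing Domoticz feature: the class name `FailoverMonitor`, the threshold of three missed probes and the action names are all hypothetical, and the probe assumes the active instance answers on the default Domoticz web port 8080. Actually claiming the IP address and enabling RF sending is left to whatever script reacts to the returned action.

```python
import socket
from dataclasses import dataclass


def peer_alive(host: str, port: int = 8080, timeout: float = 2.0) -> bool:
    """Probe the active node by opening a TCP connection to its web port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@dataclass
class FailoverMonitor:
    """Runs on the hot-spare and decides when to take over (hypothetical)."""
    max_missed: int = 3   # assumed: consecutive failed probes before takeover
    missed: int = 0
    active: bool = False  # True once this node has taken over

    def record(self, alive: bool) -> str:
        """Feed one probe result; return the action the spare should take."""
        if alive:
            self.missed = 0
            if self.active:
                self.active = False
                return "release"    # primary is back: hand the IP back
            return "standby"
        self.missed += 1
        if not self.active and self.missed >= self.max_missed:
            self.active = True
            return "takeover"       # claim the IP, allow RF transmission
        return "active" if self.active else "standby"
```

A cron job or small loop on the spare would call `monitor.record(peer_alive("<address of the active node>"))` every few seconds and run the IP takeover script only when the result is `"takeover"`. Requiring several missed probes in a row avoids a takeover on a single dropped packet.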

There are two issues that need some tinkering in Domoticz itself:
- The hot-spare Domoticz instance should not transmit any RF signals: two simultaneous RF transmissions will jam each other. This needs to be implemented in Domoticz itself. It could be a system switch or something like that.
- The active Domoticz needs to forward incoming network commands, like certain JSON API commands and maybe IP device state info, to the hot-spare. This is probably already possible via the remote shared port for remote Domoticz clients. I haven't investigated that yet.


I could imagine all the needed scripting bundled in a single add-on package with a custom HTML page that adds the controls, cluster settings and status info. The best solution would be to make it a feature of Domoticz itself.
sijones
Posts: 70
Joined: Wednesday 15 October 2014 14:16
Target OS: Linux
Domoticz version: Git
Location: UK

Re: High Availability Domoticz cluster?

Post by sijones »

I really can't see a use for this; the code is well written and pretty robust. I certainly have no issues, apart from the odd occasion where I do an update and find a bug, but that doesn't take long to sort out, and if you follow the release rather than the beta you shouldn't hit this problem.

If you want mission critical, then use proper hardware and not a hobby board, which is what the Raspberry Pi is. Yes, it's useful and nice, but if you want no chance, or only a limited chance, of failure by SD card, then look at the many other ARM boards that have a SATA port.

I use a Wandboard: quad core, 2GB RAM, 60GB SSD. It has plenty of processing power and RAM, and the storage is reliable and never misses a beat.

This board has outlasted an SD card in a Pi so I know it's working well!
nayr
Posts: 354
Joined: Tuesday 11 November 2014 18:42
Target OS: Linux
Domoticz version: github
Location: Denver, CO - USA

Re: High Availability Domoticz cluster?

Post by nayr »

Yeah, step 1 is going to be running away from Raspberry hardware and onto something more rugged and suitable.

The easiest way to accomplish what you're looking for is simply to run Domoticz on a VM cluster that can live-migrate the instance around. It can be done pretty easily with a robust and redundant file storage backend. You could have hardware on each VM host that is bridged to the VM when it runs locally; not sure how well that would live-migrate, but worst case Domoticz might have to restart the hardware after migrating.

My 55g aquarium runs on a BeagleBone Black with an external RTC, an onboard LiPo battery to shut down safely, and the OS on onboard flash memory. I had to make my own circuitry that was robust and fault tolerant, as the Beagle surely did not provide that, but it's been running just fine for a while now and has not flooded my house or flushed my fish down the drain. I think it's about time I replaced its breadboard with a custom one-off circuit board, but I am afraid to upgrade it, let alone touch it... I will sooner or later, just for peace of mind. I torture tested this board for over a month solid on a bench before using it, just to ensure quality. High availability was not a goal, but high uptime and stability were: if it got hung with a valve open it would be a household tragedy.

If you want fault tolerance, go into master-slave mode; at least you can compartmentalize things so that a failure of one node does not mean a system-wide automation outage. If you use standard hardware for the slaves and keep a spare around, you can be back online pretty quickly, provided you have a good backup system. I am considering splitting my Home Theatre automation out to its own hardware now so it can still operate standalone without network access; in fact, right now I'm moving off IP control and onto serial terminal access to prepare for the move. I already have a spare Beagle idle on the network with the OS and Domoticz keeping itself updated. I just have to drop the last known database into it, sync scripts to it, and replace a failed node with it, or turn it into my next node and make a new spare.
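
Syncing the scripts to a spare, as mentioned above, could look something like the following sketch. The function names are made up for illustration and the directory layout is whatever your install uses; nothing here is specific to Domoticz. The database is deliberately left out: copying a live SQLite file while Domoticz is writing to it can give an inconsistent copy, so a proper backup of the database is assumed to be handled separately.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's contents, used to detect changes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sync_tree(src: Path, dst: Path) -> list[str]:
    """Copy new or changed files from src to dst.

    Returns the relative paths that were copied. Files that exist only
    in dst are left alone (no deletion), which is a deliberate choice
    for this sketch.
    """
    copied = []
    for src_file in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        if dst_file.is_file() and file_digest(dst_file) == file_digest(src_file):
            continue  # identical content, nothing to do
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also keeps timestamps
        copied.append(rel.as_posix())
    return copied
```

With the spare's scripts directory mounted or otherwise reachable as a path, `sync_tree(Path("scripts"), Path("/mnt/spare/scripts"))` run from cron would keep it current; both paths here are placeholders.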
Debian Jessie: CuBox-i4 (Primary) w/Static Routed IP and x509 / BeagleBone with OpenSprinkler / BeagleBone Planted Aquarium / 3x Raspberry Pi2b GPIO Slaves
Elemental Theme - node-domoticz-mqtt - Home Theatre Controller - AndroidTV Simple OSD Remote - x509 TLS Auth
nayr
Posts: 354
Joined: Tuesday 11 November 2014 18:42
Target OS: Linux
Domoticz version: github
Location: Denver, CO - USA

Re: High Availability Domoticz cluster?

Post by nayr »

Another thought: with Z-Wave you can have a secondary controller on an identical Domoticz instance and simply use an external mechanism to enable the event system when the primary disappears. There are built-in tools in Linux to have a hot-spare instantly take over a host IP when the original goes offline, so any external HTTP/MQTT calls would hit the hot-spare right away (see UCARP or Heartbeat).

The secondary would not be broadcasting commands unless it has the events to do so, so both could share the Z-Wave network without too much trouble. In fact it could help: if there was enough physical separation, the secondary controller could provide extra routing coverage (i.e. act as a repeater while waiting to take over).

Once I get the home theatre off my master server, the only hardware attached to it directly will be the Z-Wave controller; everything else will come to it via network slaves. So I could probably set up a highly available master server, with a fully operational hot-spare waiting for an outage to take over, without much trouble. I could even put it on my VM server since it'll need so little RAM and storage.

All it would take is a really simple app that syncs the hot-spare with the primary using the MQTT interface: just take everything that is not a Z-Wave device coming out of the primary and redirect it back into the spare. Design the app to enable the event system on the hot-spare when the IP address is assigned locally and turn it back off when the primary takes back the IP. You'd have to have a pretty static setup though; changes made to one master would have to be repeated manually on the second one, or once again you'd have something external shove a fresh database at your hot-spare from time to time...
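
The core of such a relay app might be sketched as below. This is an untested illustration under several assumptions: that the primary publishes device status as JSON on `domoticz/out` with `idx`, `nvalue` and `svalue1`, `svalue2`, ... fields, that the spare accepts `udevice` commands on its own `domoticz/in` topic, and that both instances use the same `idx` numbering (for example because the spare started from a copy of the primary's database). The function name `relay_payload` and the `zwave_idx` set are hypothetical; the MQTT client wiring and the enabling of the event system are not shown.

```python
import json
from typing import Optional


def relay_payload(out_payload: str, zwave_idx: set[int]) -> Optional[str]:
    """Turn one status message from the primary into a command for the spare.

    Returns None for devices the spare already hears about directly
    (Z-Wave in this setup) and for payloads that cannot be parsed.
    """
    try:
        msg = json.loads(out_payload)
        idx = int(msg["idx"])
        nvalue = int(msg.get("nvalue", 0))
    except (ValueError, KeyError, TypeError, AttributeError):
        return None
    if idx in zwave_idx:
        return None
    # Assumed format: outgoing messages split the value into svalue1,
    # svalue2, ...; incoming commands take one ';'-separated svalue.
    parts = []
    n = 1
    while f"svalue{n}" in msg:
        parts.append(str(msg[f"svalue{n}"]))
        n += 1
    return json.dumps({
        "command": "udevice",
        "idx": idx,
        "nvalue": nvalue,
        "svalue": ";".join(parts),
    })
```

An MQTT client subscribed to the primary's `domoticz/out` would pass each payload through `relay_payload` and publish any non-None result to the spare's `domoticz/in`. Keeping the translation in a pure function makes it easy to test without a broker.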
niceandeasy
Posts: 102
Joined: Thursday 28 January 2016 22:25
Target OS: Raspberry Pi / ODroid
Domoticz version: 3.8153
Location: NL

Re: High Availability Domoticz cluster?

Post by niceandeasy »

Thanks for the thoughts on high availability.
I admit that cluster considerations for Domoticz should not be high on the developers' list. There are more important things to implement, and there are plenty of other ways to get high reliability, especially in setups where problems can end in small disasters, as in aquarium control.
Besides using a rugged control board, it's good practice to design output circuitry and hardware in such a way that it cannot cause a fault on power failure or loss of control.
Blind and curtain motors rely on end stop switches, so they don't tear themselves off the wall while waiting for a home automation system to tell them to stop.
Without a pump running, water may start flowing in the other direction because of gravity. Use one-way valves and valves that are normally closed (or open) when not powered. Use an overheat shut-off thermostat in line with a heater's power supply. Use an extra water leakage sensor, hardwired to the pump and valve control relays.
For me, I have an extra Raspberry for experimenting and backup, some preloaded SD-cards and backups on some NAS devices. My Domoticz is not involved in mission-critical control.
...yet...