Type: New Feature
Fix Version/s: Exasol 6.1.0
Previously, a spare node had to be running at all times to provide automatic failover in case of a node failure.
Starting with Exasol 6.1 on AWS, failover capabilities are also available if:
- no spare/standby node is configured
- a spare node was configured but suspended (not running)
If an active node fails and a running spare node is available, the spare node automatically replaces the failed node as usual (this was also the standard behaviour before Exasol 6.1).
New behaviour with Exasol 6.1:
If an active node fails and no running spare node can be detected, the failed node is reprovisioned by the cloud failover plugin (that is, the virtual machine is terminated and replaced by another VM).
If the node still fails after this step, the system checks whether a cold spare node is available. A cold spare node (also called a cold standby node) is a spare node that was configured and then suspended, so it incurs no costs for a running VM instance (storage costs still apply). If a cold spare node is found, it is started and the standard failover mechanism is triggered.
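The order of these recovery steps can be sketched as a small decision function. This is only an illustration of the sequence described above, not EXAoperation or failover plugin code: the names `Action` and `failover_steps` are hypothetical, and the final "no automatic recovery" outcome is an assumption, since this note does not say what happens when no cold spare node exists either.

```python
from enum import Enum
from typing import List


class Action(Enum):
    """Hypothetical labels for the recovery steps described in this note."""
    HOT_SPARE_TAKEOVER = "running spare node replaces the failed node"
    REPROVISION = "terminate the failed VM and replace it with another VM"
    START_COLD_SPARE = "start the cold spare node, then run standard failover"
    NO_AUTOMATIC_RECOVERY = "no further automatic step (assumption, not in the note)"


def failover_steps(running_spare: bool,
                   reprovision_succeeds: bool,
                   cold_spare_configured: bool) -> List[Action]:
    """Return the recovery steps in the order the note describes them."""
    # Standard behaviour, also before Exasol 6.1: a running spare takes over.
    if running_spare:
        return [Action.HOT_SPARE_TAKEOVER]

    # New in 6.1: without a running spare, the failed node is reprovisioned first.
    steps = [Action.REPROVISION]
    if reprovision_succeeds:
        return steps

    # Only if the node still fails is a cold spare node looked for.
    if cold_spare_configured:
        steps.append(Action.START_COLD_SPARE)
    else:
        steps.append(Action.NO_AUTOMATIC_RECOVERY)
    return steps


if __name__ == "__main__":
    for step in failover_steps(running_spare=False,
                               reprovision_succeeds=False,
                               cold_spare_configured=True):
        print(step.value)
```

The sketch leaves out the quorum prerequisite listed below; it only shows that reprovisioning is attempted before a cold spare node is started.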
Prerequisites for cold standby nodes and node reprovisioning:
- EXAoperation (including the failover plugin) needs direct access to the relevant AWS services (i.e. not through a proxy)
- A cold standby node can only be activated if the remaining cluster has quorum when a node fails. With one configured cold standby node, this requires a cluster consisting of the Management Node and at least 3 Data Nodes.
- If no standby node is configured, at least the Management Node and 2 Data Nodes are required to use the reprovisioning functionality described above in case of a node failure.
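The minimum cluster sizes from the list above can be written as a simple pre-check. This is a hedged sketch: the function name `meets_minimum_cluster_size` and its parameters are hypothetical, the thresholds are copied from this note as stated, and the note does not describe how Exasol actually computes quorum.

```python
def meets_minimum_cluster_size(data_nodes: int,
                               has_management_node: bool,
                               cold_standby_configured: bool) -> bool:
    """Check the minimum sizes stated in the prerequisites.

    Thresholds are taken verbatim from the note: Management Node plus
    at least 3 Data Nodes with one cold standby node configured, or
    Management Node plus at least 2 Data Nodes for reprovisioning only.
    """
    required_data_nodes = 3 if cold_standby_configured else 2
    return has_management_node and data_nodes >= required_data_nodes


if __name__ == "__main__":
    # Two data nodes are enough for reprovisioning, but not for a cold standby.
    print(meets_minimum_cluster_size(2, True, cold_standby_configured=False))
    print(meets_minimum_cluster_size(2, True, cold_standby_configured=True))
```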
Advantages/Disadvantages of a hot spare node
A running hot spare node provides the shortest service interruption in case of a node failure. On the other hand, it causes the highest costs, because its VM instance runs permanently.