How failover works
Technical Article | TA-20201002-TP-36 VDG Sense | Tutorials | Failover |
- 1 Introduction
- 1.1 Timeouts
- 1.2 Failover procedures
- 1.3 Storage
Introduction
A ‘failover’ server, or ‘hot standby’ server, monitors one or more servers in the same installation. The failover server actively checks whether the monitored servers are still functioning. When it detects that a server is no longer responding, it automatically takes over that server by loading the settings of the failed server, thus emulating its configuration. When the original server is back online, the failover server shuts down automatically.
Both procedures (takeover and restore) are completely transparent to the end user. The user will see that a server has failed (because there is no video), but in the meantime the failover server is starting up. The viewer station is notified that the video must be requested from the failover server. This happens without any user interaction; the video returns automatically.
Timeouts
The failover server uses the following default timeout values to decide whether the failover procedure should be started:
Type | Default Timeout value | Description |
---|---|---|
Connected | 5 minutes | A network connection has been made, but the first startup message has not been received |
Started | 5 minutes | The startup message has been received, but the first keep-alive message has not been received |
Keep alive | 30 seconds | No new keep-alive message has been received |
Normal shutdown | 15 minutes | The server has been shut down normally, but has not restarted yet |
Abnormal shutdown | 5 minutes | The server has exited with an error and has not restarted yet |
The values can be modified in the failover settings.
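As a minimal sketch of how these defaults could be applied, the table can be held as a mapping from state to timeout. The state keys, the function name `should_start_failover` and the choice to trigger as soon as the elapsed time equals the timeout are assumptions made for illustration; they are not taken from the product.

```python
from datetime import timedelta

# Default timeouts from the table above; the dictionary keys are hypothetical labels.
DEFAULT_TIMEOUTS = {
    "connected": timedelta(minutes=5),
    "started": timedelta(minutes=5),
    "keep_alive": timedelta(seconds=30),
    "normal_shutdown": timedelta(minutes=15),
    "abnormal_shutdown": timedelta(minutes=5),
}


def should_start_failover(state, elapsed, timeouts=DEFAULT_TIMEOUTS):
    """Return True when the time spent waiting in `state` has reached its timeout.

    Assumption: the procedure starts as soon as `elapsed` equals the timeout.
    """
    return elapsed >= timeouts[state]


# A server that has been silent for 45 seconds after its last keep-alive.
keep_alive_expired = should_start_failover("keep_alive", timedelta(seconds=45))

# A server that was shut down normally 10 minutes ago is still within its timeout.
shutdown_expired = should_start_failover("normal_shutdown", timedelta(minutes=10))
```

Passing a modified copy of the mapping as `timeouts` would model changed values in the failover settings.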
Failover procedures
The procedures below assume that the failover server is actively checking one or more servers, has a recent copy of the settings of all monitored servers, and no longer receives keep-alive messages from one of them.
Slave server fails
1. Clients see ‘Server Connection Lost’ in the video panels.
2. After 30 seconds, the failover server takes over the failed server by loading the corresponding settings.
3. The management server is automatically informed to replace the IP address of the failed server with that of the failover server.
4. All connected clients are automatically informed to log in again on the management server to update the server list.
5. The clients are logged in and the cameras of the failed server are displayed.
6. The failover server constantly checks whether the failed server is back online.
Slave server is restored
1. The slave server is started; on startup, the video data stored on the failover server is read.
2. The failover server stops the takeover procedure.
3. The management server is automatically informed that the failover server is offline and the restored server is online.
4. All connected clients are automatically informed to log in again on the management server to update the server list.
5. The clients are logged in and the cameras of the restored server are displayed.
6. The failover server constantly checks the monitored servers.
Management server fails
1. Clients see ‘Server Connection Lost’ in the video panels.
2. After 30 seconds, the failover server takes over the failed server by loading the corresponding settings.
3. The slave servers are automatically informed to change the management server address to that of the failover server.
4. The connected clients are automatically informed to log in again on the new management server.
5. The clients are logged in and the cameras of the failed server are displayed.
6. The failover server constantly checks whether the failed server is back online.
Management server is restored
1. The management server is started; on startup, the video data stored on the failover server is read and the events stored during the failover period are synchronized with the database.
2. The failover server stops the takeover procedure.
3. The slave servers are automatically informed to change the management server address to that of the restored management server.
4. All connected clients are automatically informed to log in again on the restored management server.
5. The clients are logged in and the cameras of the restored server are displayed.
6. The failover server constantly checks the monitored servers.
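The four procedures above differ mainly in who is told about the address change: the management server when a slave server fails, and the slave servers when the management server fails. The following sketch only records the order of actions in a log and does not communicate with any real server. The class name `FailoverSketch`, the role labels `"slave"` and `"management"`, and the example IP addresses are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FailoverSketch:
    """Records the order of takeover and restore actions; nothing is sent anywhere."""

    failover_ip: str
    active: Optional[str] = None  # IP of the server currently taken over, if any
    log: List[str] = field(default_factory=list)

    def take_over(self, failed_ip: str, role: str) -> None:
        """Follow the 'server fails' steps for a slave or management server."""
        self.log.append(f"load settings of {failed_ip}")
        if role == "slave":
            # Slave failure: the management server swaps the IP address.
            self.log.append(f"management: replace {failed_ip} with {self.failover_ip}")
        else:
            # Management failure: the slave servers point at the failover server.
            self.log.append(f"slaves: management address = {self.failover_ip}")
        self.log.append("clients: relogin")
        self.active = failed_ip

    def restore(self, restored_ip: str, role: str) -> None:
        """Follow the 'server is restored' steps for a slave or management server."""
        self.log.append("stop takeover")
        if role == "slave":
            self.log.append(f"management: failover offline, {restored_ip} online")
        else:
            self.log.append(f"slaves: management address = {restored_ip}")
        self.log.append("clients: relogin")
        self.active = None


# Example: a slave server fails and later comes back.
sketch = FailoverSketch(failover_ip="192.0.2.99")
sketch.take_over("192.0.2.10", role="slave")
sketch.restore("192.0.2.10", role="slave")
```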
Storage
The failover server can monitor multiple servers at the same time, but can take over only one server at a time. The video data for each monitored server is stored in a separate folder. The name of each folder is the IP address of the monitored server. This folder is automatically shared to give the monitored server access to the video data. Video data is never synchronized with the monitored server.
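The folder naming and the one-takeover-at-a-time rule can be sketched as follows. The storage root `/failover/video`, the function `video_folder` and the class `TakeoverSlot` are hypothetical names. The source does not describe what happens when a second server fails during an active takeover, so this sketch simply refuses the second request.

```python
import ipaddress
from pathlib import PurePosixPath
from typing import Optional


def video_folder(storage_root: str, monitored_ip: str) -> PurePosixPath:
    """Return the per-server folder, named after the monitored server's IP address."""
    ipaddress.ip_address(monitored_ip)  # raises ValueError for an invalid address
    return PurePosixPath(storage_root) / monitored_ip


class TakeoverSlot:
    """Allows at most one active takeover at any time."""

    def __init__(self) -> None:
        self.current: Optional[str] = None

    def acquire(self, failed_ip: str) -> bool:
        """Claim the slot for `failed_ip`; return False if another takeover is active."""
        if self.current is not None:
            return False
        self.current = failed_ip
        return True

    def release(self) -> None:
        """Free the slot once the original server has been restored."""
        self.current = None


# Example with a hypothetical storage root and documentation-range IP addresses.
folder = video_folder("/failover/video", "192.0.2.10")
slot = TakeoverSlot()
first = slot.acquire("192.0.2.10")   # the slot is free, so this takeover starts
second = slot.acquire("192.0.2.11")  # refused while the first takeover is active
```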
The monitored server, while online, always owns all of its video data, both locally and on the failover server. This means that when the failover server is offline, the monitored server manages its own video data on the failover server. The storage space on the failover server should be seen as replacement storage space, not as additional storage.
For example, suppose the camera channels are set to a maximum of 10 days of recording, and during the first three days the monitored server had failed and the failover server was storing the video:
Date (day) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Storage duration | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | | | |
Data on original server | F | F | F | S | S | S | S | S | S | S | W | | | |
Data on failover server | S | S | S | | | | | | | | | | | |
S:Stored, F:Failed, D:Deleting, W:Writing
The server constantly checks whether the maximum storage duration has been reached. This means that on days 11 to 14 it starts deleting video data on the failover server:
Date (day) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Storage duration | | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Data on original server | F | F | F | D | S | S | S | S | S | S | S | S | S | W |
Data on failover server | D | D | D | | | | | | | | | | | |
S:Stored, F:Failed, D:Deleting, W:Writing
For the operator this process is completely transparent.
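The retention example can be checked with a short calculation. The sketch below assumes the rule suggested by the tables: a recording made on day d expires once the current day reaches d plus the maximum storage duration. The function names and the set of outage days are illustrative only.

```python
def storage_location(day, failover_days):
    """Return where the recording of `day` lives: on the failover or the original server."""
    return "failover" if day in failover_days else "original"


def expired_recordings(current_day, max_days, failover_days, first_day=1):
    """Map each expired recording day to the server that holds it.

    Assumption: only the most recent `max_days` days (including today) are kept.
    """
    oldest_kept = current_day - max_days + 1
    return {
        day: storage_location(day, failover_days)
        for day in range(first_day, oldest_kept)
    }


# Days 1 to 3 were recorded on the failover server; the maximum is 10 days.
outage_days = {1, 2, 3}
on_day_11 = expired_recordings(11, 10, outage_days)
on_day_14 = expired_recordings(14, 10, outage_days)
```

Under these assumptions, `on_day_11` contains only day 1 (on the failover server), and `on_day_14` contains days 1 to 3 on the failover server plus day 4 on the original server, which matches the cells marked D in the second table.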