Introduction

LeanXcale is a distributed scalable database system. It consists of multiple processes with different functions:

lxmeta: It coordinates the whole database, keeps database metadata, and takes care of some essential functions for transaction management.
lxqe: The SQL query engine that processes SQL statements. Client applications connect to it via standard drivers such as JDBC, ODBC, …
kvms: The metadata store manager for the relational storage system, KiVi.
kvds: The data store manager for the relational storage system, KiVi.
spread: Communication bus for LeanXcale components. It provides reliable group communication.

Of the above, lxmeta and kvms are single-instance processes, while lxqe and kvds can have multiple instances: typically one lxqe instance per host and multiple kvds instances per host. Data is stored in the kvds instances, and high availability is provided by means of active-active replication, which means that data is written to all replicas as part of the same transaction. lxqe performs SQL query processing, but also transaction management. In particular, it is in charge of transaction logging, through a component named logger. Loggers are made highly available through active-active replication as well. The metadata servers, lxmeta and kvms, are also replicated, so that the database manager as a whole is fully highly available.

Active-active replication is materialized as pairs of processes: each process has a mirror. Writes are performed on both processes of the pair as part of the same transaction, so data cannot be lost when a single component fails, unlike in databases with master-slave replication. The following sections describe what is relevant for installing the LeanXcale database with high availability and for administering it.
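As an illustration of the idea only, the following Python sketch models a mirrored pair in memory. It is not LeanXcale code: the Replica and MirroredPair classes and the names kvds-1 and kvds-1-mirror are hypothetical, and a real implementation would involve transaction logging and group communication that are not shown here.

```python
class Replica:
    """In-memory stand-in for one process of a mirrored pair (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}

    def apply(self, writes):
        self.data.update(writes)


class MirroredPair:
    """A process and its mirror; every commit goes to all live replicas."""

    def __init__(self, process, mirror):
        self.replicas = [process, mirror]

    def commit(self, writes):
        live = [r for r in self.replicas if r.alive]
        if not live:
            raise RuntimeError("both replicas are down; the pair is unavailable")
        # The same set of writes is applied to every live replica
        # as part of one commit.
        for replica in live:
            replica.apply(writes)
        return [r.name for r in live]


# Hypothetical instance names, used only for this example.
pair = MirroredPair(Replica("kvds-1"), Replica("kvds-1-mirror"))
pair.commit({"k1": "v1"})          # written to both replicas
pair.replicas[0].alive = False     # simulate a single component failure
pair.commit({"k2": "v2"})          # still accepted by the surviving mirror
survivor = pair.replicas[1]
```

After the simulated failure, the surviving mirror holds every committed write, which is the property the paragraph above describes.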

Installation

The installation only requires identifying pairs of hosts: for each host, you identify its mirror.

host host1
host host2
	mirror host1

In this example the installation uses two hosts, one a mirror of the other. host1 has all components, and host2 has all components as well, each one a mirror of the corresponding component in host1.
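To check a host list before installing, a small script can extract the host pairs from it. The Python sketch below is a hypothetical helper, not part of the LeanXcale tooling. It assumes that the file contains only host and mirror lines and that a mirror line applies to the host line preceding it, as in the example above; a real configuration file may contain other directives that this sketch rejects.

```python
def parse_mirrors(text):
    """Return {host: host_it_mirrors or None} from a host/mirror listing."""
    mirrors = {}
    current = None
    for raw in text.splitlines():
        parts = raw.split(None, 1)  # keyword, then the rest of the line
        if not parts:
            continue  # blank line
        if len(parts) != 2:
            raise ValueError(f"missing value in line: {raw!r}")
        keyword, value = parts[0], parts[1].strip()
        if keyword == "host":
            current = value
            mirrors[current] = None
        elif keyword == "mirror":
            if current is None:
                raise ValueError("'mirror' line appears before any 'host' line")
            mirrors[current] = value
        else:
            raise ValueError(f"unexpected line: {raw!r}")
    return mirrors


def host_pairs(mirrors):
    """Return (host, mirror_host) pairs, checking that each target is declared."""
    pairs = []
    for host, target in mirrors.items():
        if target is None:
            continue
        if target == host or target not in mirrors:
            raise ValueError(f"{host} mirrors an invalid host: {target}")
        pairs.append((target, host))
    return pairs


config = "host host1\nhost host2\n\tmirror host1\n"
pairs = host_pairs(parse_mirrors(config))
```

For the two-host example, pairs contains the single pair (host1, host2).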

Administration

High availability is attained by means of active-active replication (also called synchronous replication). This means that each write is made to both replicas of a pair as part of the same transaction.

Consequently, if one of the replicas fails, the other one remains operational without any loss of data, unlike what would happen in a database with master-slave replication (also called asynchronous replication). While one replica is down, the system remains operational, and we say that it is working in degraded mode: there is no longer high availability, and a failure of the replica still alive would mean that the database is down.

When the failed replica is recovered, it will be in sync with the live replica (same data) and the system will become fully operational, that is, highly available again.

Backups are performed with the same command used for backups without high availability. In a fully operational, highly available system, the backup backs up one of the replicas (since both are identical). In a degraded system (one replica up and one replica down), the backup backs up the data of the live replica. Note that if the system is shut down in degraded mode, only the live replica contains the up-to-date data. We call this replica single (as opposed to married, when both replicas are alive). The backup of a degraded system therefore backs up the data of the single replica.

If the system was stopped in degraded mode, it can only be restarted with the single replica.
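The rules for backups and restarts can be summarized in a short Python sketch. This is a conceptual model only: the function names and the replica names "replica" and "mirror" are hypothetical, and the sketch assumes that a system stopped while married may be restarted from either replica, which the text above does not state explicitly.

```python
def status(alive):
    """Classify a pair by the set of replicas that are currently up."""
    if len(alive) == 2:
        return "married"
    if len(alive) == 1:
        return "single"
    return "down"


def backup_source(alive):
    """Pick the replica whose data a backup would contain."""
    if not alive:
        raise RuntimeError("no live replica to back up")
    # Married replicas are identical, so any of them will do;
    # in degraded mode the only choice is the single replica.
    return sorted(alive)[0]


def can_restart_with(alive_at_shutdown, replica):
    """A replica may be used for restart only if it was up at shutdown,
    because only those replicas hold the up-to-date data."""
    return replica in alive_at_shutdown


# Hypothetical scenario: "replica" failed, then the system was stopped.
alive_at_shutdown = {"mirror"}
mode = status(alive_at_shutdown)
```

In this scenario the pair is single, a backup contains the data of "mirror", and a restart is only possible with "mirror".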