1. Introduction

LeanXcale is an ultrascalable distributed database. It is built from several components that can scale out to cope with your workload. As with any database, you can get better performance from a bare metal installation, but Docker has significant advantages for deploying the technology, so this document explains how to deploy LeanXcale using Docker and start working with it.

Being distributed, LeanXcale can be deployed in very different ways, and the most efficient deployment depends on the workload. LeanXcale provides some interesting elasticity capabilities, but they are not currently automated. The LeanXcale team is working on an Elastic Manager that can automatically adapt your deployment to the workload. Right now, for the sake of simplicity, the deployment is based on a single Docker image in which you can start just one component or a set of components.

This document assumes you have already pulled the LeanXcale Docker image and you are ready to run it on your machines. Different instances of the Docker image running different components can be orchestrated through Kubernetes, Docker Compose, etc., but we assume you are familiar with those technologies, so the document focuses on the general Docker concepts.

To get into the details and make the most of the deployment, you should read the Architecture and Concepts documents and then go to the section Docker Deployment Details.
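
As a minimal illustration of the single-image approach described above (the image name below is an assumption for illustration only; the component-selection options are described in Docker Deployment Details):

# Hypothetical image name, shown only to illustrate the general Docker workflow
docker pull leanxcale/leanxcale:latest
docker run -d --name lxnode1 leanxcale/leanxcale:latest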

2. Installing LeanXcale

2.1. Dependencies

For the installation to run, the host and all target machines must meet some prerequisites that need to be ensured before the installation takes place. These are summarized as follows:

  • The machines can use either an Ubuntu, CentOS, or RHEL Linux operating system.

  • You need the following (very common) components to be installed:

    • Ansible (>= 2.6)

    • Java 8

    • Python 2 (>= 2.7)

    • bash

    • screen

    • libpthread

    • numactl (unless you do not want to take advantage of NUMA)

    • nc (netcat), used to check that ports are bound

  • All the machines should have the LeanXcale user (USER) created and the base folder for the installation (BASEDIR). This folder must be owned by USER.

  • All the machines should have SSH connectivity, and the right SSH keys must be set for the user controlling the cluster (USER) so it can access any machine in the cluster without any further authentication mechanism. This way the SCRIPTs can deploy, configure, start and stop components remotely without asking for a myriad of passwords for each action (see the sketch below).
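
A minimal sketch of how you might check the dependencies and set up passwordless SSH, assuming the cluster user is already created on every machine (the hostnames are the ones used in the inventory example later in this document):

# Check some of the required tools (the Python binary may be python or python2 depending on the distribution)
java -version
ansible --version
python2 --version
which bash screen nc numactl

# Generate a key pair for USER on the master (skip if one already exists)
ssh-keygen -t rsa -b 4096

# Copy the public key to every machine in the cluster
ssh-copy-id bladeMETA
ssh-copy-id bladeDATA1
ssh-copy-id bladeDATA2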

2.2. Unpacking

The first step of the installation is choosing a machine from the cluster to be the master or orchestrator of the cluster (usually it is a good idea to have at least two machines that can play this role). Connect to that machine and set the BASEDIR environment variable, preferably in your .profile:

export BASEDIR=/ssd/lxdb
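
To make the setting persist across logins you can, for example, append it to your .profile (using the same example path as above):

echo 'export BASEDIR=/ssd/lxdb' >> ~/.profile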

Now you can unpack the TAR installation package into the base directory on the master server and source the environment script:

cd $BASEDIR
tar xvfz LeanXcale_v{Version}.tgz
source ./env.sh
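
As a quick sanity check that the environment script has been loaded, you can, for example, verify that BASEDIR is set:

echo $BASEDIR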

Now the master server of the cluster is ready to be configured so you can later deploy the installation to the rest of the servers.

3. Simple Configuration: the inventory file

To configure the cluster, the only basic information you need is the hostnames of the machines you want to deploy on.

With that information you can configure the servers of the cluster in the inventory file:

vim conf/inventory

Below you can see an example of a simple configuration file:

[all:vars]
#BASEDIR for the installation is taken from the environment variable $BASEDIR
BASEDIR="{{ lookup('env','BASEDIR') }}"
#FOLDER to store BACKUPs in
BACKUPDIR=/ssd/leandata/backup/
#Login account for the cluster
USER="{{ lookup('env','USER') }}"
# If {{USER}} has SUDO permissions add SUDO=sudo. Otherwise you may need to
# create some folders and grant access to the USER for some actions to work
SUDO="sudo"
#SUDO="echo"

# If you want to use NUMA say yes
NUMA=no
USEIP=yes
HA=no
HAPROXY=yes

netiface=eth0
forcenkvds=1
forcememkvds=2G
forcencflm=1
forcecflm=1

[defaults]
#The following line tells the configuration SCRIPT to get the values from the
#machine. All resources will be allocated to LeanXcale.
sizing mem=- sockets=- ncores=- nthreadscore=- fsdata=. fslog=.

#Metadata servers. Currently, you can configure only one
[meta]
bladeMETA sockets=- ansible_connection=local

#Datastore Servers. You can have multiple data store instances
[datastores]
bladeDATA1 sockets=0-1:0-1
bladeDATA2 sockets=0-1:0-1

3.1. Default Resource Allocation

The line starting with sizing defines the default resource allocation. The line assumes that all machines in the cluster have the same HW configuration. However, the parameters in the line can also be defined per machine if you need to override the default sizing (see the sketch at the end of this section). The line has the following parameters:

  • mem: Memory available in the machine to run the components (in GB). If not set, it defaults to taking all the memory in each machine and distributing it for LeanXcale.

  • sockets: List of sockets in the machine to be used to run the components. Values like 0-2 or 0,1,2 or 0,4 are allowed. In datastore machines, there should be two socket list definitions: the socket list for KVDS and the socket list for QE. Those lists must be separated by a colon ':'.

    Again, if not set, all sockets in the machine will be used.

  • ncores: The number of physical cores to be used in each machine to run LeanXcale components. Note that these are physical cores, so hyperthreads should not be counted.

  • nthreadscore: Number of hyperthreads per core

  • fslog: Folder where the transaction logging files are located. These are not the component logs, but the transaction logs that guarantee database durability. If SUDO is not granted, the administrator should have created this folder beforehand with RW permissions for USER.

  • fsdata: Parent folder where the database stores its data. If a [DSFs] section is available, then those folders will be used.

It is highly recommended to keep transaction redo logging on different disks from the database data.

An example of this configuration is the following:

...
sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fslog=/txnlogs fsdata=/lxdata
...

This means: the machines have 128 GB of memory available and 2 sockets (0 to 1), there are 12 physical cores in total, and there is no hyperthreading.

Then, redo logging will be written to the filesystem mounted at /txnlogs and data will go to /lxdata.
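
As noted above, the same parameters can be overridden for an individual machine by appending them to that machine's line in the inventory. A hypothetical sketch, assuming bladeDATA2 has less memory and its own filesystems (the per-host values are illustrative only):

[datastores]
bladeDATA1 sockets=0-1:0-1
bladeDATA2 sockets=0-1:0-1 mem=64G fslog=/txnlogs2 fsdata=/lxdata2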

3.2. Default Component Deployment

Components are typically classified into 2 classes:

  • Metadata components:

    • Zookeeper

    • Configuration Manager

    • Snapshot Server

    • Commit Sequencer

    • KiVi Metadata Server

  • Datastore components:

    • Query Engine

    • Transactional Loggers

    • Kivi Datastore Server

  • Conflict Managers: These are not in either of the two classes, but for the sake of simplicity they are, by default, deployed with the metadata components unless High Availability is set up.

Given this classification, the metadata components are deployed on the metadata server, while the datastore components are distributed considering the number of datastore servers and their HW capacity.
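
For instance, with the simple inventory shown earlier, the default deployment would roughly look as follows (an illustration of the rule above, not an exact placement report):

bladeMETA   -> Zookeeper, Configuration Manager, Snapshot Server,
               Commit Sequencer, KiVi Metadata Server, Conflict Managers
bladeDATA1  -> Query Engine, Transactional Logger, KiVi Datastore Server
bladeDATA2  -> Query Engine, Transactional Logger, KiVi Datastore Server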

3.3. Completing the Bare Metal Installation

Once you have finished changing the configuration file, you can complete the installation. This last step builds your detailed configuration file, including the resource allocation, and deploys the installation to the rest of the servers of the cluster.

From this point the cluster is installed and set up to be started.

3.4. Advanced Configuration

There are quite a few detailed parameters that can be set to fine-tune your installation. Those parameters are further discussed in [Annex I. Advanced Configuration] and cover:

  • Resource allocation

  • Filesystems

  • Detailed component deployment

4. High Availability

LeanXcale can be set up as a high availability cluster. High availability means that in case of a single component failure or a single machine failure the system will keep working non-stop.

You can further tune the configuration parameters so the system keeps working in case of a two-machine failure or even stronger high availability scenarios.

4.1. Configuring High Availability

Configuring High Availability is pretty simple: it just requires that you set the following configuration parameter in the inventory file:

HA=yes

4.2. Implications

The implications of configuring this parameter are:

  • The install SCRIPT will check that there are at least three machines configured for the cluster.

There have to be at least 2 machines defined as metadata servers and 2 machines defined as datastore servers. One machine can act both as a metadata and a datastore server, and that’s the reason there have to be at least 3 machines in the cluster.

The following configuration could be a valid HA configuration:

[all:vars]
BASEDIR="{{ lookup('env','BASEDIR') }}"
BACKUPDIR=/ssd/leandata/backup/
USER="{{ lookup('env','USER') }}"

HA=yes
netiface=eth0

[defaults]
sizing mem=- sockets=- ncores=- nthreadscore=- fsdata=. fslog=.

[meta]
metaserver1
dataandmetaserver1

[datastores]
dataandmetaserver1
dataserver2

  • Components will be installed as follows:

    • Zookeeper: Zookeeper is the master for keeping the global configuration, health information and arbitration. The install SCRIPT will configure a Zookeeper cluster with three Zookeeper members replicated in different machines.

    • Transaction Loggers: The loggers are components that log all transaction updates (called writesets) to persistent storage to guarantee the durability of transactions. The loggers will be configured and replicated in groups so that the logger components of a group run on different machines. Therefore the logging information is replicated and can be recovered in case of a crash.

    • Conflict Manager: Conflict Managers are in charge of detecting write-write conflicts among concurrent transactions. They are highly available per se, because if one Conflict Manager fails its conflict key buckets are transferred to another Conflict Manager. No special configuration is needed for High Availability except guaranteeing that there are at least 2 Conflict Managers configured on 2 different machines.

    • Commit Sequencer: The Commit Sequencer is in charge of distributing commit timestamps to the Local Transactional Managers. Two Commit Sequencers will be configured, but only one will act as master while the other acts as follower. The follower will take over in case the master fails.

    • Snapshot Server: The Snapshot Server provides the freshest coherent snapshot on which new transactions can be started. As in the case of the Commit Sequencer, 2 Snapshot Servers will be started: one will act as master and the other as follower. The follower will take over in case the master fails.

    • Configuration Manager: The configuration manager handles system configuration and deployment information. It also monitors the other components. Two configuration managers will be started and one will be the master configuration manager.

    • Query Engine: The Query Engine parses SQL queries and transforms them into a query plan, which results in a set of actions on the datastores. The query engines are usually configured as a cluster in which any of them can take part of the load, so High Availability has no special requirement on them.

    • Datastores: There will usually be several datastores on different machines. The important point for datastores is replication.

4.3. Replication

So far, the High Availability configuration ensures the components are configured in such a way that no single component failure or machine crash will cause the system to fail.

While high availability is about keeping components available, data replication is about data availability: if a machine or a disk crashes there is another copy of the data, so you can keep working on that copy. Data replication can be enabled regardless of whether you also want high availability or not.

LeanXcale provides a full synchronous replication solution and you can configure as many replicas as you want. Basically, you can define a mirroring datastore configuration in the inventory file:

...
[datastores]
mirror-1:
  server1
  server2
mirror-2:
  server3
  server4

This doesn’t mean every server needs to be mirrored. You may also have unmirrored servers to keep data that doesn’t need to be replicated.

In addition, there are 2 kinds of replicas:

  • Replicas to minimize the risk of losing data and to be able to recover service as soon as possible.

  • Replicas to improve performance. These are usually small tables that are used really frequently (usually read), so the application can benefit from having several copies that can be read in parallel.

5. Authentication, SSL & Permissions

There are two options to configure LeanXcale authentication:

  • LDAP based authentication. You can set up an LDAP server just for LeanXcale, but this is most interesting for integrating LeanXcale with your organization’s LDAP (or Active Directory) and providing some kind of Single Sign-On, or at least the chance to use a common password within the organization.

  • Open shared access. This is not to be used in production except for shared data. The access level can be set in the firewall based on IP rules, but all connecting users will be granted access and will be able to use a user schema. This is very easy to set up for development and testing environments.

Permissions are granted through roles.

Communications can be configured so that SSL is used for connections between components, though it is recommended to run all components behind a firewall and to use SSL for JDBC and the clients' connections.