MariaDB vs LeanXcale
LeanXcale is a versatile database that combines SQL functionality with NoSQL characteristics. This allows for a range of benefits in flexibility, scalability and maintenance.
In this article we compare performance for LeanXcale with that of MariaDB, using fair workload scenarios. We will show how you can experience marked improvements when choosing LeanXcale over MariaDB.
In addition, LeanXcale provides KiVi, a distributed storage engine which provides additional benefits for all workloads.
1.1. Benchmarking Tool
YCSB, the Yahoo Cloud Serving Benchmark, an open-source specification and program suite commonly used to compare performance of NoSQL database systems.
The comparison is made by generating emulated workload with YCSB and letting the YCSB clients connect to the databases over JDBC.
For a more realistic application, we will take a look at behavior for clusters, to show how LeanXcale shines as the needs of the application grows. And we will show how LeanXcale and MariaDB clusters behave against the same workloads.
2. Machine Specifications
In the benchmark comparison we are using 4 Amazon AWS instances, 1 for the YCSB clients generating the workload, and 3 for the databases.
YCSB clients use a lot of memory, so YCSB Client processes are running on a separate machine than the ones used for the databases cluster in order to do not disturb the metrics.
YCSB Execution planYCSB uses a 2 stage execution plan, with a load stage (clients only execute INSERTS statements in order to get a full working set) followed by a run stage (UPDATES, GETS and SCANS).
All machines used are hosted at Amazon AWS. We have requested spot instances and built the database clusters by hand. The machines that we are using are “r3.xlarge”, so they have the following specs:
*Instance Family *
Instance Storage (GB)
1 x 80
3. Data Model
The data model obviously will be the same for both databases because the process (YCSB Clients) that will run the benchmark is the same just modifying the connection URL.
The benchmark will be based on a table with 1 string field as primary key, a 10 more string fields filled with random characters.
CREATE TABLE usertable (YCSB_KEY varchar(100) PRIMARY KEY, FIELD0 varchar(255), FIELD1 varchar(255), FIELD2 varchar(255), FIELD3 varchar(255), FIELD4 varchar(255), FIELD5 varchar(255), FIELD6 varchar(255), FIELD7 varchar(255), FIELD8 varchar(255), FIELD9 varchar(255));
3.1. Isolation Level
LeanXcale DB supports two isolation levels: Read Committed (default) and Snapshot Isolation as to provide read consistent view of the database to all transactions. For this benchmark we are using the default Read Committed isolation level.
MariaDB will be set up with Read Committed as as well, which by default implements Repeatable Reads.
The workloads used in the benchmark have been selected to give a fair representation of behavior for both LeanXcale and MariaDB, and are not weighted in any way towards or against either.
No auto-commitThe auto-commit function for both LeanXcale and MariaDB is disabled. This is because we want to be able to manage operations in batch to improve performance.
In the benchmark, we use 5 workload scenarios, 4 one-operation scenarios, and 1 mixed-operations scenario.
INSERTstatements with batch size of 1000 rows:
SELECTstatements based on the primary key, getting
UPDATEstatements based on the primary key, updating
SELECTstatements from a random key till the end of the table, uniform distribution around
MagnitudeAs you can see, we are using an order of magnitude fewer inserts in MariaDB than in LeanXcale. The reason for this is that with 400000 rows, MariaDB would take too long to finish for a reasonable test. And if we were to use 40000 rows in LeanXcale, it would finish so quickly to show any meaningful information.
In the benchmark we collect Throughput (operations per second) and Average Latency (µs) of the operations launched.
All metrics are measured from the YCSB client side for fairness.
6.1. MariaDB Setup
The MariaDB setup follows the standard installation process, and is set up with the previously described data model.
The MariaDB JDBC driver (Java 8 connector) is named
mariadb-java-client-2.4.1.jar. We’ll put that file in the
In order to get fair results, the isolation level for MariaDB needs to match the LeanXcale, so MariaDB is configured to use
transaction-isolation = READ-COMMITTED in the
In order to get the best possible performance out of MariaDB, the memory usage distribution needs to be optimized for the the loads that we are going to execute, so we’ll set
key_buffer_size = 2G and
innodb_buffer_pool_size = 5G in the
server.cnf file. The
innodb_buffer_pool_size is specially useful for insert intensive loads.
MariaDB Galera Cluster is a synchronous multi-master cluster for MariaDB with XtraDB/InnoDB storage engines. It is installed and managed separately from MariaDB.
Each MariaDB instance needs to have the
galera.cnf file indicating the IP of the machine and all other machines in the cluster.
To start the cluster, the master needs to be started first with
galera_new_cluster on the master node, and then the slave nodes can be started.
7. LeanXcale Setup
LeanXcale is aimed towards clusters out of the box, so there is no special installation of a clustering tool. All that is required is to update the configuration inventory file to contain the names of the machines on which you are going to install the database.
The LeanXcale Java JDBC driver can be downloaded from the Drivers page, and is placed in the
7.2. Automatic clustering
When the driver is in place there is no need to change anything else. All the cluster configuration and load distribution is automated by LeanXcale.
Chart LegendIn the charts below, the Test column shows Number of Clients - Number of Threads. For example: 23/50 means 23 clients with 50 threads per client.
If we look at the throughput data as graphs, the picture becomes rather clear. As the load increases, it becomes increasingly important how the resources in a cluster are used.
8.1.1. READ Performance
READ operation shows how the performance of MariaDB remains static, with a slight dropoff as large client numbers create more overhead.
For LeanXcale, the case is clearly different: Performance continues to increase. As soon as we add clients, the performance goes up, and we can maintain stable perforamnce by increasing the number of threads.
8.1.2. UPDATE Performance
UPDATE operation, the picture is similar to
READ. MariaDB provides a reasonably stable performance, with a general slight increase in performance as clients and threads are added.
Meanwhile, LeanXcale peformance keeps rising as we add clients and threads.
8.1.3. READ_UPDATE Performance
In the case of the
READ_UPDATE mixed operation, unsurprisingly, we get a similar outcome as for
READ with some
UPDATE aspects. MariaDB performance is reasonably stable, but does not improve for increased loads.
LeanXCale performance increases as we add clients and maintain stability with more threads, and at very high loads we experience a slight dropoff.
8.1.4. LOAD Performance
LOAD operation workload shows a different pattern from the other cases.
For both databases, increasing the number of threads per client causes a performance decrease. But there is an important difference.
MariaDB has a performance peak at 10 threads, and this peak remains constant over increased number of clients.
LeanXcale on the other hand, also reaches a peak at 10 threads, but this peak increases as we increase the number of clients.
In practice this means that while the MariaDB cluster reaches a bottleneck, the LeanXcale cluster continues to be able to accept more load.
8.1.5. SCAN Performance
SCAN operation workload the performance gap between the databases is huge. This is an architectural problem.
MariaDB performance not only is low, it also is a flat line, because the load is not automatically distributed over the nodes. Even though we have manually distributed nodes over the machines, we still cannot increase performance.
LeanXcale on the other hand does all the balancing and load distribution work for itself, which allows us to keep the same familiar profile: as we add clients and threads, performance keeps going up.
8.2. Latency Performance
8.2.1. LOAD Latency
Load: Latency at this stage in both databases behaves in a very similar way, which says a lot about LeanXcale because it can reach great Throughput marks without increasing the latency.
8.2.4. READ_UPDATE Latency
Read, Update, Read_Update: At this stage the MariaDB latency is increasing with the number of clients-threads, besides LeanXcale latency keep constant. This is due to the fact that MariaDB is overloaded and while is not able to increase the throughput, latency is growing more and more.
8.2.5. SCAN Latency
Scan: The behaviour at previous scenarios now is shown much clear and multiplied by one magnitude order. MariaDB latency grows till points where is practically not usable, while LeanXcale keeps a constant controlled latency. This is related with the flow control mechanism and the scan-streaming processes implemented by LeanXcale.
9. The impact of architecture
As should be clear from the previous discussion, the differences between MariaDB and LeanXcale are an effect of how resources are managed and used.
9.1. Balancing and load distribution
9.1.1. MariaDB - master/slave architecture
A major drawback of the MariaDB architecture is that it uses a Master-Slave relationship. This means there is no method to automatically distribute load between the nodes. This has to be handled by the application.
In order to balance the load, you can manually spread the load by distributing the clients uniformly over the available machines. This does solve the balancing problem, but does not provide an increase in performance.
9.1.2. LeanXcale - automatic management
Since LeanXcale has the ability of add and remove nodes dynamically, load is automatically distributed. In addition to this, LeanXcale performance continues to increase as the number of clients is increased. Also, as threads are added to each client, the performance remains stable.
9.2. Cluster management
9.2.1. MariaDB - Galera clusters
A significant drawback is that a Galera Cluster configuration is static, so you need to stop and restart the database in order to re-size the cluster or make other changes.
There are additional complexities for starting and stopping Galera Clusters, since they are dependendent on the machine order. This may involve needing to modify configuration files if the cluster has gone to an inconsistent state.
9.3.1. MariaDB - master/slave
With the master/slave architecture, MariaDB, the mechanism becomes “do one write and send the information to other machines the cluster”. This mechanism generates a huge overhead.
The architecture also means that aggregated queries cannot be processed in parallel at cluster level.
10. KiVi - LeanXcale Direct Datastore API
By now, the benefits of using LeanXcale should be obvious. However, this is only the case of using a standard JDBC connection.
With KiVi, the LeanXcale Direct Datastore API, you can experience even greater performance increases as resources are added.
11. Total Cost of Ownership
When calculating the total cost of ownership of your database solution it is easy to ignore the impact of performance and latency. Hoever, the actual running costs over time can be very severely impacted by how the database operates.
In this benchmark, the different costs for MariaDB and LeanXcale are in definite contrast:
12. Time To Market
When developing your solution you want to move quickly from concept to market delivery. The time to market for any new solution is a very important, yet often overlooked aspect of the development decision process. Moving from development to scaling up to a finished product needs to be a smooth transition. Steep learning curves or slow deployment in the initial development can be as detrimental as complicated migrations or basic technology changes at a late stage, and both can have severely detrimental effects on the bottom line.
The ideal database provides a reasonable learning curve, with a versatile set of options and the possobility of deeper sopistication and specialization. You want to be able to do quickly, and then be able to scale up to production levels of complexity without too much of context switching for the developers.
12.1. TTM conclusions
LeanXcale lets you use the same database for prototyping, initial development, all the way to finished product, deployment and future upscaling. You can keep a single architecture, avoid migrations, and keep your team focused on a reduced set of tools. All of this while still keeping the needed flexibility and ability to change the solution according to market needs.
13. KiVi - Conclusions
When using KiVi, we skip transactions management and query language processing software. This means that we lose the isolation level, which is very useful for bulk load/read scenarios. The increase in performance is notable.
13.1. Thread peaks
While performance does grow continuously, there are local peaks (at 10 threads) as we increase the number of clients. This means that we are still far away from saturation point for the given machines.
14. The upsides of LeanXcale
LeanXcale provide major benefits with strong impact on important areas of interest:
a) Short time to market: With LeanXcale, you only need one database for all your needs. This means simpler architecture, fewer technologies to master while being able to easily adapt to changed requirements and workload needs.
b) Low total cost of ownership: We squeeze every single cycle of CPU, so you get better performance of the same machine, paying less at the end of the month for the same (better) service.
c) Extreme scalability: When you achieve success, LeanXcale will be ready to handle any workload, and you won’t need to change your architecture to fit even the most extreme performance demands.
The great flexibility and versatility of LeanXcale makes it perfect for implementation in a startup, without sacrificing the needs of major enterprises. Your database solution can be created quickly and allows for expansion, modification and growth in pace with your other development.
There is no need to make separate decisions for your quick-and-dirty test implementation and your finalized enterprise product. LeanXcale will follow you all the way.