DynamoDB vs LeanXcale

1. Introduction

LeanXcale is a versatile database that combines SQL functionality with NoSQL characteristics. This allows for a range of benefits in flexibility, scalability and maintenance.

NoSQL databases are widely used, and the key-value datastore is probably the most used of all the different types. Key-value databases are designed for storing, retrieving and managing associative arrays.

Records are stored and retrieved using a key that uniquely identifies the record and is used to quickly find the data within the database. These databases are used in many scenarios, such as cache matching, real-time pricing or IoT.

Amazon DynamoDB is a popular key-value database. It is easy to use, fully managed and integrated within Amazon Web Services (AWS).

1.1. LeanXcale KiVi

LeanXcale provides a key-value database engine called KiVi. KiVi is a low-latency key-value datastore that provides fast bulk insertion, update, get and scan operations. It has been designed and implemented after many years of research on distributed systems.

1.2. Benchmark

In this article we compare the performance of LeanXcale, accessed through KiVi (a Java API that connects directly to the datastore engine), with that of DynamoDB, using fair workload scenarios. We will show the marked improvements you can experience when choosing LeanXcale over DynamoDB.

1.3. Benchmarking Tool

We use YCSB, the Yahoo! Cloud Serving Benchmark, an open-source specification and program suite commonly used to compare the performance of NoSQL database systems.

1.4. Method

The comparison is made by generating an emulated workload with YCSB and letting the databases connect over JDBC.

1.5. Setup

As this is a benchmark article and not a tutorial, we will not go through in detail how to set up the databases and clusters. We’ll point out relevant differences where needed.

2. Data Model

The data model is the same for both databases, because the same processes (the YCSB clients) run the benchmark against each of them and only the connection URL changes.

The benchmark is based on a table with 1 string field as the primary key and 10 more string fields filled with random characters.

Example 1. Table creation
CREATE TABLE usertable (
    YCSB_KEY varchar(100) PRIMARY KEY,
    FIELD0 varchar(255), FIELD1 varchar(255),
    FIELD2 varchar(255), FIELD3 varchar(255),
    FIELD4 varchar(255), FIELD5 varchar(255),
    FIELD6 varchar(255), FIELD7 varchar(255),
    FIELD8 varchar(255), FIELD9 varchar(255)
);
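To make the row shape concrete, the following Python sketch builds one record with a string key and 10 fields of random characters. The field length of 100 characters and the "user" key prefix are assumptions (they mirror common YCSB defaults and are not stated in this article), and `make_record` is a hypothetical helper, not part of YCSB.

```python
import random
import string

FIELD_COUNT = 10      # matches FIELD0..FIELD9 in the table definition
FIELD_LENGTH = 100    # assumption: not stated in the article


def make_record(key_number, rng):
    """Build one row: a string key plus FIELD_COUNT random string fields."""
    record = {"YCSB_KEY": f"user{key_number}"}
    for i in range(FIELD_COUNT):
        record[f"FIELD{i}"] = "".join(
            rng.choice(string.ascii_letters) for _ in range(FIELD_LENGTH)
        )
    return record


rng = random.Random(42)  # fixed seed so the sketch is reproducible
row = make_record(1, rng)
payload_chars = sum(len(value) for value in row.values())
print(len(row), payload_chars)
```

Under these assumptions a row carries roughly 1 KB of character data, which is in line with the average tuple size reported later in the article.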

2.1. Isolation Level

LeanXcale DB supports two isolation levels, Read Committed (default) and Snapshot Isolation, to provide a read-consistent view of the database to all transactions. For this benchmark we use the default Read Committed isolation level.

3. Workloads

The workloads used in the benchmark have been selected to give a fair representation of behavior for both LeanXcale and DynamoDB, and are not weighted in any way towards or against either.

No auto-commit
The auto-commit function for LeanXcale is disabled, because we want to manage operations in batches to improve performance.
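The idea of committing once per batch rather than once per row can be sketched as follows. This is only an illustration: it uses Python's built-in `sqlite3` module as a stand-in database (not LeanXcale or its driver), a reduced two-column table, and a hypothetical `load_in_batches` helper. The batch size of 1000 is the one used in the Load scenario.

```python
import sqlite3

BATCH_SIZE = 1000  # batch size used by the Load scenario


def load_in_batches(conn, rows, batch_size=BATCH_SIZE):
    """Insert rows and commit once per batch instead of once per row."""
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany(
            "INSERT INTO usertable (YCSB_KEY, FIELD0) VALUES (?, ?)", batch
        )
        conn.commit()  # one commit per batch
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")  # stand-in database, not LeanXcale
conn.execute("CREATE TABLE usertable (YCSB_KEY TEXT PRIMARY KEY, FIELD0 TEXT)")
rows = [(f"user{i}", "x" * 10) for i in range(2500)]
commits = load_in_batches(conn, rows)
stored = conn.execute("SELECT COUNT(*) FROM usertable").fetchone()[0]
print(commits, stored)
```

With 2500 rows and a batch size of 1000, the loader issues three commits instead of 2500, which is the kind of saving that disabling auto-commit is meant to enable.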

3.1. Scenarios

In the benchmark, we use 5 workload scenarios: 4 single-operation scenarios and 1 mixed-operation scenario.

3.1.1. Single

  • Load: pure INSERT statements with batch size of 1000 rows:

    • LeanXcale: 400000 rows

    • DynamoDB: 40000 rows

  • Read: pure SELECT statements based on the primary key, getting 50000 rows.

  • Update: pure UPDATE statements based on the primary key, updating 50000 rows.

  • Scan: pure SELECT statements from a random key till the end of the table, uniform distribution around 1000 rows

3.1.2. Mixed

  • Read + Update: mixed SELECT with primary key, and UPDATE statements, 25000 rows for each one.

3.2. Clients

4. Metrics

In the benchmark we collect Throughput (operations per second) and Average Latency (µs) of the operations launched.

All metrics are measured from the YCSB client side for fairness.

5. Setup

5.1. DynamoDB Setup

To get a fair benchmark, we need to configure DynamoDB so that its performance is comparable to that of the KiVi installation:

$\text{write operations} = 35000\ \text{ops/sec}$

$\text{read operations} = 49000\ \text{ops/sec}$

5.1.1. Dimensioning Amazon DynamoDB

DynamoDB has an option to deploy an instance with a specific provisioned read and write capacity, which disables the auto-scaling functionality of the system. This option is used to get a DynamoDB instance with enough capacity to run the YCSB benchmark.

To find the most appropriate values for the provisioned capacity, we run KiVi on c5d.xlarge AWS instances. This instance type has 4 vCPUs and 8 GB of RAM; however, only 6 GB are assigned to KiVi and the rest is reserved for the operating system.

We run the YCSB client on c5.2xlarge AWS instances (8 vCPUs and 16 GB of memory) using different numbers of YCSB client instances and threads. The workload used for this benchmark is 49.99% read, 50% write and 0.01% scan operations, retrieving 2,000 tuples per scan. In YCSB, each tuple has an average size of 1.1 KB, and the key is a string. The evaluation starts by populating the database with 400,000 tuples, and the benchmark runs 500,000 operations in total. The benchmark is run with 1, 2, 4, 6 and 10 YCSB client instances and 1, 2, 5, 10, 20 and 30 threads.

Operation          Share
Read               49.99%
Update             20%
Scan               0.01%
Insert             10%
Read and Modify    20%
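The operation mix above can be expressed as a weighted random choice. The sketch below is illustrative only: `next_operation` and `write_share` are hypothetical helpers, and it assumes that Update, Insert and Read and Modify are the operations counted as writes, which is the grouping that reproduces the 50% write share quoted for this workload.

```python
import random

# Operation mix from the workload description, in percent.
WORKLOAD_MIX = {
    "read": 49.99,
    "update": 20.0,
    "scan": 0.01,
    "insert": 10.0,
    "read_modify_write": 20.0,
}

# Assumption: these are the operations counted as writes.
WRITE_OPS = {"update", "insert", "read_modify_write"}


def write_share(mix):
    """Percentage of operations that modify data."""
    return sum(pct for op, pct in mix.items() if op in WRITE_OPS)


def next_operation(rng, mix=WORKLOAD_MIX):
    """Pick the next operation with probability proportional to its share."""
    ops = list(mix)
    return rng.choices(ops, weights=[mix[op] for op in ops], k=1)[0]


rng = random.Random(7)  # fixed seed so the sketch is reproducible
sample = [next_operation(rng) for _ in range(10_000)]
read_fraction = sample.count("read") / len(sample)
print(write_share(WORKLOAD_MIX), read_fraction)
```

Over a long enough run, roughly half of the generated operations are reads and half are writes, with scans appearing only occasionally.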

With this, we can compute the DynamoDB capacity required to run the YCSB benchmark.

$\text{write operations} = 35000\ \text{ops/sec} \times 50\% = 17500\ \text{ops/sec}$

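The YCSB tuple size is 1.1 KB, which exceeds the 1 KB covered by a single DynamoDB write capacity unit. Each write therefore consumes 2 units, so DynamoDB needs at least 2 times the number of write operations computed above.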
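The sizing arithmetic can be sketched in a few lines of Python. It assumes the standard DynamoDB rule that one write capacity unit covers one write per second of an item up to 1 KB, with larger items rounded up to the next whole unit. `write_capacity_units` is a hypothetical helper, not an AWS API.

```python
import math

WCU_ITEM_KB = 1.0  # one write capacity unit covers a write of up to 1 KB


def write_capacity_units(writes_per_sec, item_size_kb):
    """Provisioned write capacity units needed for a steady write rate."""
    units_per_write = math.ceil(item_size_kb / WCU_ITEM_KB)
    return writes_per_sec * units_per_write


total_ops_per_sec = 35_000   # total throughput used in the computation above
write_fraction = 0.5         # write share of the workload
item_size_kb = 1.1           # average YCSB tuple size

writes_per_sec = int(total_ops_per_sec * write_fraction)
required_wcu = write_capacity_units(writes_per_sec, item_size_kb)
print(writes_per_sec, required_wcu)
```

Under these assumptions the 17500 writes per second translate into 35000 write capacity units, which appears to be where the write figure at the start of this section comes from.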

5.2. LeanXcale Setup

LeanXcale is aimed towards clusters out of the box, so there is no special installation of a clustering tool. All that is required is to update the configuration inventory file to contain the names of the machines on which you are going to install the database.

5.2.1. JDBC

The LeanXcale JDBC Driver is bundled by default with the distribution package elasticdriver-<version>-jar-with-dependencies.jar, and is placed in the ycsb-<version>/jdbc-binding/lib directory.

5.2.2. Automatic clustering

When the driver is in place there is no need to change anything else. All the cluster configuration and load distribution is automated by LeanXcale.

5.2.3. YCSB clients

When the cluster is up and running, YCSB Clients can be started, using the DynamoDB JDBC client as interface.

6. Results

6.1. Throughput results

The following chart shows the throughput obtained with KiVi and DynamoDB using the YCSB benchmark.

[Figure: DynamoDB READ throughput]

[Figure: KiVi READ throughput]

We can see that, even with the provisioned DynamoDB capacity, YCSB is not able to run more than 16500 ops/sec in the best scenario. We have also checked that the capacity of the DynamoDB instance is not fully used. The limit is due to the latency that YCSB experiences.

With KiVi, we get more than 35,000 YCSB ops/sec. KiVi reaches the maximum performance of the system with just 5 threads per client, thanks to its low latency. The YCSB clients run synchronous operations on each thread. Consequently, if an operation takes less time, the thread can start the next operation sooner. With DynamoDB, the maximum throughput is only reached with 10 or even 20 threads per client.
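The relationship between latency, thread count and throughput for synchronous clients can be approximated with a simple closed-loop model: each thread completes at most one operation per latency interval. The sketch below uses hypothetical latencies and client counts chosen only to illustrate the effect; they are not measurements from this benchmark, and both function names are made up for the example.

```python
def closed_loop_throughput(clients, threads_per_client, avg_latency_s):
    """Upper bound on ops/sec when each thread runs one synchronous operation at a time."""
    return clients * threads_per_client / avg_latency_s


def threads_needed(target_ops_per_sec, clients, avg_latency_s):
    """Threads per client required to reach a target rate at a given latency."""
    return target_ops_per_sec * avg_latency_s / clients


# Hypothetical latencies, chosen only to illustrate the effect.
fast = closed_loop_throughput(clients=4, threads_per_client=5, avg_latency_s=0.0005)
slow = closed_loop_throughput(clients=4, threads_per_client=5, avg_latency_s=0.005)
print(fast, slow, threads_needed(20_000, clients=4, avg_latency_s=0.005))
```

In this model, a tenfold higher latency needs ten times as many threads to sustain the same rate, which is consistent with the lower-latency system saturating with fewer threads per client.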

6.2. Latency - YCSB Response time

The second objective of this benchmark is to compare the latency of the operations with DynamoDB and with KiVi. We will show the average response time, the 95th percentile and the 99th percentile. The 99th percentile tells us whether the system is stable, that is, whether it avoids high latencies regardless of the current load of the system.

In these charts we can see that the percentiles of both systems are good. In both cases, the 99th percentile is around 4 or 5 times the average response time. In addition, the 95th percentile is similar to the average in most cases. That means that most queries have a latency close to the average.

However, the latency of KiVi is much better than DynamoDB with all the queries.
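For readers who want to reproduce these statistics from raw measurements, the following sketch computes the average and the 95th and 99th percentiles using the nearest-rank method. The sample data is hypothetical, and YCSB itself may compute percentiles differently (for example from histograms), so this is only an illustration of what the three numbers mean.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_us):
    """Average, 95th and 99th percentile of latencies given in microseconds."""
    return {
        "avg": sum(latencies_us) / len(latencies_us),
        "p95": percentile(latencies_us, 95),
        "p99": percentile(latencies_us, 99),
    }


# Hypothetical samples: mostly fast operations with a small slow tail.
samples = [100] * 96 + [400] * 3 + [2000]
stats = summarize(samples)
print(stats)
```

In this made-up sample the 95th percentile stays close to the average while the 99th percentile is a few times higher, which is the pattern described for both systems.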

6.2.1. READ operation

[Figure: DynamoDB READ response time]

[Figure: KiVi READ response time]

For read operations, KiVi is around 10 times faster than DynamoDB, giving response times lower than 0.5 ms in all cases. DynamoDB has single-digit millisecond latencies, while KiVi responds in less than 50 µs.

$\text{read operations} = 49000\ \text{ops/sec}$

6.2.2. INSERT operations

[Figure: DynamoDB INSERT response time]

[Figure: KiVi INSERT response time]

Something similar happens with inserts and updates. YCSB handles both operations as inserts for key-value databases, so the response time is similar for insert and update operations.

The response time of KiVi is very low because of its cache system: KiVi pre-stores all new tuples before traversing the tree and putting them in their right place.

6.2.3. SCAN operations

[Figure: DynamoDB SCAN response time]

[Figure: KiVi SCAN response time]

KiVi is around 10 times faster than DynamoDB for scan operations. Each scan chooses a random start key and reads the following 2,000 tuples of the database. The main reason for the difference is that DynamoDB is not optimized for scan operations, while KiVi is ready for scan operations, even when applying filters or aggregations.
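What a scan asks of the datastore can be illustrated with a small in-memory model: locate a start key in an ordered key set and read the next 2,000 entries. This sketch uses Python's `bisect` module on a sorted list. It says nothing about how KiVi or DynamoDB implement scans internally, and the key format is an assumption.

```python
import bisect
import random


def scan(sorted_keys, start_key, limit):
    """Return up to `limit` keys in order, starting at the first key >= start_key."""
    start = bisect.bisect_left(sorted_keys, start_key)
    return sorted_keys[start:start + limit]


# Assumption: zero-padded keys so that string order matches numeric order.
keys = sorted(f"user{i:06d}" for i in range(10_000))
rng = random.Random(3)  # fixed seed so the sketch is reproducible
start_key = rng.choice(keys)
result = scan(keys, start_key, limit=2000)
print(start_key, len(result))
```

When keys are stored in order, the scan is a single lookup followed by a sequential read; a store that does not keep keys ordered this way has to do more work per scan.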

7. Total Cost of Ownership

When calculating the total cost of ownership of your database solution, it is easy to ignore the impact of performance and latency. However, the actual running costs over time can be severely impacted by how the database operates.

In this benchmark, the different costs for DynamoDB and LeanXcale are in definite contrast:

Table 1. Operating cost
DB           ops                               cost/month
DynamoDB     ~49000 READ/s, 35000 WRITES/s     $21665
LeanXcale    35000                             $514

7.1. TCO conclusions

As LeanXcale provides greater efficiency and much better performance, the cost is decidedly lower for the same or better service levels.

Due to these performance efficiencies, LeanXcale is about 42 times cheaper than a DynamoDB solution.

8. Time To Market

When developing your solution you want to move quickly from concept to market delivery. The time to market for any new solution is a very important, yet often overlooked aspect of the development decision process. Moving from development to scaling up to a finished product needs to be a smooth transition. Steep learning curves or slow deployment in the initial development can be as detrimental as complicated migrations or basic technology changes at a late stage, and both can have severely detrimental effects on the bottom line.

The ideal database provides a reasonable learning curve, with a versatile set of options and the possibility of deeper sophistication and specialization. You want to be able to get started quickly, and then scale up to production levels of complexity without too much context switching for the developers.

8.1. TTM conclusions

LeanXcale lets you use the same database for prototyping, initial development, all the way to finished product, deployment and future upscaling. You can keep a single architecture, avoid migrations, and keep your team focused on a reduced set of tools. All of this while still keeping the needed flexibility and ability to change the solution according to market needs.

8.2. Total cost of ownership

A very important factor to take into consideration for any database solution is the impact of performance on cost.

8.2.1. DynamoDB

The cost of DynamoDB for the provisioned capacity we need is $19,613.47/month, including everything needed to have DynamoDB operational.

8.2.2. KiVi

The cost of KiVi depends mainly on the AWS instance used. For this benchmark, we used the c5d.xlarge, which has 100 GB of storage. The total cost of the instance is $142.80/month.

DynamoDB offers a replicated system. KiVi's replication is costless in terms of latency, but it requires another instance. LeanXcale is a license-based product: on AWS, the license costs an extra 75% on top of the price of the AWS instance. In total, the KiVi deployment costs $499.97/month.
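The KiVi cost figure can be approximately reconstructed from the numbers given here. The sketch assumes two instances (the primary plus one replica) and a license surcharge of 75% applied to the total instance cost. `kivi_monthly_cost` is a hypothetical helper written for this example.

```python
def kivi_monthly_cost(instance_cost, instances, license_surcharge):
    """Instance cost for all nodes plus a license fee expressed as a fraction of it."""
    infrastructure = instance_cost * instances
    return infrastructure * (1 + license_surcharge)


instance_cost = 142.8     # c5d.xlarge, $/month, as quoted in this section
instances = 2             # assumption: primary plus one replica
license_surcharge = 0.75  # license priced at 75% of the instance cost

total = kivi_monthly_cost(instance_cost, instances, license_surcharge)
print(round(total, 2))
```

With these inputs the result is about $499.80/month, slightly below the quoted total. The small gap is presumably due to rounding of the instance price in the text.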

9. The upsides of LeanXcale

LeanXcale provides major benefits with a strong impact on important areas of interest:

a) Short time to market: With LeanXcale, you only need one database for all your needs. This means simpler architecture, fewer technologies to master while being able to easily adapt to changed requirements and workload needs.

b) Low total cost of ownership: We squeeze every single cycle of CPU, so you get better performance of the same machine, paying less at the end of the month for the same (better) service.

c) Extreme scalability: When you achieve success, LeanXcale will be ready to handle any workload, and you won’t need to change your architecture to fit even the most extreme performance demands.

9.1. Suitability

The great flexibility and versatility of LeanXcale make it well suited to a startup, without sacrificing the needs of major enterprises. Your database solution can be created quickly and allows for expansion, modification and growth in pace with your other development.

There is no need to make separate decisions for your quick-and-dirty test implementation and your finalized enterprise product. LeanXcale will follow you all the way.