Comparing In-Memory Databases: Redis vs. MongoDB (Percona Memory Engine)

Published Jun 20, 2017

Originally, this post was featured in June on the ScaleGrid blog.

In this post, we will compare two of the most popular NoSQL in-memory databases: Redis and MongoDB (with an in-memory storage engine).

Redis is a popular in-memory data structure store, used primarily as a very fast database, a cache, or a message broker, among other things. Being in-memory, it is the data store of choice when response time trumps everything else.

MongoDB is an on-disk document store that provides a JSON interface to data and has a very rich query language. Known for its speed, efficiency, and scalability, it is currently the most popular NoSQL database. However, being an on-disk database, it can’t compare favorably to an in-memory database like Redis in terms of absolute performance. With the availability of in-memory storage engines for MongoDB, a more direct comparison becomes feasible.

Percona Memory Engine for MongoDB

Starting with version 3.0, MongoDB provides an API to plug in a storage engine of your choice. A storage engine, in the MongoDB context, is the component of the database responsible for managing how data is stored, both in memory and on disk. MongoDB supports an in-memory storage engine; however, it’s currently limited to the Enterprise edition of the product. In 2016, Percona released an open source in-memory engine for MongoDB Community Edition called the Percona Memory Engine for MongoDB. Like MongoDB’s in-memory engine, it too is a variation of the WiredTiger storage engine, but with no persistence to disk.

Advantages of Redis as a Cache

With an in-memory MongoDB storage engine in place, we have a level playing field between Redis and MongoDB. However, what is the need to compare the two? Let’s look at the advantages of each of them as a caching solution.

Let’s look at Redis first.

  • A well-known caching solution that excels at it.

  • Redis isn’t a plain cache solution – it provides advanced data structures that provide a lot of powerful ways to save and query data which can’t be achieved with a vanilla key-value cache.

  • Redis is fairly simple to set up, use, and learn.

  • Redis provides persistence should you choose to set it up. So cache warming in case of crashes is hassle free.

Redis has some disadvantages: it lacks built-in encryption on the wire, role-based access control (RBAC), and a seamless, mature clustering solution, and it can be a pain to deploy in large-scale cloud deployments.

Advantages of MongoDB as a Cache

  • MongoDB is a more traditional database with advanced data manipulation features (think aggregations and map-reduce) and a rich query language.

  • SSL, RBAC, and scale-out are built in.

  • If you are already using MongoDB as your primary database, then your operational and development costs drop as there would be just one database to learn and manage.

  • Look at this post from Peter Zaitsev on where the MongoDB in-memory engine might be a good fit.

One significant disadvantage of MongoDB with an in-memory engine is that it offers no persistence unless it is deployed as a replica set with persistence configured on the read replica(s).

In this post, we will focus on quantifying the performance differences between Redis and MongoDB. A qualitative comparison and operational differences will be covered in subsequent posts.

Redis vs. In-Memory MongoDB: Performance TL;DR

  • Redis performs considerably better for reads across all workloads, and better for writes as the load increases.

  • Even though MongoDB utilizes all the cores of the system, it gets CPU bound comparatively early. While it had compute available, it was better at writes than Redis.

  • Both databases are eventually compute bound. Even though Redis is single threaded, it (mostly) gets more done running on one core than MongoDB does while saturating all the cores.

  • Redis, for non-trivial data sets, uses a lot more RAM compared to MongoDB to store the same amount of data.

Configuration

The tool we used to measure performance was YCSB. We have been using YCSB to compare and benchmark performance of MongoDB on various cloud providers and for various configurations in the past. We assume a basic understanding of YCSB workloads and features in the test rig description.

  • Database instance type - AWS EC2 c4.xlarge featuring 4 cores, 7.5 GB memory, and enhanced networking to ensure we don’t have network bottlenecks.

  • Client Machine - AWS EC2 c4.xlarge in the same VPC as the database servers.

  • Redis – version 3.2.8 with AOF and RDB turned off. Standalone.

  • MongoDB – Percona Memory Engine based on MongoDB version 3.2.12. Standalone.

  • Network throughput: measured via iperf, as recommended by AWS.
    [Screenshot: iperf output]

Workload Details

  1. Insert Workload: 100% Write – 2.5 million records
  2. Workload A: Update heavy workload – 50%/50% Reads/Writes – 25 million operations
  3. Workload B: Read mostly workload – 95%/5% Reads/Writes – 25 million operations
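
The three workloads above can be written down as YCSB CoreWorkload properties. The following is a minimal sketch, not the exact files used for these runs: the record and operation counts and the read/write ratios come from the list above, while mapping "writes" to YCSB's `updateproportion` for Workloads A and B is an assumption.

```python
# Sketch: the workloads from the list above as YCSB CoreWorkload properties.
# Counts and ratios are taken from the post; treating "writes" as YCSB
# updates (updateproportion) for Workloads A and B is an assumption.
WORKLOADS = {
    "insert": {"recordcount": 2_500_000},
    "workload_a": {
        "operationcount": 25_000_000,
        "readproportion": 0.5,
        "updateproportion": 0.5,
    },
    "workload_b": {
        "operationcount": 25_000_000,
        "readproportion": 0.95,
        "updateproportion": 0.05,
    },
}


def to_properties(settings):
    """Render a settings dict as the lines of a YCSB .properties file."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


for name, settings in WORKLOADS.items():
    print(f"# {name}")
    print(to_properties(settings))
```

Each rendered block could be saved as a property file and passed to YCSB with `-P`, or the individual values could be supplied as `-p key=value` overrides on top of the stock workload files.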

Client Load

Throughput and latency were measured over incrementally increasing loads generated from the client. This was done by increasing the number of YCSB client load threads, starting at 8 and doubling at each step.
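
A load sweep like this can be scripted. The sketch below only builds the YCSB command lines; it does not run them. The upper bound of 128 threads is taken from the largest thread count mentioned in the results, and the host address and workload file path are placeholders rather than the values used in these tests.

```python
# Sketch: generate one YCSB "run" command per thread count in the sweep.
# Assumptions: the sweep stops at 128 threads (the largest count mentioned
# in the results); the host address and workload file are placeholders.

def thread_counts(start=8, stop=128):
    """Thread counts starting at `start` and doubling until `stop`."""
    counts = []
    n = start
    while n <= stop:
        counts.append(n)
        n *= 2
    return counts


def ycsb_run_command(binding, workload_file, threads, extra_props):
    """Build a YCSB run command line for one step of the sweep."""
    props = " ".join(f"-p {key}={value}" for key, value in extra_props.items())
    return f"bin/ycsb run {binding} -s -P {workload_file} -threads {threads} {props}"


for threads in thread_counts():
    print(ycsb_run_command("redis", "workloads/workloadb", threads,
                           {"redis.host": "10.0.0.10"}))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the same helper could be called with the `mongodb` binding and a `mongodb.url` property for the MongoDB side of the comparison.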

Results

Workload B Performance

Since the primary use case for in-memory databases is cache, let’s look at Workload B first.

Here are the throughput/latency numbers from the 25 million operations workload. The ratio of reads:writes was 95:5. This would be a representative cache reading workload.
[Figure: throughput/latency numbers for Workload B]

Observations during the run

  • For MongoDB, the CPU was saturated from 32 threads onwards: usage was above 300%, with idle percentages in the single digits.

  • For Redis, CPU utilization never crossed 95%. So Redis was consistently doing considerably better than MongoDB while running on a single thread, while MongoDB was saturating all the cores of the machine.

  • For Redis, at 128 threads, runs failed often with read timeout exceptions.
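
To put the two CPU observations on a common scale, here is a small sketch that converts them to a share of the whole machine. It assumes the percentages are reported top-style, where 100% means one fully busy core, which is what a figure above 300% on a 4-core instance suggests.

```python
# Sketch: express the observed CPU percentages as a share of the 4-core box.
# Assumption: percentages are top-style, i.e. 100% == one fully busy core.
CORES = 4  # c4.xlarge, per the configuration section


def share_of_machine(cpu_percent, cores=CORES):
    """Fraction of total machine compute represented by a CPU percentage."""
    return cpu_percent / (100 * cores)


print(f"MongoDB above 300%: more than {share_of_machine(300):.0%} of the machine")
print(f"Redis below 95%: less than {share_of_machine(95):.2%} of the machine")
```

Under that assumption, Redis was delivering its throughput on under a quarter of the available compute, while MongoDB was using more than three quarters of it.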

Workload A Performance

Here are the throughput/latency numbers from the 25 million operations workload. The ratio of reads:writes was 50:50.

[Figure: throughput/latency numbers for Workload A]

Observations during the run

  • For MongoDB, the CPU was saturated from 32 threads onwards: usage was above 300%, with idle percentages in the single digits.

  • For Redis, CPU utilization never crossed 95%.

  • For Redis, at 64 threads and above, runs failed often with read timeout exceptions.

Insert Workload Performance

Finally, here are the throughput/latency numbers from the 2.5 million record insertion workload. The number of records was selected to ensure that the total memory used in the case of Redis did not exceed 80% (since Redis is the memory hog of the two; see Appendix B).

[Figure: throughput/latency numbers for the insert workload]

Observations during the run

  • For MongoDB, the CPU was saturated from 32 threads onwards: usage was above 300%, with idle percentages in the single digits.

  • For Redis, CPU utilization never crossed 95%.

Appendices

A: Single Thread Performance

I had a strong urge to find out, even though it is not very useful in real-world conditions, which database would do better when the same load is applied to each from a single thread. That is, how would a single-threaded application perform?

[Figure: single-thread performance of Redis and MongoDB]

B: Database Size
By default, each record inserted by YCSB has 10 fields of 100 bytes each. Assuming each record to be around 1KB, the total expected size in memory would be upwards of 2.4GB. There was a stark contrast in the actual sizes seen in the two databases.
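
The expected size can be checked with a quick calculation. This sketch uses the record count from the insert workload and the YCSB defaults described above; it counts field values only, so keys, field names, and any per-record overhead are deliberately ignored.

```python
# Sketch: expected raw payload for the insert workload.
# Counts field values only; keys, field names, and database overhead
# are ignored, so real usage will be higher.
RECORDS = 2_500_000
FIELDS_PER_RECORD = 10
BYTES_PER_FIELD = 100

payload_bytes = RECORDS * FIELDS_PER_RECORD * BYTES_PER_FIELD


def gib(num_bytes):
    """Convert bytes to binary gigabytes (GiB)."""
    return num_bytes / 2**30


print(f"exact field payload:   {payload_bytes / 1e9:.2f} GB, {gib(payload_bytes):.2f} GiB")
print(f"rounded to 1 KiB each: {gib(RECORDS * 1024):.2f} GiB")
```

Rounding each record up to 1 KiB gives about 2.38 GiB, which matches the "upwards of 2.4GB" estimate.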

MongoDB
[Screenshot: MongoDB database size]

So the space taken is ~2.7GB. This is pretty close to what we expected.

Redis

Let’s look at Redis now.

[Screenshot: Redis memory usage]

At peak usage, Redis seems to be taking around 5.72GB of memory, i.e. roughly twice as much memory as MongoDB takes. This comparison may not be perfect because of the differences between the two databases, but the difference in memory usage is too large to ignore. YCSB inserts each record as a hash in Redis, and an index is maintained in a sorted set. Since an individual entry is larger than 64 bytes, the hash is encoded normally rather than in Redis's compact form, so there is no saving in space there. Redis performance comes at the price of an increased memory footprint.
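
The two measurements can be compared directly. This is a rough sketch using the figures quoted in this appendix and the instance memory from the configuration section; it assumes both sizes are reported in the same units.

```python
# Sketch: compare the measured footprints quoted in this appendix.
# Assumption: the MongoDB and Redis figures are in the same units (GB).
MONGODB_GB = 2.7
REDIS_GB = 5.72
INSTANCE_RAM_GB = 7.5  # c4.xlarge, per the configuration section

ratio = REDIS_GB / MONGODB_GB
ram_fraction = REDIS_GB / INSTANCE_RAM_GB

print(f"Redis uses {ratio:.2f}x the memory MongoDB uses")
print(f"Redis peak usage is {ram_fraction:.0%} of instance RAM")
```

The second figure is consistent with the record count having been chosen so that Redis stayed under 80% of the instance's memory.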

This, in our opinion, can be an important data point in choosing between MongoDB and Redis – MongoDB might be interesting for users who care about reducing their memory costs.

C: Network Throughput

An in-memory database server is liable to be either compute bound or network I/O bound. It was thus important throughout these tests to ensure that we were never network bound. Measuring network throughput while running application throughput tests adversely affects the overall throughput measurement, so we ran separate network throughput measurements using iftop at the thread counts at which the highest write throughputs were observed. This number was found to be around 440 Mbps for both Redis and MongoDB at their respective peak throughput. Given that our initial measurement of the maximum network bandwidth was around 1.29 Gbps, we are certain that we never hit the network bounds. In fact, it only supports the inference that if Redis could use multiple cores, we might get much better numbers.
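
The headroom implied by those two numbers is easy to work out. A minimal sketch, using only the approximate figures quoted above:

```python
# Sketch: network headroom at peak database throughput.
# Both inputs are the approximate figures quoted in this appendix.
PEAK_OBSERVED_MBPS = 440     # iftop, at peak write throughput
MAX_BANDWIDTH_MBPS = 1290    # iperf baseline, about 1.29 Gbps

utilization = PEAK_OBSERVED_MBPS / MAX_BANDWIDTH_MBPS
headroom_mbps = MAX_BANDWIDTH_MBPS - PEAK_OBSERVED_MBPS

print(f"link utilization at peak: {utilization:.0%}")
print(f"unused bandwidth: {headroom_mbps} Mbps")
```

At roughly a third of the measured link capacity, the network leaves ample room before it would become the bottleneck.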
