
How to filter data 100X faster in Django using Redis

Published Dec 29, 2018

Lately I’ve been optimising the performance of an app built on Django and DRF. Performance was poor because multiple database calls were being made to the same table, which contained nearly static data (the data changed once a week or so). A good solution would have been to cache the entire queryset for the model; however, that was not quite suitable because each DB call applied a different set of filters to the model.

Factors contributing to bad performance

  1. Multiple database calls add network latency and put extra load on the database servers.

  2. There is no straightforward way to apply filters on a queryset once the entire queryset has been cached in Redis.

Model Mixin to our rescue

We’ll create a mixin for our Django models that adds the ability to cache querysets and apply relatively simple in-memory filters on them.

Note: This is a good option only if the table has a couple of thousand rows and the data stays relatively constant over a period of time. For larger tables (10k+ rows), it’s always better to let the powerful database servers and engines handle the filtering.

Let’s start by assuming a simple Django app (named ‘school_management’) is set up along with Redis (How to set up Redis).
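For reference, a minimal cache configuration could look like the sketch below. It assumes the django-redis backend and a local Redis instance; adjust LOCATION for your setup.

# settings.py -- minimal sketch assuming the django-redis backend
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",  # local Redis, DB 1 (assumption)
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        },
    }
}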

Create a file in the app directory named model_mixins.py with the following content.
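A minimal sketch of such a mixin, consistent with how it is used later in this post, is shown below. The class name, cache-key scheme, timeout, and the select_related call are assumptions; only the four public methods (get_all_from_cache, filter_from_cache, filter_related_from_cache, clear_cache) are dictated by the usage examples.

# school_management/model_mixins.py
from django.core.cache import cache


class CacheModelMixin:
    """Caches a model's full queryset in Redis (via Django's cache
    framework) and supports simple in-memory filtering on it."""

    CACHE_TIMEOUT = 60 * 60 * 24  # one day; tune to how static the data is

    @classmethod
    def _cache_key(cls):
        return "cached_queryset:%s" % cls.__name__.lower()

    @classmethod
    def get_all_from_cache(cls):
        """Return all rows, hitting the database only on a cache miss."""
        objects = cache.get(cls._cache_key())
        if objects is None:
            # select_related() follows non-null foreign keys so related
            # objects can be filtered in memory without extra queries.
            objects = list(cls.objects.select_related().all())
            cache.set(cls._cache_key(), objects, cls.CACHE_TIMEOUT)
        return objects

    @classmethod
    def filter_from_cache(cls, objects=None, **filters):
        """Filter cached objects on their own attributes.
        A list/tuple value behaves like Django's `__in` lookup."""
        objects = cls.get_all_from_cache() if objects is None else objects
        matched = []
        for obj in objects:
            for attr, expected in filters.items():
                value = getattr(obj, attr)
                if isinstance(expected, (list, tuple, set)):
                    if value not in expected:
                        break
                elif value != expected:
                    break
            else:  # no break: every filter matched
                matched.append(obj)
        return matched

    @classmethod
    def filter_related_from_cache(cls, objects=None, **filters):
        """Filter cached objects on attributes of a related object,
        e.g. school={"name": "MIT"}."""
        objects = cls.get_all_from_cache() if objects is None else objects
        matched = []
        for obj in objects:
            for relation, attrs in filters.items():
                related = getattr(obj, relation)
                if not all(getattr(related, k) == v for k, v in attrs.items()):
                    break
            else:
                matched.append(obj)
        return matched

    @classmethod
    def clear_cache(cls):
        """Drop the cached queryset so the next read repopulates it."""
        cache.delete(cls._cache_key())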

Let’s create a couple of quick models that use this mixin. You can find the model code here.
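A minimal sketch of the models, with field choices assumed for illustration:

# school_management/models.py -- minimal sketch; fields are assumptions
from django.db import models
from django.db.models.signals import post_save, post_delete

from .model_mixins import CacheModelMixin


class School(models.Model):
    name = models.CharField(max_length=100)


class Student(CacheModelMixin, models.Model):
    name = models.CharField(max_length=100)
    age = models.PositiveIntegerField()
    school = models.ForeignKey(School, on_delete=models.CASCADE)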

post_save.connect(clear_student_cache, sender=Student)
post_delete.connect(clear_student_cache, sender=Student)

These lines use Django signals to clear the cached data whenever the model’s data is updated or deleted. Here’s the listener that does the invalidation.
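A minimal sketch of that listener, assuming the clear_cache helper from the mixin sketch above (it would be defined before the connect calls):

def clear_student_cache(sender, instance, **kwargs):
    # Any save or delete on Student drops the cached queryset, so the
    # next read repopulates it from the database.
    Student.clear_cache()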

Here’s a quick reference for querying the cached data set using the mixin, alongside the corresponding Django queries.

# Get all objects
q_db = Student.objects.all()
q_cache = Student.get_all_from_cache()

# Simple attribute Filter
q_db = Student.objects.filter(name='John')
q_cache = Student.filter_from_cache(name='John')

# Filter using list
q_db = Student.objects.filter(age__in=[15,20,25])
q_cache = Student.filter_from_cache(age=[15,20,25])

# Filter using foreign key reference
q_db = Student.objects.filter(school__name='MIT')
q_cache = Student.filter_related_from_cache(school={"name": "MIT"})

# Chaining Filters
q_db = Student.objects.filter(
    age__in=[10,20], school__name='MIT'
)
q_cache = Student.filter_from_cache(age=[10,20])
q_cache = Student.filter_related_from_cache(
    q_cache, school={"name": "MIT"}
)

Conclusion

Using the above model mixin lets you cache a model’s queryset and perform in-memory filters on it. This boosts performance significantly, provided the cached model’s data is not updated very frequently and the volume of data is not huge.

I would love to hear your feedback on this approach: whether it helped boost your performance, whether there are potential improvements to this method, and so on.
