My latest work project has involved writing a custom Django API from scratch. Because of its extensive business logic and front-end requirements, something like Django REST Framework wasn’t really a great option. I learned a great deal about the finer points of Django performance while building an API capable of returning thousands of results quickly.
I’ve consolidated some of my tips below.
Be careful using model managers, especially when working with Django Prefetch data. Any operation your manager performs on prefetched data (exclude, order_by, filter, etc.) bypasses the prefetch cache and incurs an additional lookup query.
Do everything you can with properly written models, queries, and Prefetch objects. Once you start processing the results in Python, you will significantly degrade the performance of your application.
Django is fast. Databases are fast. Python is slow.
select_related and prefetch_related
Learning to use select_related and prefetch_related will save you a ton of time and debugging. It will also improve your query speeds! As I mentioned above, be careful mixing Model managers with these utilities - also, whenever you begin introducing multiple relationships in a query, you will want to use distinct() and order_by(). Having said that…
distinct() gotchas
If you are using advanced Django queries that span multiple relationships, you may notice that duplicate rows are returned. No problem, we’ll just call .distinct() on the queryset, right?
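If you only call distinct(), and you forget to call order_by() on your queryset, you will still receive duplicate results! Any fields used for ordering, including a default ordering that spans a relationship, are added to the SELECT columns, so rows that differ only in a related value still count as distinct. This is a known Django “thing” - beware.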
"When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order."
- Django Docs
You can’t fix what you don’t measure. Make sure DEBUG=True in your Django settings.py file, and then drop this snippet into your code to output the queries being run.