DjangoTricks: January 2015

Once you have your Django projects running, you come to situations, when you need to optimize for performance. The rule of thumb is to find the bottlenecks and then to take action to eliminate them by more idiomatic Python code, database denormalization, caching, or other techniques.

What is a bottleneck? Literally it refers to the top narrow part of a bottle. In engineering, bottleneck is a case where the performance or capacity of an entire system is limited by a single or small number of components or resources.

How to find these parts of your code? The most trivial way is to check the current time before specific code execution and after that code execution, and then count the time difference:

from datetime import datetime
start = datetime.now()
# heavy execution ...
end = datetime.now()
d = end - start  # datetime.timedelta object
print d.total_seconds()  # prints something like 7.861985

However, measuring code performance for Django projects like this is inefficient, because you need a lot of such wrappers for your code until you find which part is the most critical. Also you need a lot of manual computation to find the critical parts.

Recently I found line_profiler module that can inspect the performance of the code line by line. By default, to use line_profiler for your functions, you should decorate them with @profile decorator and then to execute the script:

$ kernprof -l some_script_to_profile.py

This script will execute your script, analize the decorated function, and will save results to a binary file that can later be inspected with:

$ python -m line_profiler some_script_to_profile.py.lprof

That's quite complicated, but to use line_profiler for Django views, you can install django-devserver which replaces the original development server of Django and will output the performance calculations immediately in the shell like this:

[30/Jan/2015 02:26:40] "GET /quotes/json/ HTTP/1.1" 200 137
    [sql] 1 queries with 0 duplicates
    [profile] Total time to render was 0.01s
    [profile] Timer unit: 1e-06 s

          Total time: 0.001965 s
          File: /Users/archatas/Projects/quotes_env/project/inspirational/quotes/views.py
          Function: quote_list_json at line 27

          Line #      Hits         Time  Per Hit   % Time  Line Contents
          ==============================================================
              27                                           def quote_list_json(request):
              28         1            2      2.0      0.1      quote_dict_list = []
              29         2         1184    592.0     60.3      for quote in InspirationQuote.objects.all():
              30         1            1      1.0      0.1          quote_dict = {
              31         1            1      1.0      0.1              'author': quote.author,
              32         1            1      1.0      0.1              'quote': quote.quote,
              33         1          363    363.0     18.5              'picture': quote.get_medium_picture_url(),
              34                                                   }
              35         1            1      1.0      0.1          quote_dict_list.append(quote_dict)
              36
              37         1           42     42.0      2.1      json_data = json.dumps(quote_dict_list)
              38         1          370    370.0     18.8      return HttpResponse(json_data, content_type="application/json")

The most interesting data in this table is the "% Time" column, giving an overview in percentage which lines of the Django view function are the most time-consuming. For example, here it says that I should pay the most attention to the QuerySet, the method get_medium_picture_url() and the HttpResponse object.

To setup line profiling, install line_profiler and django-devserver to you virtual environment:

(myproject_env)$ pip install line_profiler
(myproject_env)$ pip install django-devserver

Then make sure that you have the following settings in your settings.py or local_settings.py:

# settings.py
INSTALLED_APPS = (
    # ...
    'devserver',
)

MIDDLEWARE_CLASSES = (
    # ...
    'devserver.middleware.DevServerMiddleware',
)

DEVSERVER_MODULES = (
    'devserver.modules.sql.SQLRealTimeModule',
    'devserver.modules.sql.SQLSummaryModule',
    'devserver.modules.profile.ProfileSummaryModule',

    # Modules not enabled by default
    'devserver.modules.profile.LineProfilerModule',
)

DEVSERVER_AUTO_PROFILE = True  # profiles all views without the need of function decorator

When you execute

(myproject_env)$ python manage.py runserver

it will run the development server from django-devserver and for each visited view, it will show the analysis of code performance. I have tested this setup with Django 1.7, but it should work since Django 1.3.

Do you know any more useful tools to check for performance bottlenecks?

DjangoTricks

2015-01-30

How to Find the Performance Bottlenecks in Your Django Views?