2017-03-01

Tracking the Results of Cron Jobs

Every Django website needs some automatic background tasks to execute regularly. The outdated sessions need to be cleaned up, search index needs to be updated, some data needs to be imported from RSS feeds or APIs, backups need to be created, you name it. Usually, if not all the time, those regular tasks are being set as cron jobs. However, when some task is run in the background, by default, you don't get any feedback whether it was successfully completed, or whether it crashed on the way. In this post I will show you how I handle the results of cron jobs.

In a Django project, all those tasks are usually implemented as management commands. For each such command I write a short bash script, that will call the management command with specific parameters and will print the verbose output to a log file.

Let's say my project structure is like this on a remote server:

/home/myproject
├── bin
├── include
├── lib
├── public_html
├── backups
├── project
│   └── myproject
├── scripts
└── logs

A virtual environment is created in the home directory of myproject linux user. The Django project itself is kept under project directory. The scripts directory is for my bash scripts. And the logs directory is for the verbose output of the bash scripts.

For example, for the default clearsessions command that removes outdated sessions, I would create scripts/cleanup.sh bash script as follows:

#!/usr/bin/env bash
SECONDS=0
PROJECT_PATH=/home/myproject
CRON_LOG_FILE=${PROJECT_PATH}/logs/cleanup.log

echo "Cleaning up the database" > ${CRON_LOG_FILE}
date >> ${CRON_LOG_FILE}

cd ${PROJECT_PATH}
source bin/activate
cd project/myproject    
python manage.py clearsessions --verbosity=2 --traceback >> ${CRON_LOG_FILE}  2>&1

echo "Finished." >> ${CRON_LOG_FILE}
duration=$SECONDS
echo "$(($duration / 60)) minutes and $(($duration % 60)) seconds elapsed." >> ${CRON_LOG_FILE}

To run this command every night at 1 AM, you could create a file myproject_crontab with the following content:

MAILTO=""
00 01 * * * /home/myproject/scripts/cleanup.sh

Then register the cron jobs with:

$ crontab myproject_crontab

By such a bash script, I can track:

  • At what time the script was last executed.
  • What is the verbose output of the management command.
  • If the management command broke, what was in the traceback.
  • Whether the command finished executing or hung up.
  • How long it took to run the command.

In addition, this gives me information whether the crontab was registered and whether the cron service was running at all. As I get the total time of execution in minutes and seconds, I can decide how often I can call the cron job regularly so that it doesn't clash with another cron job.

When you have multiple Django management commands, you can group them thematically into single bash script, or you can wrap them into individual bash scripts. After putting them into the crontab, the only thing left is manually checking the logs from time to time.

If you have any suggestions how I could even improve this setup, I would be glad to hear your opinion in the comments.

Here is the Gist of the scripts in this post. To see some examples of custom Django management commands, you can check Chapter 9, Data Import and Export in my book Web Development with Django Cookbook - Second Edition.


Cover photo by Redd Angelo