2013-12-18

How to Export Your Data as CSV, XLS, or XLSX

There are times, when you need to export the data from your database to different formats. For example, you want to create some diagrams in Office program for a presentation. In this post I will show you how to create admin actions which export selected items as files for a spreadsheet application (like MS Excel, OpenOffice Calc, LibreOffice Calc, Gnumeric, or Numbers). I will cover the mostly used formats: Comma Separated Values (CSV), Binary Excel (XLS), and Office Open XML (XLSX).

First of all, have a look at the model we will be dealing with. It's a simple model with title, description, and - of course - the id.

# models.py
from django.db import models
class MyModel(models.Model):
    title = models.CharField(max_length=100)
    description = models.TextField(blank=True)

In the admininstration options, we'll define three admin actions: export_csv, export_xls, and export_xlsx.

# admin.py
from django.contrib import admin
from models import MyModel

# ... export functions will go here ...

class MyModelAdmin(admin.Admin):
    actions = [export_csv, export_xls, export_xlsx]

admin.site.register(MyModel, MyModelAdmin)

Now let's create functions for each of those actions!

Comma-Separated Values Format

CSV is the most common import and export format for spreadsheets and databases. It's a textual format which one could easily create or parse himself, but there is also a python built-in library csv for handy data manipulation.

def export_csv(modeladmin, request, queryset):
    import csv
    from django.utils.encoding import smart_str
    response = HttpResponse(mimetype='text/csv')
    response['Content-Disposition'] = 'attachment; filename=mymodel.csv'
    writer = csv.writer(response, csv.excel)
    response.write(u'\ufeff'.encode('utf8')) # BOM (optional...Excel needs it to open UTF-8 file properly)
    writer.writerow([
        smart_str(u"ID"),
        smart_str(u"Title"),
        smart_str(u"Description"),
    ])
    for obj in queryset:
        writer.writerow([
            smart_str(obj.pk),
            smart_str(obj.title),
            smart_str(obj.description),
        ])
    return response
export_csv.short_description = u"Export CSV"

As you can see, HttpResponse is a file-like object and we used it to write data to.

Excel Binary File Format

XLS is the main spreadsheet format which holds data in worksheets, charts, and macros. We are going to use xlwt library to create a spreadsheet. There is analogous library xlrd to read XLS files. Note, that this format allows to have only 256 columns.

def export_xls(modeladmin, request, queryset):
    import xlwt
    response = HttpResponse(mimetype='application/ms-excel')
    response['Content-Disposition'] = 'attachment; filename=mymodel.xls'
    wb = xlwt.Workbook(encoding='utf-8')
    ws = wb.add_sheet("MyModel")
    
    row_num = 0
    
    columns = [
        (u"ID", 2000),
        (u"Title", 6000),
        (u"Description", 8000),
    ]

    font_style = xlwt.XFStyle()
    font_style.font.bold = True

    for col_num in xrange(len(columns)):
        ws.write(row_num, col_num, columns[col_num][0], font_style)
        # set column width
        ws.col(col_num).width = columns[col_num][1]

    font_style = xlwt.XFStyle()
    font_style.alignment.wrap = 1
    
    for obj in queryset:
        row_num += 1
        row = [
            obj.pk,
            obj.title,
            obj.description,
        ]
        for col_num in xrange(len(row)):
            ws.write(row_num, col_num, row[col_num], font_style)
            
    wb.save(response)
    return response
    
export_xls.short_description = u"Export XLS"

Here we created one worksheet, filled it with data, marked the first row in bold, and made the lines in the other cells wrapped. Also we set the width for each column. We'll do the same in the next format too.

Office Open XML Format

XLSX (a.k.a. OOXML or OpenXML) is a zipped, XML-based file format developed by Microsoft. It is fully supported by Microsoft Office 2007 and newer versions. OpenOffice 4.0, for example, can only read it. There is a python library openpyxl for reading and writing those files. This format is great when you need more than 256 columns and text formatting options.

def export_xlsx(modeladmin, request, queryset):
    import openpyxl
    from openpyxl.cell import get_column_letter
    response = HttpResponse(mimetype='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
    response['Content-Disposition'] = 'attachment; filename=mymodel.xlsx'
    wb = openpyxl.Workbook()
    ws = wb.get_active_sheet()
    ws.title = "MyModel"

    row_num = 0

    columns = [
        (u"ID", 15),
        (u"Title", 70),
        (u"Description", 70),
    ]

    for col_num in xrange(len(columns)):
        c = ws.cell(row=row_num, column=col_num)
        c.value = columns[col_num][0]
        c.style.font.bold = True
        # set column width
        ws.column_dimensions[get_column_letter(col_num+1)].width = columns[col_num][1]

    for obj in queryset:
        row_num += 1
        row = [
            obj.pk,
            obj.title,
            obj.description,
        ]
        for col_num in xrange(len(row)):
            c = ws.cell(row=row_num, column=col_num)
            c.value = row[col_num]
            c.style.alignment.wrap_text = True

    wb.save(response)
    return response

export_xlsx.short_description = u"Export XLSX"

Conclusion

So whenever you need to get your Django project data to some spreadsheet application, there are several ways to do that. If you are planning to import the data to some other database, CSV is probably the best, as it is simple, straightforward, and requires no third-party libraries. However, if you need your data with nice formatting and maybe some statistical formulas, you should export XLS or XLSX format. The maximum amount of columns in XLS format is limited to 256, whereas XLSX format allows more columns, but is not fully supported by all spreadsheet applications.

6 comments:

  1. Can you give some pointers on how to import data into database from CSV file?

    ReplyDelete
  2. Excellent, thank you! One small correction though, for the Excel Binary Format code, I believe "for col_num in xrange(len(row)):" should be "for col_num in xrange(len(columns)):" instead (the line below col_styles = []).

    Thanks again dude.

    ReplyDelete
  3. These snippets were created based on the code for Django 1.3, but they should also work on Django 1.6.

    ReplyDelete
  4. In section "Office Open XML Format", if you want to create a new XLSX file, you should modify the code line
    c = ws.cell(row=row_num, column=col_num)
    to
    c = ws.cell(row=row_num+1, column=col_num+1)
    Otherwise, you are able to get the error "Invalid column 0 index".

    ReplyDelete