2009-02-19

Abstract Models and Dynamically Assigned Foreign Keys

Model inheritance enables creating extensible apps. You can define a reusable core app which includes base.py with abstract models and models.py with models that extend the abstract ones and inherit all their features. In a specific project you can either use the core app directly, or create a specific app whose models extend the base abstract models of the core app and introduce additional features.

This is a quick example skipping all the unrelated parts like settings, urls, and templates:


  • core_project
    • apps
      • player
        • base.py
          from django.db import models

          class PlayerBase(models.Model):
              name = models.CharField(max_length=100)

              class Meta:
                  abstract = True

        • models.py
          from core_project.apps.player.base import PlayerBase

          class Player(PlayerBase):
              pass


  • specific_project

    • apps

      • player

        • models.py
          from django.db import models
          from core_project.apps.player.base import PlayerBase

          class Player(PlayerBase):
              points = models.IntegerField()



The concept works fine until you need to use foreign keys or many-to-many relations in the abstract models. As Josh Smeaton has already noticed, you can't set foreign keys to abstract models, because they have no database tables of their own and they know nothing about the models which will extend them.

Let's say, we have the following situation: GameBase and MissionBase are abstract models and the model extending MissionBase should receive a foreign key to the model extending GameBase.



Thanks to the book Pro Django by Marty Alchin, I understood how models get created in the background. By default, all Python classes are constructed by the type class. But whenever you set the __metaclass__ attribute on your classes, you can define a different constructor. Django models are classes constructed by the ModelBase class, which extends the type class.
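
The mechanism can be tried out without Django. The following is only a minimal sketch in Python 3 syntax (the `metaclass=` keyword instead of the `__metaclass__` attribute used in this post); `Plain`, `Manual`, `Recorder`, `Base`, and `Child` are made-up names for illustration.

```python
# An ordinary class is an instance of the built-in metaclass `type`.
class Plain(object):
    pass


# Calling type(name, bases, attrs) builds an equivalent class by hand.
Manual = type("Manual", (object,), {"greeting": "hello"})


class Recorder(type):
    """Hypothetical metaclass that records the name of every class it builds."""
    created = []

    def __new__(mcs, name, bases, attrs):
        cls = super().__new__(mcs, name, bases, attrs)
        mcs.created.append(name)
        return cls


class Base(metaclass=Recorder):
    pass


class Child(Base):  # subclasses are built by the same metaclass
    pass
```

After this runs, `Recorder.created` holds the names of both classes, which is the same hook that lets a custom ModelBase subclass see every model as it is constructed.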

In order to solve the problem of foreign keys to the models extending the abstract classes, we can have a custom constructor extending the ModelBase class.

base.py
# -*- coding: utf-8 -*-
from django.db import models
from django.db.models.base import ModelBase
from django.db.models.fields import FieldDoesNotExist

class GameMissionCreator(ModelBase):
    """
    The model extending MissionBase should get a foreign key to the model extending GameBase
    """
    GameModel = None
    MissionModel = None

    def __new__(cls, name, bases, attrs):
        model = super(GameMissionCreator, cls).__new__(cls, name, bases, attrs)
        # remember the concrete models as soon as they are constructed
        for b in bases:
            if b.__name__ == "GameBase":
                cls.GameModel = model
            elif b.__name__ == "MissionBase":
                cls.MissionModel = model
        # once both models are known, add the foreign key if it is missing
        if cls.GameModel and cls.MissionModel:
            try:
                cls.MissionModel._meta.get_field("game")
            except FieldDoesNotExist:
                cls.MissionModel.add_to_class(
                    "game",
                    models.ForeignKey(cls.GameModel),
                )
        return model

class GameBase(models.Model):
    __metaclass__ = GameMissionCreator

    title = models.CharField(max_length=100)

    class Meta:
        abstract = True

class MissionBase(models.Model):
    __metaclass__ = GameMissionCreator

    title = models.CharField(max_length=100)

    class Meta:
        abstract = True


models.py
# -*- coding: utf-8 -*-
from base import *

class Game(GameBase):
    pass

class Mission(MissionBase):
    pass


GameMissionCreator is a constructor of GameBase, MissionBase, Game, and Mission classes. When it creates a class extending GameBase, the game model is registered as a property. When it creates a class extending MissionBase, the mission model is registered as a property. When both models are registered, a foreign key is added dynamically from one model to the other.
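
The registration flow described above can be followed in plain Python as well. This is a simplified stand-in, not the Django code itself: `PairLinker` is a hypothetical metaclass, an ordinary class attribute plays the role of the foreign key, and the syntax is Python 3.

```python
class PairLinker(type):
    """Plain-Python stand-in for GameMissionCreator (no Django involved)."""
    game_model = None
    mission_model = None

    def __new__(mcs, name, bases, attrs):
        cls = super().__new__(mcs, name, bases, attrs)
        for base in bases:
            if base.__name__ == "GameBase":
                mcs.game_model = cls
            elif base.__name__ == "MissionBase":
                mcs.mission_model = cls
        # Link only once both concrete classes exist and no link was set yet.
        if mcs.game_model and mcs.mission_model:
            if "game" not in vars(mcs.mission_model):
                mcs.mission_model.game = mcs.game_model
        return cls


class GameBase(metaclass=PairLinker):
    pass


class MissionBase(metaclass=PairLinker):
    pass


class Mission(MissionBase):  # defined first: nothing to link to yet
    pass


linked_before_game = hasattr(Mission, "game")


class Game(GameBase):  # the second half of the pair arrives, the link is made
    pass
```

The order in which the two concrete classes are defined does not matter, because the link is attempted every time the metaclass constructs a class.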

One drawback of this constructor-class example is that if more than one class extends GameBase or MissionBase, the code won't function correctly, because only the most recently created model of each kind is remembered.

Anyway, the example shown illustrates the possible solution and gives a direction for further development of the idea.

2008-12-15

Clarification for Confusion: Creating New Django Fields

Let's say you've just published your first Django-based project. You've learned a lot, you know how to create a new app, views, forms, middleware, and template tags. That's cool, but there is still much room for making your code cleaner and better organized. One such improvement could be joining multiple fields into a single field.

To be clear, there are several types of fields in Django web development and all of them are related to each other. First of all, there are the fields of database tables at the lowest level. Then there are model fields representing the database fields in Python. Form fields sit one level of abstraction higher. Then there are bound fields, which bind the form fields to data from the database (or somewhere else). And finally there are HTML fields, which represent the rendered bound fields and which can be spiced up with JavaScript for creating new widgets.

Now imagine that you have a Contact model which includes two phone numbers, each of which has three parts: the country code, the area code, and the serial number. In the simplest case you would have something like this:

from django.conf import settings
from django.contrib.auth.models import User
from django.db import models
from django.utils.translation import ugettext_lazy as _

COUNTRY_CODE_CHOICES = getattr(settings, "COUNTRY_CODE_CHOICES", (
    ("+44", "United Kingdom (+44)"),
    ("+49", "Germany (+49)"),
    ("+370", "Lithuania (+370)"),
))

class Contact(models.Model):
    user = models.OneToOneField(User)
    website = models.URLField(_("Website"))
    primary_phone_country = models.CharField(
        _("Primary Country Code"),
        max_length=4,
        choices=COUNTRY_CODE_CHOICES,
    )
    primary_phone_area = models.CharField(
        _("Primary Area Code"),
        max_length=4,
    )
    primary_phone_number = models.CharField(
        _("Primary Number"),
        max_length=10,
    )
    secondary_phone_country = models.CharField(
        _("Secondary Country Code"),
        max_length=4,
        choices=COUNTRY_CODE_CHOICES,
        blank=True,
    )
    secondary_phone_area = models.CharField(
        _("Secondary Area Code"),
        max_length=4,
        blank=True,
    )
    secondary_phone_number = models.CharField(
        _("Secondary Number"),
        max_length=10,
        blank=True,
    )


But it seems to be such a mess when it could be something like this:

class Contact(models.Model):
    user = models.OneToOneField(User)
    website = models.URLField(_("Website"))
    primary_phone = PhoneField(_("Primary Phone"))
    secondary_phone = PhoneField(_("Secondary Phone"), blank=True)


So the question now is how to create a model field that would be saved as three separate database fields and would be shown as three bound fields when editing. Let's have a look at what information is available about custom model fields. There is a page in the documentation which might help us here. There are also articles by David Cramer and Michael Elsdörfer which might help to solve this problem. In addition, I found a snippet by Myles Braithwaite which also deals with custom model fields.

Combining the ideas from the given sources with the analysis of the Django core code, I defined the PhoneField in the following way:

# -*- coding: utf-8 -*-
from django.db import models
from django.conf import settings
from django.utils.translation import ugettext_lazy as _

PHONE_COUNTRY_CODE_CHOICES = getattr(settings, "PHONE_COUNTRY_CODE_CHOICES", (
    ("+44", _("United Kingdom (+44)")),
    ("+49", _("Germany (+49)")),
    ("+370", _("Lithuania (+370)")),
))

class PhoneField(models.Field):
    def __init__(self, verbose_name=None, country_code_choices=PHONE_COUNTRY_CODE_CHOICES, *args, **kwargs):
        super(PhoneField, self).__init__(*args, **kwargs)
        self.country_code_choices = country_code_choices
        self.verbose_name = verbose_name

    def contribute_to_class(self, cls, name):
        self.name = name
        if self.verbose_name is None and name:
            self.verbose_name = name.replace('_', ' ')
        # creating three model fields on the fly
        models.CharField(
            _("%s Country Code") % self.verbose_name,
            max_length=4,
            choices=self.country_code_choices,
            blank=self.blank,
        ).contribute_to_class(cls, "%s_country" % name)
        models.CharField(
            _("%s Area Code") % self.verbose_name,
            max_length=4,
            blank=self.blank,
        ).contribute_to_class(cls, "%s_area" % name)
        models.CharField(
            _("%s Number") % self.verbose_name,
            max_length=10,
            blank=self.blank,
        ).contribute_to_class(cls, "%s_number" % name)
        # when accessing the phone field by original model field name,
        # we'll manage tuples of country code, area code, and number
        setattr(cls, self.name, PhoneFieldCreator(self))

class PhoneFieldCreator(object):
    def __init__(self, field):
        self.field = field

    def __get__(self, obj, type=None):
        if obj is None:
            raise AttributeError('Can only be accessed via an instance.')
        country = obj.__dict__.get("%s_country" % self.field.name, None)
        area = obj.__dict__.get("%s_area" % self.field.name, None)
        number = obj.__dict__.get("%s_number" % self.field.name, None)
        return (country and area and number) and (country, area, number) or None

    def __set__(self, obj, value):
        if isinstance(value, tuple) and len(value) == 3:
            setattr(obj, "%s_country" % self.field.name, value[0])
            setattr(obj, "%s_area" % self.field.name, value[1])
            setattr(obj, "%s_number" % self.field.name, value[2])


Note that the instances of the model with the PhoneField can be filtered by any part of the phone number. In addition, the field value can be accessed as a tuple of three parts of the number. For example:

>>> Contact.objects.filter(primary_phone_country="+370")[0].primary_phone
(u'+370', u'628', u'12345')
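
The tuple access relies on Python's descriptor protocol, which can be tried out without Django. The sketch below is a simplified illustration, not the field defined above: `PhoneParts` and this plain `Contact` class are hypothetical, and unlike PhoneFieldCreator the descriptor returns itself instead of raising an error when accessed on the class.

```python
class PhoneParts(object):
    """Descriptor exposing three plain attributes as a single tuple."""
    suffixes = ("country", "area", "number")

    def __init__(self, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        parts = tuple(
            getattr(obj, "%s_%s" % (self.name, suffix), None)
            for suffix in self.suffixes
        )
        # Mirror the field above: an incomplete number reads as None.
        return parts if all(parts) else None

    def __set__(self, obj, value):
        # Anything that is not a 3-tuple is silently ignored, as above.
        if isinstance(value, tuple) and len(value) == 3:
            for suffix, part in zip(self.suffixes, value):
                setattr(obj, "%s_%s" % (self.name, suffix), part)


class Contact(object):  # hypothetical plain class, not a Django model
    primary_phone = PhoneParts("primary_phone")


contact = Contact()
empty = contact.primary_phone
contact.primary_phone = ("+370", "628", "12345")
```

Assigning the tuple fills `primary_phone_country`, `primary_phone_area`, and `primary_phone_number` on the instance, and reading `primary_phone` puts them back together.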


Now the phone number can be presented in different ways in templates depending on the business requirements or local flavor:


{{ obj.primary_phone|join:"" }}
{{ obj.primary_phone|join:" " }}
{{ obj.primary_phone|join:"-" }}
{{ obj.primary_phone.0 }} ({{ obj.primary_phone.1 }}) {{ obj.primary_phone.2 }}
{{ obj.primary_phone_country }} ({{ obj.primary_phone_area }}) {{ obj.primary_phone_number }}

2008-10-21

Django Poor Man's Cron

As I promised in the last blog entry, I am sharing a link to an experimental, conceptual app called Poor Man's Cron, which attempts to exploit various bots to run scheduled tasks.

Criticisms, praises, thanks, and suggestions are very welcome!

2008-10-16

Exploiting Spammers, Search Engines, and Other Bots

Django is a great platform for building web projects in a clean and manageable way. But when you want to find its relationship with poverty for a blog post on Blog Action Day, it becomes a difficult puzzle. But still that's a great challenge!

Django has a large community all over the world, and someone who is good at statistics could write a cool analytical report about djangoers' relations with poverty in the world or the user activity on the Blog Action Day site using the maps and numbers provided in the links.

The Django community and the social groups mentioned above are just small parts of the big network. Looking at the bigger picture of the world, the internet, and the future, you might realize that everybody is connected to somebody and something, all those connections get defined in social networks, and the web is like a living and evolving organism. We are kind of heading towards a machine that we can use and that can use us (see The Matrix).

We can look at a newly created Django-based site as a small node in the web. The site together with its visitors forms a network. Different APIs and inter-site communication extend the network even more. One part of this network is various bots, e.g. search engine indexers and comment spambots. Thanks to the error-reporting mechanism in Django, I recently found out that you can find bugs in large-scale projects not only by manual browsing or test cases. Bugs might also be detected by visits of spammers and search engines, because broken pages and unexpected usage are reported by email to the administrator of the site. Moreover, I started brainstorming about how to exploit spammers instead of fighting against them.

And then I remembered the Poor Man's Cron module for Drupal. The module is an alternative for projects hosted where cron jobs are not supported. It runs scheduled tasks at approximately regular intervals, triggered by page views. Executing scheduled tasks during page views might be annoying for people if that takes much time. But I wouldn't mind wasting a few seconds of a machine's time. On the one hand, this idea supports the poverty theme of Blog Action Day, because this kind of module is dedicated to those who can't afford a server with cron jobs. On the other hand, we can punish spammers by using their time for our needs. When the tasks are quite time consuming and take 5 or more minutes to execute, let's use the time of comment spammers. Spammers might be recognized by captchas, Akismet filters, and similar technologies. When more precise intervals between executions are necessary, let's exploit the calls of feed subscribers and search indexers. Those can be recognized by request.META['HTTP_USER_AGENT'].
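
To make the idea more concrete, here is a rough sketch in plain Python. It is not the promised app: `PoorMansCron`, `is_bot`, and the `BOT_MARKERS` list are made-up names, the user-agent markers are only illustrative, and in a Django view the string would come from request.META.get('HTTP_USER_AGENT').

```python
import time

BOT_MARKERS = ("googlebot", "slurp", "feedfetcher")  # illustrative list only


def is_bot(user_agent):
    """Rough user-agent check; real bots may not identify themselves."""
    agent = (user_agent or "").lower()
    return any(marker in agent for marker in BOT_MARKERS)


class PoorMansCron(object):
    """Runs a task at most once per interval, triggered by bot requests."""

    def __init__(self, interval_seconds, task, clock=time.time):
        self.interval = interval_seconds
        self.task = task
        self.clock = clock
        self.last_run = None

    def handle_request(self, user_agent):
        """Run the task if it is due and the visitor looks like a bot."""
        now = self.clock()
        due = self.last_run is None or now - self.last_run >= self.interval
        if due and is_bot(user_agent):
            self.last_run = now
            self.task()
            return True
        return False


# Usage with a fake clock so that the example is deterministic.
runs = []
fake_now = [0.0]
cron = PoorMansCron(60, lambda: runs.append(fake_now[0]), clock=lambda: fake_now[0])
results = [cron.handle_request("Mozilla/5.0 (human visitor)")]  # human, skipped
results.append(cron.handle_request("Googlebot/2.1"))            # bot, task runs
fake_now[0] = 30.0
results.append(cron.handle_request("Googlebot/2.1"))            # too early
fake_now[0] = 61.0
results.append(cron.handle_request("Feedfetcher-Google"))       # due again
```

Human visitors never pay for the task here; whether the intervals are regular enough depends entirely on how often bots visit the site.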

I shall point you to the experimental conceptual code of this idea soon.



Similar projects:
http://code.google.com/p/django-cron/
http://www.djangosnippets.org/snippets/1126/

How would you suggest exploiting spambots? What drawbacks do you see in the suggested approach?



P.S. To those who reside in Berlin! Today at 19:00 there will be Djangoers' meetup at newthinking store, Tucholskystraße 48, 10117 Berlin.

2008-09-13

Gotchas about Fixtures for Initial Data

Fixtures are one part of the Django testing environment. They are prepared datasets that are loaded into the database before running tests. Django uses a separate database for running tests, but the fixtures from the files initial_data.* are also loaded into the main database when you synchronize it.

I make fixtures for my application like this:
python manage.py dumpdata --format=json --indent=4 myapp > apps/myapp/fixtures/initial_data.json 


The initial data is read from apps/myapp/fixtures/initial_data.json and written to the main or the test database when I synchronize the database:
python manage.py syncdb

or when I run the tests:
python manage.py test myapp


Fixtures are great for deploying standard data like predefined categories, lists of countries and languages, default flatpages, default navigation, default user groups, and so on. However, you should be very cautious with them while developing.

When I create new models, it's common practice for me to sync the database so that the new tables are created and the database schema reflects the models. Right after the new tables are created, all fixtures called initial_data from all applications are loaded. N.B. While synchronizing the database, the fixtures from initial_data overwrite existing records that have the same primary keys. So if you have some important data that differs from the defaults, better make a database backup before syncing, or use the sql command to print out the SQL statements and execute them against the database manually:
python manage.py sql myapp_with_new_models


You might have pre_save signal handlers or custom save methods (see the example below) which should recognize newly created objects and do something special with them, e.g. prepare PDF reports, generate images, send emails, index for global text search, or something else. In such cases I used to check for the existence of the primary key: the object is new if it has no primary key. But this is wrong when you use fixtures, because fixtures come with primary keys. N.B. The object is new only if there is no object in the database whose primary key equals the primary key of the current object.

class MyModel(models.Model):
    # ...
    def save(self, *args, **kwargs):
        is_new = True
        pk = self._get_pk_val()
        model = type(self)
        if pk and model._default_manager.filter(pk=pk):
            is_new = False
        # something before saving
        # name the class explicitly: super(type(self), self) would
        # recurse endlessly as soon as MyModel is subclassed
        super(MyModel, self).save(*args, **kwargs)
        # something after saving


or, more simply:

class MyModel(models.Model):
    # ...
    def save(self, *args, **kwargs):
        is_new = True
        if self.id and MyModel.objects.filter(id=self.id):
            is_new = False
        # something before saving
        super(MyModel, self).save(*args, **kwargs)
        # something after saving


Another alternative for storing default data would be custom sql located at apps/myapp/sql/mymodel.sql, but I haven't tried that yet and probably won't.

BTW, happy programmer day!

2008-09-03

A Note on Python Paths

This time I decided to share some knowledge about Python paths, which seemed a little confusing to me when I first dived into Python. I work with Django on different platforms like Mac OS X, Windows, and Linux, so I need to know the common ways of making new Python modules available in all of those environments.

Python modules are either *.py files or directories containing __init__.py. When defining paths to Python modules, you will usually need to deal with the latter ones. A module is said to be under the Python path if you can run python and import that module.

For example, if you can run the following, then django is under your python path.
python
>>> import django
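
The same check can be done without actually importing the module. This is a small sketch that assumes Python 3, where importlib.util.find_spec is available (it does not exist in the Python 2 used elsewhere in this post); `is_importable` is a made-up helper name and it is meant for top-level module names only.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the Python path."""
    return importlib.util.find_spec(module_name) is not None


print(is_importable("os"))
print(is_importable("surely_not_an_installed_module_42"))
```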


Stay tuned to get deeper into python paths.

Installing modules



If a module is installable, usually all you need to do is to extract its setup directory, cd to it, and run
python setup.py install

This will copy the module into the site-packages directory of the current Python installation. You might have multiple Python versions on your computer. According to the Django documentation, you can find the site-packages directory currently in use with
python -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())"
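
On recent Python 3 versions the distutils package is no longer part of the standard library (it was removed in Python 3.12), so the one-liner above fails there. A possible equivalent, as a sketch using the standard sysconfig module:

```python
import sysconfig

# "purelib" is the directory where pure-Python packages get installed,
# i.e. the site-packages directory of the running interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```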

Or you can use PEAK EasyInstall for installing python modules even faster.

But sometimes you will need the latest and greatest versions of your modules directly from a version control system. To make them accessible from Python, you should either check them out directly into site-packages (very messy and inflexible) or keep them somewhere else and do some additional magic.

Sym-linking



You can create symbolic links (symlinks) in unix-based systems like Linux or Mac OS X. A symlink is like a shortcut to a file or directory. If you create a symlink in site-packages which points to a python module which is located somewhere else, it will work as if the module was copied into site-packages.

To create a symlink, type the following in a console/terminal:
ln -s <source> <target>

For example, if you want python to access django which is under /Library/Subversion/django_src/trunk/django, you need to write something like this (considering that your site-packages are at /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/)
ln -s /Library/Subversion/django_src/trunk/django /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/django

To delete the symlink, simply remove it (this won't delete the original module):
rm /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/django
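
The effect of such a symlink can be reproduced safely in temporary directories. The sketch below assumes a unix-like system and Python 3; the directory names and the `demo_linked_pkg` package are invented for the illustration, and the temporary "site-packages" directory is put on sys.path by hand.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
source_dir = os.path.join(root, "checkout")          # stands in for the SVN checkout
packages_dir = os.path.join(root, "site-packages")   # stands in for site-packages
os.makedirs(os.path.join(source_dir, "demo_linked_pkg"))
os.makedirs(packages_dir)

with open(os.path.join(source_dir, "demo_linked_pkg", "__init__.py"), "w") as f:
    f.write("ORIGIN = 'checkout'\n")

# Equivalent of: ln -s <source> <target>
link = os.path.join(packages_dir, "demo_linked_pkg")
os.symlink(os.path.join(source_dir, "demo_linked_pkg"), link)

# The package is importable through the link as if it had been copied.
sys.path.insert(0, packages_dir)
importlib.invalidate_caches()
demo_linked_pkg = importlib.import_module("demo_linked_pkg")

# Equivalent of: rm <target>; the original directory stays untouched.
os.remove(link)
original_still_there = os.path.isdir(os.path.join(source_dir, "demo_linked_pkg"))
```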


But as I've already mentioned, this works only in unix-based environments and you can't use shortcuts in Windows for the same purpose.

*.pth files



Python supports *.pth files which contain the paths to the parent directories of your Python modules, one per line. Those files should be placed in site-packages and can be named whatever you like. For example, you can create a file my-project.pth and write
/Library/Subversion/django_src/trunk
/Library/Subversion/myproject_src/trunk

or
C:\Subversion\django_src\trunk
C:\Subversion\myproject\trunk

into it. Then django and your project files will be importable in python.
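
If you want to see the mechanism at work without touching the real site-packages, the standard site module can process *.pth files from any directory. The following is a sketch under that assumption; the temporary directories and the `demo_pth_module` module are made up for the example.

```python
import importlib
import os
import site
import tempfile

root = tempfile.mkdtemp()
code_dir = os.path.join(root, "myproject_src")        # parent directory of the module
fake_site = os.path.join(root, "fake-site-packages")  # stands in for site-packages
os.makedirs(code_dir)
os.makedirs(fake_site)

with open(os.path.join(code_dir, "demo_pth_module.py"), "w") as f:
    f.write("VALUE = 42\n")

# One path per line, pointing at the PARENT directory of the module.
with open(os.path.join(fake_site, "my-project.pth"), "w") as f:
    f.write(code_dir + "\n")

# addsitedir() adds the directory to sys.path and processes its *.pth files.
site.addsitedir(fake_site)
importlib.invalidate_caches()
demo_pth_module = importlib.import_module("demo_pth_module")
```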

However, you might have no permissions to create files under site-packages or you might need to activate different locations of python modules for different projects.

PYTHONPATH variable



The other way is to set additional paths for python just before running the externally kept modules. This is done by setting the python paths to the environment variable PYTHONPATH. Note again that python paths point not to the modules themselves, but to their parent directories!

The syntax slightly differs among different platforms.

Linux and Mac OS X:
# checking value
echo $PYTHONPATH
# setting value
export PYTHONPATH="/Library/Subversion/django_src/trunk"
# appending to the existing value
export PYTHONPATH="$PYTHONPATH:/Library/Subversion/django_src/trunk"

Windows:
rem checking value
echo %PYTHONPATH%
rem setting value (no quotes, they would become part of the value)
set PYTHONPATH=C:\Subversion\django_src\trunk
rem appending to the existing value
set PYTHONPATH=%PYTHONPATH%;C:\Subversion\django_src\trunk


Multiple paths are separated by a colon (":") on Linux and Mac OS X and by a semicolon (";") on Windows.
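
When a script has to build the value itself, the separator does not need to be hard-coded, because Python exposes it as os.pathsep. A small sketch (Python 3; the two temporary directories merely stand in for real checkout locations) that starts a child interpreter with a custom PYTHONPATH:

```python
import os
import subprocess
import sys
import tempfile

first = tempfile.mkdtemp()
second = tempfile.mkdtemp()

# os.pathsep is ":" on Linux and Mac OS X and ";" on Windows.
env = dict(os.environ, PYTHONPATH=os.pathsep.join([first, second]))

# Ask a child interpreter whether both entries reached its sys.path.
child_code = "import sys; print(sys.argv[1] in sys.path and sys.argv[2] in sys.path)"
output = subprocess.check_output(
    [sys.executable, "-c", child_code, first, second], env=env
)
both_present = output.decode().strip() == "True"
```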

PYTHONPATH can be used in scripts and web server configuration files, but it is not very convenient in daily use.

Adding paths to sys.path



For projects that you develop and that should run as standalone applications, you can set the required Python paths inside your Python code, relative to the project files.

Note that all python paths which you set in the PYTHONPATH variable or *.pth files as well as the path of default python libraries and the path of site-packages get listed in python variable sys.path. When you import a module, it is loaded from the first location which contains the required module. So if you have two paths to different django versions in your python paths and you import django, the django version from the first location will be used.

You can read the list of loaded python paths like this:
>>> import sys
>>> sys.path


You can also freely modify it, for example:
>>> import sys
>>> sys.path.append("/Library/Subversion/django_src/trunk")
>>> import django
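
The rule that the first matching location wins is easy to verify. In this sketch (Python 3) two temporary directories each contain a module with the same invented name, `demo_shadowed`, and only the order in sys.path decides which one gets imported.

```python
import importlib
import os
import sys
import tempfile

dir_a = tempfile.mkdtemp()
dir_b = tempfile.mkdtemp()
for directory, label in ((dir_a, "A"), (dir_b, "B")):
    with open(os.path.join(directory, "demo_shadowed.py"), "w") as f:
        f.write("SOURCE = %r\n" % label)

# dir_a comes before dir_b, so its copy of the module is the one imported.
sys.path[:0] = [dir_a, dir_b]
importlib.invalidate_caches()
demo_shadowed = importlib.import_module("demo_shadowed")
print(demo_shadowed.SOURCE)
```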


And here is an example of how to get and use paths relative to the currently loaded file:
import os, sys

SVN_PATH = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
DJANGO_PATH = os.path.join(SVN_PATH, "django_src", "trunk")
PROJECT_PATH = os.path.join(SVN_PATH, "myproject", "trunk")

sys.path += [DJANGO_PATH, PROJECT_PATH]



I hope this introduction was useful for developers and made the big picture about the paths clearer.

Some more related information can be found at the official python documentation.

2008-08-08

MultipleSubmitButton Widget for ChoiceField

Recently I published a snippet with a widget rendering a choice field as a series of submit buttons.

So the {{ form.do }} field from the following form:

SUBMIT_CHOICES = (
    ('save', _("Save")),
    ('save-add', _("Save and Add Another")),
)

class TestForm(forms.Form):
    do = forms.ChoiceField(
        widget=MultipleSubmitButton,
        choices=SUBMIT_CHOICES,
    )



will be rendered as:

<ul>
<li><button type="submit" name="do" value="save">Save</button></li>
<li><button type="submit" name="do" value="save-add">Save and Add Another</button></li>
</ul>
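
The rendering step itself boils down to a loop over the choices. The function below is only a plain-Python sketch of that loop (Python 3), not the code of the published snippet; `render_submit_buttons` is a made-up name and the labels are plain strings instead of translated ones.

```python
from html import escape


def render_submit_buttons(name, choices):
    """Render (value, label) pairs as a list of submit buttons."""
    items = [
        '<li><button type="submit" name="%s" value="%s">%s</button></li>'
        % (escape(name), escape(value), escape(label))
        for value, label in choices
    ]
    return "<ul>\n%s\n</ul>" % "\n".join(items)


SUBMIT_CHOICES = (
    ("save", "Save"),
    ("save-add", "Save and Add Another"),
)
html = render_submit_buttons("do", SUBMIT_CHOICES)
print(html)
```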


Can somebody enhance this widget so that it supports iterating through the choices and getting specific buttons by index in the template? My attempts failed, but maybe you will succeed!

2008-07-16

EuroPython 2008

I started writing this post at the airport just before flying back from Vilnius to Berlin. EuroPython 2008, the three-day conference with sprints afterwards, gave me loads of information, new contacts, and a better understanding of the big picture of the Python developers' community.

EuroPython

Some highlights from the conference follow.
  • During a video conference, Guido van Rossum, the author of the Python programming language, encouraged using Python 2.6 right after its release, but said that Python 3.0 still shouldn't be used in production. Good future-proof practices are inheriting from object for classes, using the xrange(), sorted(), and zip() functions, "//" for floor division, and dict.iterkeys(). A full guide to porting to 3.0 will probably be written and published on the python.org site.

  • Clone Digger is a project from Google Summer of Code 2008 for finding similar pieces of code which could be optimized by moving them into parameterized functions.

  • Restrictive Python is a project which blocks the user from executing dangerous functions or methods or accessing files which might harm the system. You can try Restrictive Python out in the simulated terminal.

  • Python is an interpreted language, so if you want to deploy a project written in Python, you have to be sure that the end user has a Python interpreter installed. Or you can create an executable file from a Python project using one of the following tools: Py2exe, PyInstaller, or BBFreeze.

  • There is a combined installer of Python, SQLite, and Django for Windows and it is called Instant Django.

  • Django DB External Scheme Evolution Branch is a separate application for doing DB scheme changes easily without manual DB field editing.

  • Honza Král showed the essentials of the new-forms admin, to which I am going to port our Django projects in the coming days. Directly from him I learned that new forms will still have validators which only check the value of one specific field (unrelated to other fields). If validation across multiple fields is necessary, the clean method of the form should be overridden. All the global, field-unrelated form errors will be passed to the {{ form.non_field_errors }} variable.

  • Christian Scholtz presented PYOGP, a Python-based environment for testing the Open Grid Protocol used for exchanging avatars and other objects among virtual realms like different regions of Second Life.

  • Ricardo Quesada and Lucio Torre told us about cocos2d - a framework made in Python for presentations and sprite-based 2D games.

  • If you ever need to export some data from a database to Excel Sheet or import from Excel, you can do that not only in CSV (Comma Separated Values) format. You can also use the original XLS format with one of the following libraries: xlrd, xlwt, or pyExcelerator. I successfully used the latter one last week.

  • CVS and SVN are not the only version control systems in the world :D. Others like Mercurial, Git, and Bazaar are also worth checking out. These are all distributed systems, which means that they don't have one centralized source. Bazaar is written in Python and has an API for manipulating information about revisions, committers, and other version-control-related stuff.
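
Some of the future-proof practices from the first highlight can be shown in a few lines. This sketch covers only the subset that runs unchanged on both Python 2.6 and Python 3; xrange() and dict.iterkeys() are left out because they do not exist in Python 3. The `Point` class is just an example name.

```python
class Point(object):  # new-style class: inherit from object
    def __init__(self, x, y):
        self.x, self.y = x, y


half = 7 // 2                    # explicit floor division
ordered = sorted([3, 1, 2])      # sorted() returns a new list
pairs = list(zip("ab", [1, 2]))  # wrap zip() in list() when a list is needed
```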



2008-06-02

Django Meetup Postmortem

Thanks to the organizer Jannis, last Friday German djangoers met successfully, got some beer at Schleusenkrug in Tiergarten, Berlin, and became better acquainted with each other. We discussed the latest topics like licenses, project deployment, business models, the future of Django, who among the gods of Django said what in which blog, and other geeky stuff.

Some djangoers on the left side of the table

Some djangoers on the right side of the table

If you want to get in touch with German djangoers, you can join the #django-de IRC channel on irc.freenode.net. Also you can subscribe to their blogs at django-de.org.

2008-05-27

Django Meetup at Linuxtag

Pssst. Don't tell anybody! There is an unofficial Django user meetup happening in Berlin this Friday at five.

datetime.datetime(2008, 5, 27, 17, 0)


The gathering point is at the main entrance of Linuxtag conference (Messe Berlin).