2009-05-05

Applying Patches in Guerrilla Way

How many times have you submitted patches that were never approved? How many times have you needed to patch Django for some third-party application to work? How many times have you hit a wall on a real-world project just because a bugfix was waiting for tests or a nicer solution? In all those cases, you should neither panic nor run in circles, because you can still apply the existing patches in a guerrilla way (so-called monkey patching).

Guerrilla Advertising might also be powerful, but it's always controversial

Guerrilla patches overwrite existing code not on disk, but in memory at execution time. Everything in Python is an object: numbers, strings, functions, methods, classes, instances, modules, and so on. Each object can be thought of as an attribute of the module where it is defined, and the properties and methods of a class can be thought of as attributes of that class. The principle of guerrilla patching is to load the module or class whose attributes need to be changed and to assign new values to those attributes before the module or class is used. All of that has to happen somewhere at the beginning of execution. Usually, I put or import guerrilla patches in models.py of some app, because all models are loaded in one of the first steps of execution.

So if you need to change a full class (DjangoClass), you would import the module (djangomodule) where the class resides, create another class (NewDjangoClass), and assign it to the attribute of the module under the name of the original class (djangomodule.DjangoClass = NewDjangoClass). If you need to change just one method (django_method) of a class (DjangoClass), you would import the class, write a new function which takes the object as the first argument (def new_django_method(self, *args, **kwargs)), and assign it to the class under the name of the original function (DjangoClass.django_method = new_django_method).
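Both variants can be tried out without Django at all. The following is only a minimal sketch: djangomodule is built in memory as a stand-in for a real third-party module, and the class and method names are the placeholders used above, not real Django names.

```python
import types

# A stand-in for a third-party module (hypothetical; created in memory
# so that the example runs without Django installed).
djangomodule = types.ModuleType("djangomodule")


class DjangoClass(object):
    def django_method(self, name):
        return "original %s" % name


djangomodule.DjangoClass = DjangoClass


# 1. Patch a single method: the replacement takes `self` as the first argument.
def new_django_method(self, name):
    return "patched %s" % name


djangomodule.DjangoClass.django_method = new_django_method


# 2. Patch a whole class: assign the new class under the original name.
class NewDjangoClass(object):
    def django_method(self, name):
        return "replaced %s" % name


old_style_instance = djangomodule.DjangoClass()  # created before the class swap
djangomodule.DjangoClass = NewDjangoClass
new_style_instance = djangomodule.DjangoClass()

method_patch_result = old_style_instance.django_method("x")
class_patch_result = new_style_instance.django_method("x")
```

Note that code which has already executed "from djangomodule import DjangoClass" keeps its reference to the old class after the swap. That is one more reason to apply such patches as early as possible.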

Let's have a look at an example. You might need nested fieldsets for more organized grouping, suggested in the ticket #10590. That allows you to create fieldset definitions as items of the 'fields' value of a parent fieldset.
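To make the idea concrete, a nested definition might look roughly like the sketch below. The fieldset titles and field names are invented for illustration; only the shape (a two-item tuple whose second item is a dict, placed inside 'fields') follows the description above.

```python
# Hypothetical admin fieldsets with one nested fieldset ("Address").
fieldsets = [
    ("Contact", {
        "fields": [
            "name",
            # a nested fieldset: (title, options dict) inside the parent's fields
            ("Address", {
                "fields": ["street", ("postal_code", "city")],
            }),
        ],
    }),
]

nested = fieldsets[0][1]["fields"][1]
# The check that distinguishes a nested fieldset from a tuple of field names.
is_nested_fieldset = len(nested) == 2 and isinstance(nested[1], dict)
```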

As you might see from the applied patch nested_fieldsets.diff, the main change of the patch is changing most of the code for classes Fieldset and Fieldline residing in django/contrib/admin/helpers.py, also changing the function flatten_fieldsets in django/contrib/admin/util.py, introducing a new template tag and finally modifying a template.

The file with guerrilla patches should import helpers, util, and all other modules that will be used in the overwritten objects. Browse through the code below and check the differences with the original patch to understand the concept better.


from django import forms
from django.contrib.admin import helpers
from django.contrib.admin import util
from django.contrib.admin import options
from django.utils.safestring import mark_safe
from django.forms.formsets import all_valid
from django.conf import settings

### Guerrilla patches for nested fieldsets

def flatten_fieldsets(fieldsets):
    """Returns a list of field names from an admin fieldsets structure."""
    field_names = []
    for name, opts in fieldsets:
        for field in opts['fields']:
            if isinstance(field, (tuple, list)):
                if len(field)==2 and isinstance(field[1], dict):
                    # it's a nested fieldset
                    field_names.extend(flatten_fieldsets((field,)))
                else:
                    # it's a tuple of field names
                    field_names.extend(field)
            else:
                # it's a field name
                field_names.append(field)
    return field_names

options.flatten_fieldsets = util.flatten_fieldsets = flatten_fieldsets

class Fieldset(object):
    is_fieldset = True
    def __init__(self, form, name=None, fields=(), classes=(), description=None, level=0):
        self.form = form
        self.name, self.fields = name, fields
        self.classes = u' '.join(classes)
        self.description = description
        self.level = level

    def _media(self):
        if 'collapse' in self.classes:
            return forms.Media(js=['%sjs/admin/CollapsedFieldsets.js' % settings.ADMIN_MEDIA_PREFIX])
        return forms.Media()
    media = property(_media)

    def __iter__(self):
        for field in self.fields:
            if (len(field)==2 and isinstance(field[1], dict)):
                # nested fieldset
                yield Fieldset(self.form,
                    name=field[0],
                    fields=field[1].get("fields", ()),
                    classes=field[1].get("classes", ()),
                    description=field[1].get("description", ()),
                    level=self.level + 1,
                    )
            else:
                # field name or a tuple of field names
                yield helpers.Fieldline(self.form, field)

helpers.Fieldset = Fieldset

class InlineFieldset(Fieldset):
    def __init__(self, formset, *args, **kwargs):
        self.formset = formset
        super(InlineFieldset, self).__init__(*args, **kwargs)

    def __iter__(self):
        fk = getattr(self.formset, "fk", None)
        for field in self.fields:
            if fk and fk.name == field:
                continue
            if (len(field)==2 and isinstance(field[1], dict)):
                # nested fieldset
                yield Fieldset(self.form,
                    name=field[0],
                    fields=field[1].get("fields", ()),
                    classes=field[1].get("classes", ()),
                    description=field[1].get("description", ()),
                    level=self.level + 1,
                    )
            else:
                # field name or a tuple of field names
                yield helpers.Fieldline(self.form, field)

helpers.InlineFieldset = InlineFieldset



Further, the template can be overridden by putting your own template path before the Django admin path in settings.TEMPLATE_DIRS. The missing template tag can be defined in your own app. Just don't forget to load the library in the template ({% load mytemplatetags %}). CSS can be overridden by including a custom CSS file at the bottom of the HEAD section in a modified admin/change_form.html template.

That's all you need, and you'll be in control of the situation without branching Django. You have full power, but don't overuse this technique, as it might be difficult to manage the updates, especially when you are guerrilla-patching undocumented features or helper functions which may change backwards-incompatibly in the future. Still, guerrilla-patched code in a third-party app is better than a requirement to patch Django itself.

Guerrilla patching can also be used in a similar way to make slight modifications to other third-party apps which are not extensible by design. For example, you can add fields to or modify methods of third-party apps, which will still be updatable separately.

Tomorrow (Wednesday) evening I am leaving the wonderful Prague and EuroDjangoCon. If you are here as well and want to discuss some concepts which I introduced in this blog, that's the last chance to meet me live. :D

2009-04-12

Design Pattern for Reusable Apps: Getting into Details

This article is an attempt to explain a technique for creating reusable and overridable apps that was partly mentioned in my last post.

First of all, let's remember the presentation by James Bennett about reusable apps from DjangoCon 2008:



Also the upgraded version Real World Django by Jacob Kaplan-Moss presented at PyCon 2009 should be checked.

Imagine that you have a bunch of predefined apps somewhere under your Python path. Some of them have models with relations to models from other apps. You will surely want to reuse those apps for different projects and to activate just the necessary set of apps. Sometimes you'll probably want to use an extended version of an existing app. As shown in the presentations mentioned above, reusable apps should allow easy overrides. Although all apps should be either under the Python path directly or under a package, nothing should be set in stone. The method get_model() should be used to import models from installed apps. But apps consist of more components than just models. There are urls, views, forms, template tags, context processors, middlewares, etc.

Let's have a look at a simple example. Two reusable apps could be events and venues where events would contain an Event model with a foreign key to Venue model from the venues app. In order to make both models overridable, I'll put the definitions in abstract base classes in the files base.py. The files models.py will import from base.py and create the leaf classes which will be the models to use further in forms and elsewhere.


  • venues

    • __init__.py

    • base.py
      from django.db import models

      class VenueBase(models.Model):
          title = models.CharField(...)
          street_address = models.CharField(...)
          postal_code = models.CharField(...)
          city = models.CharField(...)
          country = models.CharField(...)

          class Meta:
              abstract = True


    • models.py
      from venues.base import *

      class Venue(VenueBase):
          pass





  • events

    • __init__.py

    • base.py
      from django.db import models

      Venue = models.get_model("venues", "Venue")

      class EventBase(models.Model):
          title = models.CharField(...)
          venue = models.ForeignKey(Venue)
          from_date = models.DateField(...)
          to_date = models.DateField(...)

          class Meta:
              abstract = True



    • models.py
      from events.base import *

      class Event(EventBase):
          pass






Venues (as well as events) will probably have some urls, views, and forms which should be extensible too.


  • venues

    • ...

    • forms.py
      from django.db import models
      from django.forms import ModelForm
      Venue = models.get_model("venues", "Venue")

      class VenueForm(ModelForm):
          class Meta:
              model = Venue



    • views.py

      from django.db import models
      from django.utils import importlib
      from django.shortcuts import render_to_response

      # importing the form depending on the path of installed app
      venues_app = models.get_app("venues")
      venues_forms = importlib.import_module(venues_app.__name__[:-6] + "forms")
      VenueForm = venues_forms.VenueForm

      def add_venue(request):
          if request.method == "POST":
              form = VenueForm(request.POST, request.FILES)
              if form.is_valid():
                  form.save()
          else:
              form = VenueForm()
          return render_to_response("venues/add_venue.html", {
              "form": form,
          })

      def change_venue(request, pk):
          ...



    • urls.py

      from django.conf.urls.defaults import *
      from django.db import models
      from django.utils import importlib

      # importing the views depending on the path of installed app
      venues_app = models.get_app("venues")
      venues_views = importlib.import_module(venues_app.__name__[:-6] + "views")

      urlpatterns = patterns('',
          ...
          url(r'^add/$', venues_views.add_venue, name='add_venue'),
          url(r'^(\d+)/$', venues_views.change_venue, name='change_venue'),
          ...
      )






You might ask why such a strange and tricky importing is done in the views and urls. If we know the path to the app, it is not going to change or something. Even James Bennett himself didn't see any advantages of importing forms and views dynamically when closing ticket #10703 for Django:
I don't really see the utility of this – a properly-written Django application is just a Python module, and is importable the same as any other Python module. It's not like that module is suddenly going to have a different path (and if it does, you're doing something wrong).


The answer hides in the overriding apps.

The combo of the current events and venues apps might fit most of your projects. But then you might need a specific venues app with additional features for a specific project. Note that you probably don't want to change or recreate the events app just because of its relation to the venues. The app and model names can stay the same: just put the new app under your new project and include it in INSTALLED_APPS instead of the original app. The project name will serve as a namespace for distinguishing the original app from the overridden app.
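The path trick used in views.py and urls.py boils down to "take the dotted name of the installed models module and import a sibling from the same package". Here is a rough sketch of just that step. The helper import_sibling is a made-up name, and the standard-library json package stands in for an app, so the example runs without Django. In a project, the dotted name would come from models.get_app("venues").__name__ instead.

```python
import importlib


def import_sibling(models_module_name, sibling):
    """Import `sibling` from the package that contains the given module.

    Hypothetical helper; assumes the module lives inside a package,
    e.g. "myproject.venues.models" -> "myproject.venues.forms".
    """
    package, _, _ = models_module_name.rpartition(".")
    return importlib.import_module("%s.%s" % (package, sibling))


# "json.encoder" plays the role of "<package>.models" here.
decoder_module = import_sibling("json.encoder", "decoder")
sibling_name = decoder_module.__name__
```

Whichever venues app is listed in INSTALLED_APPS decides the package, so the same views and urls code picks up the original or the overriding forms and views.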

If you want to extend just the model, but not the other components, your specific app might look like this:


  • myproject

    • venues

      • __init__.py

      • models.py
        from django.db import models
        from venues.base import VenueBase

        class Venue(VenueBase):
            description = models.TextField(...)



      • forms.py
        from venues.forms import *



      • views.py
        from venues.views import *



      • urls.py
        from venues.urls import *







If the model is OK as is, but you need an additional behavior for the form, like saving an image for each venue, you can extend the form instead:


  • myproject

    • venues

      • __init__.py

      • models.py
        from venues.models import *



      • forms.py
        from django import forms
        from venues.forms import VenueForm as VenueFormBase

        class VenueForm(VenueFormBase):
            image = forms.ImageField(...)
            def save(self, *args, **kwargs):
                super(VenueForm, self).save(*args, **kwargs)
                ...



      • views.py
        from venues.views import *



      • urls.py
        from venues.urls import *







If models and forms are alright, but you want to add an additional view, you can do that too:


  • myproject

    • venues

      • __init__.py

      • models.py
        from venues.models import *



      • forms.py
        from venues.forms import *



      • views.py
        from venues.views import *

        def delete_venue(request, pk):
            ...



      • urls.py
        from venues.urls import *

        urlpatterns += patterns('',
            url(r'^(\d+)/delete/$', venues_views.delete_venue, name='delete_venue'),
        )







As you see from the case examples above, you don't have to duplicate any code of the original app. Still you can now enhance or overwrite just specific parts that you need very flexibly. The events could be extended in the same manner. If the original extensible apps live under a specific package, then the imports in the extending app should be changed appropriately.

Ufff.. That's about all I wanted to explain this time. If you have any questions or suggestions, or disagree with some concepts, please write a comment below.

If you live in Berlin or happen to be here next Wednesday, don't miss the Django meetup at Schleusenkrug at 19:30.

Also have a nice Easter Holiday!

Django Easter Egg

2009-02-19

Abstract Models and Dynamically Assigned Foreign Keys

Model inheritance enables creating extensible apps. You can define a reusable core app which includes base.py with abstract models and models.py with models extending the abstract ones and inheriting all their features. In a specific project you can either use the core app directly, or create a specific app whose models extend the abstract base models of the core app and introduce additional features.

This is a quick example skipping all the unrelated parts like settings, urls, and templates:


  • core_project
    • apps
      • player
        • base.py
          from django.db import models
          class PlayerBase(models.Model):
              name = models.CharField(max_length=100)
              class Meta:
                  abstract = True

        • models.py
          from core_project.apps.player.base import PlayerBase
          class Player(PlayerBase):
              pass


  • specific_project

    • apps

      • player

        • models.py
          from django.db import models
          from core_project.apps.player.base import PlayerBase
          class Player(PlayerBase):
              points = models.IntegerField()



The concept works fine until you need to use foreign keys or many-to-many relations in the abstract models. As Josh Smeaton has already noticed, you can't set foreign keys to abstract models, as they have no database tables of their own and know nothing about the models which will extend them.

Let's say, we have the following situation: GameBase and MissionBase are abstract models and the model extending MissionBase should receive a foreign key to the model extending GameBase.



Thanks to Pro Django book by Marty Alchin, I understood how the models get created in the background. By default, all python classes are constructed by the type class. But whenever you use __metaclass__ property for your classes, you can define a different constructor. Django models are classes constructed by ModelBase class which extends the type class.

In order to solve the problem of foreign keys to the models extending the abstract classes, we can have a custom constructor extending the ModelBase class.

base.py
# -*- coding: utf-8 -*-
from django.db import models
from django.db.models.base import ModelBase
from django.db.models.fields import FieldDoesNotExist

class GameMissionCreator(ModelBase):
    """
    The model extending MissionBase should get a foreign key to the model extending GameBase
    """
    GameModel = None
    MissionModel = None
    def __new__(cls, name, bases, attrs):
        model = super(GameMissionCreator, cls).__new__(cls, name, bases, attrs)
        for b in bases:
            if b.__name__=="GameBase":
                cls.GameModel = model
            elif b.__name__=="MissionBase":
                cls.MissionModel = model
        if cls.GameModel and cls.MissionModel:
            try:
                cls.MissionModel._meta.get_field("game")
            except FieldDoesNotExist:
                cls.MissionModel.add_to_class(
                    "game",
                    models.ForeignKey(cls.GameModel),
                )
        return model

class GameBase(models.Model):
    __metaclass__ = GameMissionCreator

    title = models.CharField(max_length=100)
    class Meta:
        abstract = True

class MissionBase(models.Model):
    __metaclass__ = GameMissionCreator

    title = models.CharField(max_length=100)
    class Meta:
        abstract = True


models.py
# -*- coding: utf-8 -*-
from base import *

class Game(GameBase):
    pass

class Mission(MissionBase):
    pass


GameMissionCreator is a constructor of GameBase, MissionBase, Game, and Mission classes. When it creates a class extending GameBase, the game model is registered as a property. When it creates a class extending MissionBase, the mission model is registered as a property. When both models are registered, a foreign key is added dynamically from one model to the other.
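The registration mechanism itself does not depend on Django. Below is a stripped-down sketch of the same idea in plain Python 3 syntax (the code above uses the Python 2 __metaclass__ attribute). LinkingMeta and the game_model attribute are invented names, and a plain class attribute stands in for the foreign key.

```python
class LinkingMeta(type):
    """Hypothetical metaclass: remembers the concrete classes and links them."""
    game_cls = None
    mission_cls = None

    def __new__(mcs, name, bases, attrs):
        cls = super().__new__(mcs, name, bases, attrs)
        # Register classes that directly extend one of the two bases.
        for base in bases:
            if base.__name__ == "GameBase":
                mcs.game_cls = cls
            elif base.__name__ == "MissionBase":
                mcs.mission_cls = cls
        # Once both are known, attach the link (a stand-in for the ForeignKey).
        if mcs.game_cls is not None and mcs.mission_cls is not None:
            if not hasattr(mcs.mission_cls, "game_model"):
                mcs.mission_cls.game_model = mcs.game_cls
        return cls


class GameBase(metaclass=LinkingMeta):
    pass


class MissionBase(metaclass=LinkingMeta):
    pass


# The definition order does not matter: the link is made when the second
# of the two concrete classes gets created.
class Mission(MissionBase):
    pass


class Game(GameBase):
    pass
```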

One drawback of this constructor-class example is that if more than one class extends GameBase or MissionBase, the code won't function correctly, because each newly created class replaces the previously registered one.

Anyway, the example shown illustrates the possible solution and gives a direction for further development of the idea.

2008-12-15

Clarification for Confusion: Creating New Django Fields. Part 1

Let's say you've just published your first Django-based project. You've learned a lot, you know how to create a new app, views, forms, middleware, and template tags. That's cool, but there is still much room for making your code cleaner and better organized. One such improvement could be joining multiple fields into a single field.

To be clear, there are several types of fields in Django web development, and all of them are related to each other. First of all, there are the fields of database tables at the lowest level. Then there are model fields representing the database fields in Python. Form fields sit above them in the level of abstraction. Then there are bound fields which bind the form fields to data from the database (or somewhere else). And finally there are HTML fields which represent the rendered bound fields and which can be spiced up with Javascript for creating new widgets.

Now imagine that you have a Contact model which includes two phone numbers, each of which has three parts: the country code, the area code, and the subscriber number. In the simplest case you would have something like this:

COUNTRY_CODE_CHOICES = getattr(settings, "COUNTRY_CODE_CHOICES", (
    ("+44", "United Kingdom (+44)"),
    ("+49", "Germany (+49)"),
    ("+370", "Lithuania (+370)"),
))

class Contact(models.Model):
    user = models.OneToOneField(User)
    website = models.URLField(_("Website"))
    primary_phone_country = models.CharField(
        _("Primary Country Code"),
        max_length=4,
        choices=COUNTRY_CODE_CHOICES,
    )
    primary_phone_area = models.CharField(
        _("Primary Area Code"),
        max_length=4,
    )
    primary_phone_number = models.CharField(
        _("Primary Number"),
        max_length=10,
    )
    secondary_phone_country = models.CharField(
        _("Secondary Country Code"),
        max_length=4,
        choices=COUNTRY_CODE_CHOICES,
        blank=True,
    )
    secondary_phone_area = models.CharField(
        _("Secondary Area Code"),
        max_length=4,
        blank=True,
    )
    secondary_phone_number = models.CharField(
        _("Secondary Number"),
        max_length=10,
        blank=True,
    )


But it seems to be such a mess when it could be something like this:

class Contact(models.Model):
    user = models.OneToOneField(User)
    website = models.URLField(_("Website"))
    primary_phone = PhoneField(_("Primary Phone"))
    secondary_phone = PhoneField(_("Secondary Phone"), blank=True)


So the question now is how to create a model field that is saved as three separate database fields and shown as three bound fields when editing. Let's have a look at what information is available about custom model fields. There is a page in the documentation which might help us here. There are also articles by David Cramer and Michael Elsdörfer which might help to solve this problem. In addition, I found a snippet by Myles Braithwaite which also deals with custom model fields.

Combining the ideas from the given sources with the analysis of the Django core code, I defined the PhoneField in the following way:

# -*- coding: utf-8 -*-
from django.db import models
from django.conf import settings
from django.utils.translation import ugettext_lazy as _

PHONE_COUNTRY_CODE_CHOICES = getattr(settings, "PHONE_COUNTRY_CODE_CHOICES", (
("+44", _("United Kingdom (+44)")),
("+49", _("Germany (+49)")),
("+370", _("Lithuania (+370)")),
))

class PhoneField(models.Field):
    def __init__(self, verbose_name=None, country_code_choices=PHONE_COUNTRY_CODE_CHOICES, *args, **kwargs):
        super(PhoneField, self).__init__(*args, **kwargs)
        self.country_code_choices = country_code_choices
        self.verbose_name = verbose_name

    def contribute_to_class(self, cls, name):
        self.name = name
        if self.verbose_name is None and name:
            self.verbose_name = name.replace('_', ' ')
        # creating three model fields on the fly
        models.CharField(
            _("%s Country Code") % self.verbose_name,
            max_length=4,
            choices=self.country_code_choices,
            blank=self.blank,
        ).contribute_to_class(cls, "%s_country" % name)
        models.CharField(
            _("%s Area Code") % self.verbose_name,
            max_length=4,
            blank=self.blank,
        ).contribute_to_class(cls, "%s_area" % name)
        models.CharField(
            _("%s Number") % self.verbose_name,
            max_length=10,
            blank=self.blank,
        ).contribute_to_class(cls, "%s_number" % name)
        # when accessing the phone field by original model field name,
        # we'll manage tuples of country code, area code, and number
        setattr(cls, self.name, PhoneFieldCreator(self))

class PhoneFieldCreator(object):
    def __init__(self, field):
        self.field = field

    def __get__(self, obj, type=None):
        if obj is None:
            raise AttributeError('Can only be accessed via an instance.')
        country = obj.__dict__.get("%s_country" % self.field.name, None)
        area = obj.__dict__.get("%s_area" % self.field.name, None)
        number = obj.__dict__.get("%s_number" % self.field.name, None)
        return (country and area and number) and (country, area, number) or None

    def __set__(self, obj, value):
        if isinstance(value, tuple) and len(value) == 3:
            setattr(obj, "%s_country" % self.field.name, value[0])
            setattr(obj, "%s_area" % self.field.name, value[1])
            setattr(obj, "%s_number" % self.field.name, value[2])


Note that the instances of the model with the PhoneField can be filtered by any part of the phone number. In addition, the field value can be accessed as a tuple of three parts of the number. For example:

>>> Contact.objects.filter(primary_phone_country="+370")[0].primary_phone
(u'+370', u'628', u'12345')
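The tuple access relies on Python's descriptor protocol (__get__ and __set__). Stripped of Django, the idea can be sketched roughly as follows. PhoneParts and the plain Contact class are made-up names for illustration, and unlike PhoneFieldCreator this sketch returns the descriptor itself on class-level access instead of raising AttributeError.

```python
class PhoneParts(object):
    """Hypothetical descriptor: exposes three instance attributes as one tuple."""
    suffixes = ("country", "area", "number")

    def __init__(self, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not on an instance
        parts = tuple(
            obj.__dict__.get("%s_%s" % (self.name, suffix))
            for suffix in self.suffixes
        )
        # Only return a tuple when all three parts are filled in.
        return parts if all(parts) else None

    def __set__(self, obj, value):
        # Spread a 3-tuple over the three underlying attributes.
        if isinstance(value, tuple) and len(value) == len(self.suffixes):
            for suffix, part in zip(self.suffixes, value):
                setattr(obj, "%s_%s" % (self.name, suffix), part)


class Contact(object):
    primary_phone = PhoneParts("primary_phone")


contact = Contact()
empty_value = contact.primary_phone            # nothing has been set yet
contact.primary_phone = ("+370", "628", "12345")
```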


Now the phone number can be presented in different ways in templates depending on the business requirements or local flavor:


{{ obj.primary_phone|join:"" }}
{{ obj.primary_phone|join:" " }}
{{ obj.primary_phone|join:"-" }}
{{ obj.primary_phone.0 }} ({{ obj.primary_phone.1 }}) {{ obj.primary_phone.2 }}
{{ obj.primary_phone_country }} ({{ obj.primary_phone_area }}) {{ obj.primary_phone_number }}


In Part 2, let's check how to create model fields which have a single database field but multiple form fields.

2008-10-21

Django Poor Man's Cron

As I promised in the last blog entry, I am sharing a link to an experimental conceptual app, Poor Man's Cron, which aims to exploit various bots to implement scheduled tasks.

Criticisms, praises, thanks, and suggestions are very welcome!

2008-10-16

Exploiting Spammers, Search Engines, and Other Bots

Django is a great platform for building web projects in a clean and manageable way. But when you want to find its relationship with poverty for a blog post on Blog Action Day, it becomes a difficult puzzle. But still that's a great challenge!

Django has a large community all over the world, and someone who is good at statistics could write a cool analytical report about djangoers' relations with poverty in the world or the user activity on the Blog Action Day site using the maps and numbers provided in the links.

The Django community and the social groups mentioned above are just small parts of a big network. Looking at the bigger picture of the world, the internet, and the future, you might realize that everybody is connected to somebody and something, all those connections get defined in social networks, and the web is like a living and evolving organism. We are kind of going towards a machine that we can use and that can use us (see The Matrix).

We can look at a newly created Django-based site as a small node in the web. The site together with its visitors forms a network. Different APIs and inter-site communication extend the network even more. One part of this network consists of various bots, e.g. search engine indexers and comment spambots. Thanks to the error-reporting mechanism in Django, I recently found out that bugs in large-scale projects can be found not only by manual browsing or test cases. Bugs might also be detected by the visits of spammers and search engines. Broken pages or unexpected usage are reported by email to the administrator of the site. Moreover, I started brainstorming about how to exploit spammers instead of fighting against them.

And then I remembered the Poor Man's Cron module for Drupal. The module is an alternative for projects hosted without cron job support. It runs scheduled tasks at approximately regular intervals, triggered by page views. Executing scheduled tasks during page views might annoy people if it takes much time. But I wouldn't mind wasting a few seconds of a machine's time. On the one hand, this idea supports the poverty theme of Blog Action Day, because this kind of module is dedicated to those who can't afford a server with cron jobs. On the other hand, we can punish spammers by using their time for our needs. When the tasks are quite time-consuming and take 5 or more minutes to execute, let's use the time of comment spammers. Spammers might be recognized by captchas, Akismet filters, and similar technologies. When more precise intervals between executions are necessary, let's exploit the calls of feed subscribers and search indexers. Those can be recognized by request.META['HTTP_USER_AGENT'].
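To give a rough feeling for the mechanics, here is a small framework-free sketch. It is not the code of the app. The names PoorMansCron, is_bot, and BOT_MARKERS are invented, and the list of user-agent substrings is just an assumption. In a Django view or middleware, the user agent would come from request.META['HTTP_USER_AGENT'].

```python
import time

# Assumed substrings that hint at a non-human visitor (illustrative only).
BOT_MARKERS = ("bot", "crawler", "spider", "feed")


def is_bot(user_agent):
    """Very rough guess whether the visitor is a bot, based on the user agent."""
    user_agent = (user_agent or "").lower()
    return any(marker in user_agent for marker in BOT_MARKERS)


class PoorMansCron(object):
    """Runs `task` at most once per interval, and only on bot visits."""

    def __init__(self, interval_seconds, task, clock=time.time):
        self.interval = interval_seconds
        self.task = task
        self.clock = clock
        self.last_run = None

    def maybe_run(self, user_agent):
        now = self.clock()
        due = self.last_run is None or now - self.last_run >= self.interval
        if due and is_bot(user_agent):
            self.last_run = now
            self.task()
            return True
        return False


# Usage with a fake clock, so the example does not need to sleep.
fake_now = [1000.0]
runs = []
cron = PoorMansCron(60, lambda: runs.append(fake_now[0]), clock=lambda: fake_now[0])

human_visit = cron.maybe_run("Mozilla/5.0 (Macintosh)")   # due, but not a bot
first_bot_visit = cron.maybe_run("Googlebot/2.1")         # due and a bot
second_bot_visit = cron.maybe_run("Googlebot/2.1")        # too early
fake_now[0] += 61
later_bot_visit = cron.maybe_run("SomeFeedFetcher/1.0")   # interval has passed
```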

I shall point you to the experimental conceptual code of this idea soon.



Similar projects:
http://code.google.com/p/django-cron/
http://www.djangosnippets.org/snippets/1126/

How would you suggest exploiting spambots? What drawbacks do you see in the suggested approach?



P.S. To those who reside in Berlin! Today at 19:00 there will be Djangoers' meetup at newthinking store, Tucholskystraße 48, 10117 Berlin.

2008-09-13

Gotchas about Fixtures for Initial Data

Fixtures are one part of Django's testing environment. They are pre-prepared datasets that are loaded into the database before running tests. Django uses a separate database for running tests, but the fixtures from the files initial_data.* are also loaded into the main database when you synchronize it.

I make fixtures for my application like this:
python manage.py dumpdata --format=json --indent=4 myapp > apps/myapp/fixtures/initial_data.json 


The initial data is read out from apps/myapp/fixtures/initial_data.json and written to the main or the test database when I synchronize the database
python manage.py syncdb

or when I run the tests
python manage.py test myapp


Fixtures are great for deploying standard data like predefined categories, lists of countries and languages, default flatpages, default navigation, default user groups, and so on. However, you should be very cautious with them while developing.

When I create new models, it's common practice for me to sync the database so that the new tables are created and the database schema reflects the models. Just after the new tables are created, all fixtures called initial_data from all applications are loaded. N.B. While synchronizing, the fixtures from initial_data overwrite existing rows that have the same primary keys. So if you have some important data that differs from the defaults, make a database backup before syncing, or use the sql command to print out the SQL statements and execute them against the database manually:
python manage.py sql myapp_with_new_models


You might have pre_save signal handlers or custom save methods (check the example below) which should recognize newly created objects and do something special with them, e.g. prepare PDF reports, generate images, send emails, index for global text search, or something else. In such cases I used to check for the existence of the primary key: the object is new if it has no primary key. But this is wrong when you use fixtures, because fixtures come with primary keys. N.B. The object is new only if there is no object in the database whose primary key equals the primary key of the current object.

class MyModel(models.Model):
    ...
    def save(self, *args, **kwargs):
        is_new = True
        pk = self._get_pk_val()
        model = type(self)
        if pk and model._default_manager.filter(pk=pk):
            is_new = False
        # something before saving
        # name the class explicitly: super(type(self), self) would recurse endlessly in subclasses
        super(MyModel, self).save(*args, **kwargs)
        # something after saving


aka

class MyModel(models.Model):
    ...
    def save(self, *args, **kwargs):
        is_new = True
        if self.id and MyModel.objects.filter(id=self.id):
            is_new = False
        # something before saving
        super(MyModel, self).save(*args, **kwargs)
        # something after saving


Another alternative for storing default data would be custom sql located at apps/myapp/sql/mymodel.sql, but I haven't tried that yet and probably won't.

BTW, happy programmer day!

2008-09-03

A Note on Python Paths

This time I decided to share some knowledge about Python paths, which seemed a little confusing to me when I started diving into Python. I work with Django on different platforms like Mac OS X, Windows, and Linux, so I need to be familiar with the common patterns for activating new Python modules in all of those environments.

Python modules are either *.py files or directories containing __init__.py. When defining paths to Python modules, you will usually deal with the latter. A module is said to be on the Python path if you can run python and import that module.

For example, if you can run the following, then django is under your python path.
python
>>> import django
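The same check can be scripted. This is a small sketch for current Python 3 versions (importlib.util.find_spec did not exist in the Python 2 of the time). The helper name is_importable is my own, and it is meant for top-level module names only.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the Python path."""
    return importlib.util.find_spec(module_name) is not None


json_found = is_importable("json")  # part of the standard library
bogus_found = is_importable("surely_not_an_installed_module_xyz")
```

Calling is_importable("django") would tell you the same thing as the interactive import above, without actually importing Django.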


Stay tuned to get deeper into python paths.

Installing modules



If a module is installable, usually all you need to do is to extract its setup directory, cd to it, and run
python setup.py install

This will copy the module into the site-packages directory of the current python installation. It might be that you have multiple Python versions on your computer. According to django documentation, you can find the currently used site-packages by
python -c "from distutils.sysconfig import get_python_lib; print get_python_lib()"
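On recent Python versions, where distutils is no longer bundled, roughly the same information is available from the standard sysconfig module. A minimal sketch, assuming a regular CPython installation or virtual environment:

```python
import sysconfig

# "purelib" is the directory where pure-Python packages get installed,
# i.e. the site-packages directory of the running interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```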

Or you can use PEAK EasyInstall for installing python modules even faster.

But sometimes you will need the latest and greatest versions of your modules directly from version control system. To make them accessible from python you should either check them out directly to site-packages (very messy and inflexible) or keep them somewhere else and do some additional magic.

Sym-linking



You can create symbolic links (symlinks) in unix-based systems like Linux or Mac OS X. A symlink is like a shortcut to a file or directory. If you create a symlink in site-packages which points to a python module which is located somewhere else, it will work as if the module was copied into site-packages.

To create a symlink, type the following in a console/terminal:
ln -s <source> <target>

For example, if you want python to access django which is under /Library/Subversion/django_src/trunk/django, you need to write something like this (considering that your site-packages are at /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/)
ln -s /Library/Subversion/django_src/trunk/django /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/django

To delete the symlink, simply remove it (this won't delete the original module):
rm /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/django
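
To see the effect without touching the real site-packages, the idea can be tried out in a throwaway directory. This is only a sketch for unix-like systems: the package name symlink_demo_pkg and the directory fake-site-packages are invented for the demonstration, and os.symlink takes its arguments in the same order as ln -s.

```python
import importlib
import os
import sys
import tempfile


def demo_symlinked_module():
    """Import a package through a symlink placed in a stand-in site-packages."""
    root = tempfile.mkdtemp()

    # The real code lives in a "checkout" directory outside site-packages
    checkout = os.path.join(root, "checkout", "symlink_demo_pkg")
    os.makedirs(checkout)
    with open(os.path.join(checkout, "__init__.py"), "w") as f:
        f.write("WHERE = 'checkout'\n")

    # Stand-in for site-packages; only this directory goes on sys.path
    fake_site = os.path.join(root, "fake-site-packages")
    os.makedirs(fake_site)
    # Same argument order as: ln -s <source> <target>
    os.symlink(checkout, os.path.join(fake_site, "symlink_demo_pkg"))

    sys.path.insert(0, fake_site)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("symlink_demo_pkg")
        return module.WHERE
    finally:
        # Leave sys.path as we found it
        sys.path.remove(fake_site)


print(demo_symlinked_module())
```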


But as I've already mentioned, this works only in unix-based environments and you can't use shortcuts in Windows for the same purpose.

*.pth files



Python supports *.pth files, which contain the paths to the parent directories of your Python modules, one per line. Those files should be placed in site-packages and can be named whatever you like, as long as the name ends with .pth. For example, you can create a file my-project.pth and write
/Library/Subversion/django_src/trunk
/Library/Subversion/myproject_src/trunk

or
C:\Subversion\django_src\trunk
C:\Subversion\myproject\trunk

into it. Then django and your project files will be importable in python.
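
The mechanism can be tried out safely with the standard site module, whose addsitedir() function processes the *.pth files of a directory the same way it is done for site-packages at startup. In this sketch the directories are temporary stand-ins created just for the demonstration, and the function name demo_pth_file is made up.

```python
import os
import site
import sys
import tempfile


def demo_pth_file():
    """Create a stand-in site-packages with a .pth file and let Python process it."""
    root = tempfile.mkdtemp()

    # Parent directory of some modules that are kept outside site-packages
    external = os.path.abspath(os.path.join(root, "myproject_src", "trunk"))
    os.makedirs(external)

    # Stand-in for site-packages containing my-project.pth
    fake_site = os.path.join(root, "fake-site-packages")
    os.makedirs(fake_site)
    with open(os.path.join(fake_site, "my-project.pth"), "w") as f:
        f.write(external + "\n")

    # addsitedir() adds the directory itself and every existing path
    # listed in its *.pth files to sys.path
    site.addsitedir(fake_site)
    return external


added = demo_pth_file()
print(added in sys.path)
```

Note that paths listed in a *.pth file are only added if they exist on disk, so a typo in the file fails silently.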

However, you might not have permission to create files under site-packages, or you might need to activate different locations of Python modules for different projects.

PYTHONPATH variable



The other way is to set additional paths for Python just before running the externally kept modules. This is done by listing the paths in the environment variable PYTHONPATH. Note again that Python paths point not to the modules themselves, but to their parent directories!

The syntax slightly differs among different platforms.

Linux and Mac OS X:
# checking value
echo $PYTHONPATH
# setting value
export PYTHONPATH="/Library/Subversion/django_src/trunk"
# appending to the existing value
export PYTHONPATH="$PYTHONPATH:/Library/Subversion/django_src/trunk"

Windows:
rem checking value
echo %PYTHONPATH%
rem setting value
set PYTHONPATH=C:\Subversion\django_src\trunk
rem appending to the existing value
set PYTHONPATH=%PYTHONPATH%;C:\Subversion\django_src\trunk


Multiple paths are separated by a colon (":") on Linux and Mac OS X and by a semicolon (";") on Windows.

PYTHONPATH can be used in scripts and webserver configuration files, but it is not very convenient in daily use.
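
When a script has to launch another Python process with extra paths, the variable can also be built in a platform-independent way. The following is a sketch under a few assumptions: the helper run_with_pythonpath and the module pythonpath_demo are invented for the example, and os.pathsep supplies the right separator for the current platform.

```python
import os
import subprocess
import sys
import tempfile


def run_with_pythonpath(extra_dirs, code):
    """Run `code` in a fresh interpreter with extra_dirs prepended to PYTHONPATH."""
    env = dict(os.environ)
    parts = list(extra_dirs)
    # Keep whatever was already set, after our own entries
    if env.get("PYTHONPATH"):
        parts.append(env["PYTHONPATH"])
    # os.pathsep is the platform's path-list separator
    env["PYTHONPATH"] = os.pathsep.join(parts)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# A throwaway directory holding one hypothetical module
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "pythonpath_demo.py"), "w") as f:
    f.write("GREETING = 'found via PYTHONPATH'\n")

print(run_with_pythonpath(
    [module_dir],
    "import pythonpath_demo; print(pythonpath_demo.GREETING)",
))
```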

Adding paths to sys.path



For projects that you develop and that should run as standalone applications, you can set the required Python paths relative to your own files, inside your Python code.

Note that all python paths which you set in the PYTHONPATH variable or *.pth files as well as the path of default python libraries and the path of site-packages get listed in python variable sys.path. When you import a module, it is loaded from the first location which contains the required module. So if you have two paths to different django versions in your python paths and you import django, the django version from the first location will be used.
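
The "first location wins" rule is easy to verify with two throwaway directories that both contain a module of the same name. This is just an illustration: the module name shadow_demo and the version strings are made up, standing in for two checkouts of the same library.

```python
import importlib
import os
import sys
import tempfile


def make_version_dir(version):
    """Create a temporary directory containing shadow_demo.py with the given VERSION."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, "shadow_demo.py"), "w") as f:
        f.write("VERSION = %r\n" % version)
    return directory


first = make_version_dir("1.0")
second = make_version_dir("0.96")

# Put both directories at the front of sys.path, `first` before `second`
sys.path[:0] = [first, second]
importlib.invalidate_caches()

import shadow_demo

# The copy from `first` is loaded; the one in `second` is never looked at
print(shadow_demo.VERSION)
```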

You can read the list of loaded python paths like this:
>>> import sys
>>> sys.path


You can also freely modify it, for example:
>>> import sys
>>> sys.path.append("/Library/Subversion/django_src/trunk")
>>> import django


And here is an example of how to get and use paths relative to the currently loaded file:
import os, sys

SVN_PATH = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
DJANGO_PATH = os.path.join(SVN_PATH, "django_src", "trunk")
PROJECT_PATH = os.path.join(SVN_PATH, "myproject", "trunk")

sys.path += [DJANGO_PATH, PROJECT_PATH]



I hope this introduction was useful for developers and made the big picture about the paths clearer.

Some more related information can be found at the official python documentation.

2008-08-08

MultipleSubmitButton Widget for ChoiceField

Recently I published a snippet with a widget rendering a choice field as a series of submit buttons.

So the {{ form.do }} field from the following form:

SUBMIT_CHOICES = (
    ('save', _("Save")),
    ('save-add', _("Save and Add Another")),
)

class TestForm(forms.Form):
    do = forms.ChoiceField(
        widget=MultipleSubmitButton,
        choices=SUBMIT_CHOICES,
    )



will be rendered as:

<ul>
<li><button type="submit" name="do" value="save">Save</button></li>
<li><button type="submit" name="do" value="save-add">Save and Add Another</button></li>
</ul>
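
The mapping from choices to markup can be sketched in plain Python, independently of Django. This is not the published snippet and not a Django widget: submit_buttons and render_submit_buttons are hypothetical helper names, and the choices are written as plain strings instead of translated ones.

```python
from html import escape


def submit_buttons(name, choices):
    """Return one <button> string per (value, label) choice."""
    template = '<button type="submit" name="%s" value="%s">%s</button>'
    return [
        template % (escape(name), escape(value), escape(label))
        for value, label in choices
    ]


def render_submit_buttons(name, choices):
    """Wrap the buttons in the <ul>/<li> structure shown above."""
    lines = ["<ul>"]
    for button in submit_buttons(name, choices):
        lines.append("<li>%s</li>" % button)
    lines.append("</ul>")
    return "\n".join(lines)


SUBMIT_CHOICES = (
    ("save", "Save"),
    ("save-add", "Save and Add Another"),
)

print(render_submit_buttons("do", SUBMIT_CHOICES))
```

Because submit_buttons() returns a list, a single button can be picked by index, for example submit_buttons("do", SUBMIT_CHOICES)[0].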


Can somebody enhance this widget so that it supports iterating through the choices and getting specific buttons by index in the template? My attempts failed, but maybe you will succeed!

2008-07-16

EuroPython 2008

I started writing this post at the airport just before flying back from Vilnius to Berlin. EuroPython 2008 - the three-day conference and the sprints afterwards - gave me loads of information, new contacts, and a better understanding of the big picture of the Python developers' community.

EuroPython

Some highlights from the conference follow.
  • During a video conference, Guido van Rossum, the author of the Python programming language, encouraged using Python 2.6 as soon as it is released, but said that Python 3.0 still shouldn't be used in production. Good future-proof practices are inheriting classes from object, using the xrange(), sorted(), and zip() functions, "//" for floor division, and dict.iterkeys(). A full guide to porting to 3.0 will probably be written and published on the python.org site.

  • Clone Digger is a project from Google Summer of Code 2008 for finding similar pieces of code which could be optimized by moving them into parameterized functions.

  • Restrictive Python is a project which blocks the user from executing dangerous functions or methods or accessing files which might harm the system. You can try Restrictive Python out in the simulated terminal.

  • Python is an interpreted language, and if you want to deploy a project written in Python, you have to be sure that the end user has a Python interpreter installed. Or you can create an executable file from the Python project using one of the following tools: Py2exe, PyInstaller, or BBFreeze.

  • There is a combined installer of Python, SQLite, and Django for Windows and it is called Instant Django.

  • Django DB External Scheme Evolution Branch is a separate application for making DB schema changes easily, without editing DB fields manually.

  • Honza Král showed the essentials of the new-forms admin, to which I am going to port our Django projects in the coming days. Directly from him I learned that new forms will still have validators, each of which checks only the value of its own field (independently of other fields). If validation across multiple fields is necessary, the clean method of the form should be overridden. All the global, field-unrelated form errors will be available in the template as {{ form.non_field_errors }}.

  • Christian Scholtz presented PYOGP, a Python-based environment for testing the Open Grid Protocol, which is used for exchanging avatars and other objects among virtual realms such as the different regions of Second Life.

  • Ricardo Quesada and Lucio Torre told us about cocos2d - a framework made in Python for presentations and sprite-based 2D games.

  • If you ever need to export some data from a database to Excel Sheet or import from Excel, you can do that not only in CSV (Comma Separated Values) format. You can also use the original XLS format with one of the following libraries: xlrd, xlwt, or pyExcelerator. I successfully used the latter one last week.

  • CVS and SVN are not the only version control systems in the world :D. Others like Mercurial, Git, and Bazaar are also worth checking out. These are all distributed systems, which means that they don't have one centralized source. Bazaar is written in Python and has an API for manipulating information about revisions, committers, and other version-control-related stuff.


Some other useful links:


Others blogged about EuroPython 2008 as well: