As you might know, I am developing PyBazaar, a Python Developer Marketplace. For a project of that scope, I need to create hundreds or thousands of data entries to ensure that everything works as expected. Factory Boy is a tool that allows me to create model instances in batches, and this blog post is about it.
The benefits of using Factory Boy
By creating a bunch of fake entries, I can achieve the following:
- Work on list and detail representation and styling.
- Work on and try functionality like filters, sorting, and pagination.
- Check and improve performance with loads of data entries.
- Create dummy data for unit or functional tests.
Factory Boy seemed like a pretty complex package at first, so I want to simplify things and introduce you to all the parts you need for creating fake model instances.
Model preparation
At PyBazaar, I have users with profiles, job offers, and resources that can be faked in batch. The related categories are predefined and don't need to be faked.
To make it possible to distinguish between real and fake entries, I added a new boolean field is_fake to all the models that I can create in batch:
# For testing and debugging
is_fake = models.BooleanField(_("Fake"), default=False)
Here is what the list of profiles can look like when created with Factory Boy:
The setup
The installation is pretty straightforward:
(venv)$ pip install factory-boy==3.3.0
Then, in each app where you need to create fake entries, create a file factories.py with factory classes, e.g.:
import random
import factory
from pybazaar.apps.accounts.models import User

class UserFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = User

    first_name = factory.Faker("first_name")
    last_name = factory.Faker("last_name")
    # ...
    is_fake = True
For factory classes, I also add custom class methods delete_fake() and recreate_batch() so that I can quickly create entries or delete them:
class ProfileFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Profile

    user = factory.SubFactory(UserFactory)
    # ...
    is_fake = True

    @classmethod
    def recreate_batch(cls, size, **kwargs):
        cls.delete_fake()
        cls.create_batch(size=size, **kwargs)

    @classmethod
    def delete_fake(cls):
        for profile in Profile.objects.filter(is_fake=True):
            profile.delete()
        for user in User.objects.filter(is_fake=True):
            user.delete()
Factory class attributes tell the system what values to assign to the models when creating instances. Let's explore multiple cases that we can use as values.
Assigning a static value
If it's a simple static value, you can just assign it. It will be the same for all fake entries:
is_fake = True
publishing_status = Profile.PublishingStatusChoices.PUBLISHED
Assigning a value from a list
If it's a value from a list, use the Iterator class:
title = factory.Iterator([
    "Developer",
    "Software Engineer",
    "Programmer",
])
experience_level = factory.Iterator(
    Profile.ExperienceLevelChoices.values
)
Assigning a generated value of a certain type
Factory Boy uses the Faker package to generate fake names, paragraphs, or locations. You can use those as follows:
first_name = factory.Faker("first_name")
last_name = factory.Faker("last_name")
summary = factory.Faker("paragraph")
city = factory.Faker("city")
state = factory.Faker("state")
country = factory.Faker("country_code")
Assigning an instance
If it's a foreign key and you want a random existing instance, use this (note that first() returns None when the table is empty):
resource_type = factory.LazyAttribute(
    lambda o: ResourceType.objects.order_by("?").first()
)
Assigning a value from a function
Similarly, you can assign a value from a function:
description = factory.LazyAttribute(
    lambda o: generate_quill_content()
)
Assigning a random value
Or a random value:
is_available_for_work = factory.LazyAttribute(
    lambda o: random.choice([True, False])
)
Assigning a value based on attributes or methods of the model instance
Once you define attributes like first_name or last_name, you can set other values depending on those:
username = factory.LazyAttribute(
    lambda o: f"{o.first_name}_{o.last_name}".lower()
)
email = factory.LazyAttribute(
    lambda o: f"{o.first_name}_{o.last_name}@example.com".lower()
)
Assigning a password
There is a special factory.django.Password class for password values. It stores the hashed form of the raw password you pass in:
password = factory.django.Password("Pa$$w0rd")
Assigning dummy images
Here is how to create and assign a dummy single-color image:
avatar = factory.django.ImageField(
    width=200, height=200, color="rgb(2,132,199)"
)
Having two factories depending on each other
Since profiles depend on users, we can define that dependency with the SubFactory class:
class ProfileFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Profile

    user = factory.SubFactory(UserFactory)
Then, when creating profiles, the users will be created automatically for them, too.
Attaching many-to-many relations
In Django, many-to-many relations can only be added after the model instance has been saved. We can achieve that with the PostGeneration class:
def attach_categories(obj, create, extracted, **kwargs):
    obj.specializations.add(
        *list(Specialization.objects.order_by("?")[: random.randint(3, 7)])
    )

class ProfileFactory(factory.django.DjangoModelFactory):
    # ...
    do_afterwards = factory.PostGeneration(attach_categories)
What you name this attribute doesn't matter - it just must not clash with other field names or attributes.
A complete example
So the final factories.py file could look like this:
import json
import random

import factory

from pybazaar.apps.accounts.models import User
from pybazaar.apps.profiles.models import Profile
from pybazaar.apps.categories.models import Specialization


class UserFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = User

    first_name = factory.Faker("first_name")
    last_name = factory.Faker("last_name")
    username = factory.LazyAttribute(
        lambda o: f"{o.first_name}_{o.last_name}".lower()
    )
    email = factory.LazyAttribute(
        lambda o: f"{o.first_name}_{o.last_name}@example.com".lower()
    )
    password = factory.django.Password("Pa$$w0rd")
    is_fake = True


def generate_quill_content():
    return json.dumps(
        {
            "delta": "",
            "html": "<p>Hey there</p>",
        }
    )


def attach_categories(obj, create, extracted, **kwargs):
    obj.specializations.add(
        *list(Specialization.objects.order_by("?")[: random.randint(3, 7)])
    )


class ProfileFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Profile

    user = factory.SubFactory(UserFactory)
    title = factory.Iterator(
        [
            "Developer",
            "Software Engineer",
            "Programmer",
        ]
    )
    avatar = factory.django.ImageField(
        width=200, height=200, color="rgb(2,132,199)"
    )
    bio = factory.LazyAttribute(lambda o: generate_quill_content())
    city = factory.Faker("city")
    state = factory.Faker("state")
    country = factory.Faker("country_code")
    is_available_for_work = factory.LazyAttribute(
        lambda o: random.choice([True, False])
    )
    experience_level = factory.Iterator(
        Profile.ExperienceLevelChoices.values
    )
    publishing_status = Profile.PublishingStatusChoices.PUBLISHED
    is_fake = True
    do_afterwards = factory.PostGeneration(attach_categories)

    @classmethod
    def recreate_batch(cls, size, **kwargs):
        cls.delete_fake()
        cls.create_batch(size=size, **kwargs)

    @classmethod
    def delete_fake(cls):
        for profile in Profile.objects.filter(is_fake=True):
            profile.delete()
        for user in User.objects.filter(is_fake=True):
            user.delete()
Creating fake entries
Lastly, I can create the fake entries from the Django shell as follows:
>>> from pybazaar.apps.profiles.factories import ProfileFactory
>>> ProfileFactory.recreate_batch(100)
And later, when I don't need those anymore:
>>> from pybazaar.apps.profiles.factories import ProfileFactory
>>> ProfileFactory.delete_fake()
Whenever I add new fields to the models, I can easily tweak the factories and recreate the whole batch of fake entries in one step.
Final words
Factory Boy doesn't guarantee data validation and integrity. For example, city, state, and country will be three separate random values that don't match a real location. However, that is sufficient to test your website's basic look and feel or performance.
Cover image by Google DeepMind