How to avoid import-time database access in Django?

John Lehmann

My Django app has a number of categories for things which I store in a Category model. I reference these frequently in code, and so I've found it useful to have a module with references ("constants") to these categories and groups of them, so typos will fail fast. This also provides the benefit of caching. And finally, it's the actual model so it has all the related functionality. It looks something like this:

def load_category(name):
  return Category.objects.get(name=name)

DOGS = load_category("dogs")
CATS = load_category("cats")

However, this results in import-time database access and causes various issues. After adding a new category with a reference like this, I must run a data migration before ./manage.py will function. I also hit a new problem while switching to Django's test framework: these references load from the default (e.g., dev or prod) database rather than the test one, as explicitly mentioned in this warning from the Django documentation:

If your code attempts to access the database when its modules are compiled, this will occur before the test database is set up, with potentially unexpected results. For example, if you have a database query in module-level code and a real database exists, production data could pollute your tests. It is a bad idea to have such import-time database queries in your code anyway - rewrite your code so that it doesn’t do this.

What's the best pattern for obtaining the benefits of these references while avoiding the import-time database access?

One possible solution is a proxy pattern which returns a pseudo-Category which forwards all the model's functionality but does not access the database until it's necessary. I'd like to see how others have solved this problem with this approach or another solution.

(Related but different question: Django test. Finding data from your production database when running tests?)

Final Approach

The approach by @kevin-christopher-henry worked well for me. However, in addition to fixing these declared references, I also had to delay access to the references from other code. Here I found two approaches helpful.

First, I discovered Python Lazy Object Proxy. This simple object takes a factory function as input, which is lazily executed to produce the wrapped object.

from lazy_object_proxy import Proxy

# The lambda is not run until MAP_OF_THINGS is first used.
MAP_OF_THINGS = Proxy(lambda: {
        DOG: ...,
        CAT: ...,
})

A similar way of accomplishing the same thing was pushing code into factory functions decorated with memoize so they'd only be executed once.
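
As a rough sketch of the memoized-factory idea, the example below uses `functools.lru_cache` from the standard library as the memoize decorator. The names `map_of_things`, `load_category`, and `QUERY_LOG` are made up for illustration, and `load_category` is only a stand-in for a real `Category.objects.get(name=name)` call so that the snippet runs without Django.

```python
from functools import lru_cache

QUERY_LOG = []  # records each simulated database lookup


def load_category(name):
    """Hypothetical stand-in for Category.objects.get(name=name)."""
    QUERY_LOG.append(name)
    return {"name": name}


@lru_cache(maxsize=None)
def map_of_things():
    """Build the mapping on the first call; later calls reuse the cached dict."""
    return {
        "dogs": load_category("dogs"),
        "cats": load_category("cats"),
    }
```

Nothing is looked up when the module is imported. Callers use `map_of_things()` instead of a module-level constant, and the lookups run only once, on the first call.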

NOTE: I initially tried to use the Proxy object above as a direct solution to my problem of lazy access to model objects. However, despite being very good imitations, when querying and filtering on these objects I got:

TypeError: 'Category' object is not callable

Sure enough, callable() returns True for a Proxy (even though the docs say this doesn't guarantee that the object is callable). It seems that Django queries are just too smart and are bound to find something incompatible with a phony model.

For your application, Proxy might be good enough.
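
The `callable` behavior from the note above can be reproduced without Django or the proxy library. The sketch below uses a hypothetical `ForwardingProxy` class, which is not the real `Proxy` implementation, only a minimal imitation that forwards attribute access and calls. Because `callable()` only checks whether the object's type defines `__call__`, it reports True for the proxy even when the wrapped object cannot be called.

```python
class Category:
    """Plain stand-in for the Django model; instances are not callable."""

    def __init__(self, name):
        self.name = name


class ForwardingProxy:
    """Hypothetical minimal proxy: builds the target lazily and forwards to it."""

    def __init__(self, factory):
        self._factory = factory
        self._wrapped = None

    def _target(self):
        # Run the factory once, on first use.
        if self._wrapped is None:
            self._wrapped = self._factory()
        return self._wrapped

    def __getattr__(self, name):
        # Only reached for attributes not found on the proxy itself.
        return getattr(self._target(), name)

    def __call__(self, *args, **kwargs):
        # Defining __call__ on the class makes callable(proxy) True,
        # whether or not the wrapped object supports being called.
        return self._target()(*args, **kwargs)


proxy = ForwardingProxy(lambda: Category("dogs"))
print(callable(proxy))  # the type defines __call__
print(proxy.name)       # attribute access is forwarded to the Category
```

Any code that tests `callable(value)` and then calls the value would hit a TypeError with such a proxy, which is consistent with the error shown above.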

Kevin Christopher Henry

I've run into the same issue myself, and agree that it would be great to have some best practices here.

I ended up with an approach based on the descriptor protocol:

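class LazyInstance:
    """Descriptor that defers the database lookup until first access."""

    def __init__(self, *args, **kwargs):
        # The lookup arguments are only stored here; no query runs at import time.
        self.args = args
        self.kwargs = kwargs
        self.instance = None

    def __get__(self, obj, cls):
        # cls is the model class the descriptor is attached to (e.g. Category).
        if self.instance is None:
            self.instance, _ = cls.objects.get_or_create(*self.args, **self.kwargs)

        return self.instance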

Then in my model classes I have some special objects:

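from django.db import models

class Category(models.Model):
    # The max_length value is an assumption; CharField normally requires one.
    name = models.CharField(max_length=100)

    DOGS = LazyInstance(name="dogs")
    CATS = LazyInstance(name="cats")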

So nothing happens at import time. The first time the special object is accessed, the relevant instance is looked up (and created, if necessary) and cached.
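
To see the deferred lookup in action without a Django project, here is a small self-contained sketch. `FakeManager` is a hypothetical stand-in for a model manager that just counts calls, and the `Category` class here is a plain Python class rather than a real model. The descriptor is a trimmed copy of the one above that takes keyword arguments only.

```python
class LazyInstance:
    """Trimmed copy of the descriptor above (keyword lookups only)."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.instance = None

    def __get__(self, obj, cls):
        if self.instance is None:
            self.instance, _ = cls.objects.get_or_create(**self.kwargs)
        return self.instance


class FakeManager:
    """Hypothetical stand-in for a Django manager; counts lookups."""

    def __init__(self, model):
        self.model = model
        self.calls = 0
        self.rows = {}

    def get_or_create(self, **kwargs):
        self.calls += 1
        key = tuple(sorted(kwargs.items()))
        created = key not in self.rows
        if created:
            self.rows[key] = self.model(**kwargs)
        return self.rows[key], created


class Category:
    """Plain class standing in for the model."""

    DOGS = LazyInstance(name="dogs")

    def __init__(self, name):
        self.name = name


Category.objects = FakeManager(Category)

print(Category.objects.calls)  # no lookup has happened yet
dogs = Category.DOGS           # first access triggers the lookup
print(dogs.name, Category.objects.calls)
```

Repeated access to `Category.DOGS` returns the cached instance and does not call the manager again.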
