Django on Kubernetes Deployment: Best practices for DB Migrations

JasonGenX

My Django deployment has a number of pods (3 currently) running a Django backend REST API server. We're still in the development/staging phase. I wanted to ask for advice regarding DB migrations. Right now the pods simply start by launching the web server, assuming the database is migrated and ready. This assumption can be wrong, of course.

Can I simply put python manage.py migrate before running the server? What happens if 2 or 3 pods are started at the same time and all run migrations at the same time? Would there be any potential damage or problem from that? Is there a best-practice pattern to follow here to ensure that all pods start the server with a healthy, migrated database?

I was thinking about this:

During initial deployment, define a Kubernetes Job object that'll run once, after the database pod is ready. It will use the same Django container I have, and will simply run python manage.py migrate. The deploy script will kubectl wait for that Job's pod to finish, and then apply the YAML that creates the full Django deployment. This will ensure all Django pods "wake up" with a database that's fully migrated.

In subsequent updates, I will run the same job again before re-applying the Django deployment pod upgrades.
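The plan above could be sketched roughly as follows. This is only an illustration, not a tested deployment: the Job name django-migrate and the image registry.example.com/django-app:1.0 are hypothetical placeholders, and the manifest is generated as JSON (which kubectl apply -f accepts as well as YAML) so that the sketch needs nothing beyond the Python standard library.

```python
import json


def build_migrate_job(image: str, name: str = "django-migrate") -> dict:
    """Return a batch/v1 Job manifest that runs `python manage.py migrate` once."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},  # hypothetical name
        "spec": {
            # Retry a failed migration pod at most twice before marking the Job failed.
            "backoffLimit": 2,
            "template": {
                "spec": {
                    # Let the Job controller handle retries, not the kubelet.
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "migrate",
                            # Same image as the web pods (placeholder value).
                            "image": image,
                            "command": ["python", "manage.py", "migrate", "--noinput"],
                        }
                    ],
                }
            },
        },
    }


job = build_migrate_job("registry.example.com/django-app:1.0")
# Write this output to a file and pass it to `kubectl apply -f`.
print(json.dumps(job, indent=2))
```

The deploy script would then block on something like kubectl wait --for=condition=complete job/django-migrate --timeout=300s before applying the Django Deployment. Since a Job's pod template cannot be changed once the Job exists, re-running it for a later release probably means deleting the old Job first or giving each release its own Job name.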

Now there is a chicken-and-egg question about maintaining 100% uptime during migration, but this is a question for another post: how do you apply data migrations that BREAK the existing container Version X when the code that works with the new migrations only ships in container Version X+1? Do you take the entire service offline for the duration of the update? Is there a pattern to keep the service up and running?

nima

Well, you are right that multiple migrate commands will run against your database when multiple pods start at once.

But this will not cause any problems. If a change has already been applied to the database, it is skipped. So, say 3 pods start at the same time and run the migrate command: only one of those commands will end up applying changes to the database. Migrations normally need to lock the database for various operations (how exactly depends heavily on your DBMS). One of the migrate commands (one of the pods) takes the lock, and the others have to wait until it has finished. Once the first one is done, the other commands find nothing left to apply and do nothing. So each migration is applied once.
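To make the "already applied changes are skipped" idea concrete, here is a toy model using SQLite from the Python standard library. It is not Django's implementation (Django keeps its own record of applied migrations in the django_migrations table); the table name applied_migrations, the helper apply_pending, and the two sample migrations are made up for illustration. It also only shows runs that happen one after another; what happens when the commands really overlap depends on the DBMS locking described above.

```python
import sqlite3


def apply_pending(conn, migrations):
    """Apply migrations that are not recorded yet; return the names applied by this call."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT name FROM applied_migrations")}
    applied_now = []
    for name, sql in migrations:
        if name in done:
            continue  # already recorded by an earlier run, so skip it
        conn.execute(sql)
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
        applied_now.append(name)
    conn.commit()
    return applied_now


conn = sqlite3.connect(":memory:")
migrations = [
    ("0001_initial", "CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)"),
    ("0002_add_price", "ALTER TABLE item ADD COLUMN price REAL"),
]

first = apply_pending(conn, migrations)   # plays the role of the first pod
second = apply_pending(conn, migrations)  # a later pod finds nothing left to do
print(first)   # ['0001_initial', '0002_add_price']
print(second)  # []
```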

You can, however, change your deployment strategy and ask Kubernetes to spin up only one pod first, and to start the others only after the first pod's health check succeeds. In that case you can be sure the migration lock is taken only once; the other pods will just see that the migrations are already applied and skip them.
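One way to approximate this for updates is a rolling update that adds a single new pod at a time and waits for its readiness probe. The sketch below builds only the relevant part of a Deployment spec (it is not a complete manifest; selector and labels are omitted). The container name, image, probe path /healthz and port 8000 are assumptions. Note that these settings govern rollouts of an existing Deployment; when a Deployment is first created, all replicas are started together.

```python
import json


def rolling_update_fragment(replicas: int = 3) -> dict:
    """Return Deployment spec fields that bring new pods up one at a time during an update."""
    return {
        "replicas": replicas,
        "strategy": {
            "type": "RollingUpdate",
            # At most one extra pod is created, and no old pod is removed
            # until a new pod has become ready.
            "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
        },
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "django",  # hypothetical container name
                        "image": "registry.example.com/django-app:1.0",  # placeholder
                        # The pod counts as ready only once this check passes
                        # (path and port are assumptions about the app).
                        "readinessProbe": {
                            "httpGet": {"path": "/healthz", "port": 8000},
                            "periodSeconds": 5,
                        },
                    }
                ]
            }
        },
    }


fragment = rolling_update_fragment()
print(json.dumps(fragment, indent=2))
```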
