Content

TLDR — Summary of Django Transaction Logic

This is for apps running Django with PSQL or other similar SQL (SQLite, MySQL, etc.) based databases (DBs); the underlying rules around transactions may not apply to other DBs. Under the hood, Django’s transaction management code does the following (this is copied from Django’s doc -scroll down- with my own bullet points for clarification):

In the next blog, I’ll write about what you should or shouldn’t put in a transaction to avoid outages caused by lock starvation.

Logic Diagram

Transactions & ATOMIC Operations

Transactions are mechanisms that allow ACID DB’s (PSQL, SQL, latest MongoDB, etc.) to be ATOMIC (the A in ACID). This means all writes to the DB are done as a single operation — no matter how complex. If the write succeeds, all the changes to the DB persist and are available to all connections simultaneously (no partial writes). If the write fails, all the changes are rolled back — again, there are no partial changes.

Transactions guarantee these rules:

These rules allow the user of the DB to bundle complex operations together and execute them as one operation. For example, executing a transfer from one bank account to another; if we did not have transactions, then in the middle of these two write operations, there could be a withdrawal or even a close account operation that makes the transfer invalid. Transactions allow the user some form of traffic control. We can block all other conflicting operations while the transaction is in progress.

The operations are blocked through a series of locks on tables and rows. In PSQL and other SQL variants, transactions are created with BEGIN; then the locks are acquired when an operation like select/insert/delete/alter is run and the transactions end with COMMIT or ROLLBACK. The locks are released when COMMIT or ROLLBACK is executed. Fortunately, Django allows us to create transactions without having to use these three statements (but we still need to worry about the locks; more on that in the next post).

-- Start a new transaction
BEGIN;
SELECT …
INSERT INTO …
UPDATE …
DELETE
FROM …
-- Save the changes
COMMIT;

Transactions in Django

from django.db import transaction
from app.core.models import Accounts, Fee, NotificationJob

def do_migration():
  overdrawn_accounts = Accounts.objects.filter(type='overdrawn')
  for acount in overdrawn_accounts:
    create_notification(acount)


@transaction.atomic
def create_notification(acount: Accounts):
  # $5 fee for overdraft - 500 because we never store money as float!!!
  recall = Fee.objects.create(acount=acount, description='Fee for overdraft', amount=500)
  NotificationJob.objects.create(recall=recall, notification_type='all')
  acount.status = 'awaiting_payment'
  acount.save()


def do_migration2():
  overdawn_account = Accounts.objects.filter(type='overdrawn')
  for account in overdawn_account:
    with transaction.atomic():
      recall = Fee.objects.create(acount=account, description='Fee for overdraft', amount=500)
      NotificationJob.objects.create(recall=recall, notification_type='all')
      account.status = 'awaiting_payment'
      account.save()

Django automatically wraps every DB operation in a transaction for us when run in autocommit mode (more on that below). Explicit transactions can be created using the atomic() decorator or context manager (with atomic()) — when atomic() is used, individual DB operations are NOT wrapped in a transaction or committed immediately to the DB (they are executed, but the changes to the DB are not visible to other users). Instead, the entire function is wrapped in a transaction block and committed at the end (if no errors are raised) — COMMIT is executed.

If there are errors, the DB changes are rolled back — ROLLBACK is executed. This is why transactions hold onto locks until the very end; we don’t want to release a lock on a table, have another connection change the table or read it, and then be forced to roll back what was just changed or read.

ANGRY_CUSTOMER_THRESHOLD = 3
@transaction.atomic
def create_notification(acount: Accounts):
  recall = Fee.objects.create(acount=acount, description='Fee for overdraft', amount=500)
  NotificationJob.objects.create(recall=recall, notification_type='all')

  try:
    # when this completes successfully, the changes will be available to the outer transaction
    with transaction.atomic():
      owner = acount.owner
      fees = Fee.objects.filter(owner=owner).count()
      if fees >= ANGRY_CUSTOMER_THRESHOLD:
        for fee in fees:
          fee.amount = 0
          fee.save()
        owner.status = 'angry'
        owner.save()

  # as long as we catch the error, the outer transaction will not be rolled back
  except Exception as e:
    logger.error(f'Error while removings fees for account {acount.id}: {e}')

  acount.status = 'awaiting_payment'
  acount.save()

We can also nest transactions by calling another function that is wrapped by atomic() or using the context manager within a transaction. This allows us to create savepoints where we can attempt risky operations without affecting the rest of the transaction — when an inner transaction is detected, a save point is created, the inner transaction is executed, and if the inner transaction fails, the data is rolled back to the save point and the outer transaction continues. If the outer transaction fails, all the inner transactions are rolled back alongside the outer transaction. Transaction nesting can be prevented by setting durable=True in atomic() — this will cause the transaction to raise a RuntimeError if any inner transactions are detected.

What you need to remember about nested transactions:

Django Autocommit Mode — Turning Off Automatic Transactions

DATABASES = {
  'default': {
    # True by default 
    'AUTOCOMMIT': False,
    # ...rest of configs
  }
}

By default, Django runs in autocommit mode, meaning every DB operation is wrapped in a transaction and immediately run (committed, hence autocommit) except if the operation is explicitly wrapped in a transaction by the user. This obfuscates the transaction logic of the underlying DB — some DBs, like PSQL, automatically wrap all operations in transactions while others don’t. If we have Django doing this for us, we don’t need to worry about the underlying DB logic.

We can turn off autocommit mode, but that would leave us relying on the DB to manage operations and ensure they are ATOMIC, which is risky and inconvenient for developers. It would not be fun if, while creating an API, we had to figure out which write operations are in transactions and, therefore, will be written in total and not partially.

However, we may want to turn off autocommit mode if we know the number of connections on the DB and the sequence of operations they are executing. If we figure out how to de-conflict these operations, then we can turn off autocommit mode and gain some efficiencies:

The downsides are the higher risks of data corruption. It is not worth it, but maybe there is a use case that I can’t think of.

Sources