USSD (Unstructured Supplementary Service Data) is one of the most practical ways to deliver digital healthcare services in places where smartphones, mobile data, and reliable internet access aren’t always available. Across Africa, it serves as the backbone for everyday healthcare interactions such as booking appointments, receiving follow-ups, checking insurance eligibility, and completing basic health screenings on simple feature phones.

From the user’s side, USSD feels straightforward, but behind the scenes, building systems that handle these interactions at scale is anything but simple. Sessions expire quickly, responses must be almost immediate, and the system has to deal with unreliable network behavior coming from telecom operators. A small architectural mistake can easily translate into dropped sessions, incomplete flows, or a system that works fine in testing but struggles when traffic increases.

This article focuses on the architectural thinking behind building high-volume USSD systems for healthcare scenarios, and on the practical engineering patterns that make these systems more resilient and easier to maintain in production.

Understanding the Core Constraints of USSD

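Before discussing architecture, it’s important to understand the constraints that shape every technical decision in USSD systems: sessions expire quickly, responses must be almost immediate, and telecom operator networks behave unreliably.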

These constraints demand an architecture that prioritizes speed, fault tolerance, and horizontal scalability rather than traditional synchronous request-response patterns.

High-Level Architecture Overview

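At a high level, the system is composed of a USSD gateway layer, a flow orchestration layer, an event and state layer, and a set of backend services.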

Each component has a clearly defined responsibility. This prevents tight coupling and allows different layers of the system to scale independently.

USSD Gateway Layer

The gateway acts as the entry point into the system. It manages session lifecycle events, translates telecom protocols into something the backend can understand, and sends responses back to the user.

From an architectural standpoint, the backend never assumes delivery guarantees from the gateway. Every request is treated as potentially duplicated, delayed, or retried.

This assumption drives downstream design decisions such as:
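One such decision is idempotent request handling. The sketch below is illustrative only: it assumes the gateway passes a session identifier and that a step counter is tracked per session, and the names `DedupCache` and `handle_request` are hypothetical. A production system would likely use a shared store rather than process memory.

```python
import time
from dataclasses import dataclass, field


@dataclass
class DedupCache:
    """Remembers responses for recently seen request keys (in-memory, illustrative)."""
    ttl_seconds: float = 120.0
    _entries: dict = field(default_factory=dict)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if now - stored_at > self.ttl_seconds:
            del self._entries[key]  # expired: treat the key as unseen
            return None
        return response

    def put(self, key, response, now=None):
        now = time.monotonic() if now is None else now
        self._entries[key] = (now, response)


def handle_request(cache, session_id, step, user_input, process):
    """Run process() at most once per (session, step, input) within the TTL."""
    key = (session_id, step, user_input)
    cached = cache.get(key)
    if cached is not None:
        return cached  # duplicate or retry: replay the earlier response
    response = process(user_input)
    cache.put(key, response)
    return response
```

With this in place, a retried request replays the stored response instead of triggering the business logic a second time.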

Flow Orchestration Layer

In scalable USSD platforms, interaction flows should not be tightly coupled to backend application logic. Instead of hard-coding menus and decision paths directly into services, a dedicated flow orchestration layer is introduced to manage user interaction sequencing.

The orchestration engine is responsible for:

Separating interaction flows from backend business logic becomes increasingly important as deployments grow in size and complexity. This architectural boundary allows product and operations teams to modify user journeys without requiring changes to core application services. It also prevents complex backend processes from introducing latency into active USSD sessions.
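To make the boundary concrete, a flow can be expressed as data that the orchestrator interprets. The following is a minimal sketch under assumed names: the `FLOW` structure, its step names, the menu texts, and the `advance` function are all hypothetical and not taken from any particular platform.

```python
# Hypothetical flow definition: each step has a prompt and maps inputs to next steps.
FLOW = {
    "start": {
        "prompt": "1. Book appointment\n2. Check insurance",
        "next": {"1": "pick_day", "2": "insurance_id"},
    },
    "pick_day": {
        "prompt": "1. Today\n2. Tomorrow",
        "next": {"1": "done", "2": "done"},
    },
    "insurance_id": {
        "prompt": "Enter your member number:",
        "next": {"*": "done"},  # "*" accepts any input
    },
    "done": {"prompt": "Thank you.", "next": {}},
}


def advance(flow, current_step, user_input):
    """Return (next_step, prompt, is_final) for the user's input at current_step."""
    transitions = flow[current_step]["next"]
    next_step = transitions.get(user_input, transitions.get("*"))
    if next_step is None:
        # Invalid choice: stay on the same step and show its prompt again.
        next_step = current_step
    step = flow[next_step]
    return next_step, step["prompt"], not step["next"]
```

Because the journey lives in `FLOW` rather than in service code, changing a menu means editing data, not redeploying backend logic.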

When backend processing or data persistence is required, the orchestration layer emits events to downstream services asynchronously rather than performing blocking synchronous calls during the USSD session.

This design approach delivers several operational benefits:

While this separation introduces additional architectural components, the long-term gains in scalability, reliability, and maintainability typically outweigh the added complexity in high-volume USSD environments.

Event & State Layer

One of the most effective ways to keep USSD systems responsive is to avoid performing heavy business logic during active sessions. Instead, user interactions can trigger asynchronous events that downstream services process independently.

In this model, the system publishes interaction events to a messaging layer. Multiple domain services subscribe and process them. Shared state is updated asynchronously while the user receives a minimal response that keeps the session alive.
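The publish-and-return pattern can be sketched with an in-process stand-in for the messaging layer. Everything here is an assumption for illustration: `EventBus`, the `"interaction"` topic, the reply text, and the two example services are hypothetical, and a real deployment would use an external broker rather than `asyncio` queues.

```python
import asyncio


class EventBus:
    """In-process stand-in for a messaging layer (hypothetical; not durable)."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of asyncio.Queue

    def subscribe(self, topic):
        queue = asyncio.Queue()
        self._subscribers.setdefault(topic, []).append(queue)
        return queue

    def publish(self, topic, event):
        # Non-blocking: each subscriber receives its own copy of the event.
        for queue in self._subscribers.get(topic, []):
            queue.put_nowait(dict(event))


async def consumer(queue, handler):
    """Process events independently of the USSD session."""
    while True:
        event = await queue.get()
        try:
            await handler(event)
        finally:
            queue.task_done()


def handle_selection(bus, session_id, choice):
    """Session path: publish the event and return a minimal response immediately."""
    bus.publish("interaction", {"session_id": session_id, "choice": choice})
    return "Request received."


async def demo():
    bus = EventBus()
    booked, audited = [], []

    async def booking_service(event):
        await asyncio.sleep(0.01)  # simulated slow work, off the session path
        booked.append(event["session_id"])

    async def audit_service(event):
        audited.append(event["choice"])

    queues = [bus.subscribe("interaction"), bus.subscribe("interaction")]
    tasks = [
        asyncio.create_task(consumer(queues[0], booking_service)),
        asyncio.create_task(consumer(queues[1], audit_service)),
    ]
    reply = handle_selection(bus, "s1", "1")  # returns before any handler runs
    await asyncio.gather(*(q.join() for q in queues))  # wait for processing
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return reply, booked, audited
```

Running `asyncio.run(demo())` shows that the reply is produced without waiting for either service, while both subscribers still process the event.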

Different technologies can support this approach. Some teams use in-memory messaging layers for speed, while others rely on distributed queues or streaming platforms. Each option has trade-offs around durability, ordering guarantees, and operational complexity.

Despite these trade-offs, event-driven processing makes it easier to scale horizontally and prevents individual services from becoming bottlenecks.

Backend Services

Backend services are designed to be stateless and horizontally scalable:

This makes it possible to:

Handling Scale, Timeouts, and Failures

Session Timeouts

USSD sessions frequently expire mid-flow. To handle this:
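One common approach, sketched here as an assumption rather than a description of any specific platform, is to save progress after every completed step, keyed by the caller's phone number, so that a new session can resume where the expired one stopped. `ProgressStore`, `start_session`, and the ten-minute resume window are hypothetical choices.

```python
import time


class ProgressStore:
    """Keeps the last completed step per phone number (in-memory, illustrative)."""

    def __init__(self, resume_window_seconds=600):
        self.resume_window_seconds = resume_window_seconds
        self._progress = {}  # msisdn -> (step, answers, saved_at)

    def save(self, msisdn, step, answers, now=None):
        now = time.time() if now is None else now
        self._progress[msisdn] = (step, dict(answers), now)

    def resume_point(self, msisdn, now=None):
        """Return (step, answers) if a recent unfinished flow exists, else None."""
        now = time.time() if now is None else now
        entry = self._progress.get(msisdn)
        if entry is None:
            return None
        step, answers, saved_at = entry
        if now - saved_at > self.resume_window_seconds:
            del self._progress[msisdn]  # too old: start over
            return None
        return step, answers

    def clear(self, msisdn):
        """Call when a flow completes so the next session starts fresh."""
        self._progress.pop(msisdn, None)


def start_session(store, msisdn, now=None):
    """Called when a new USSD session opens for this phone number."""
    point = store.resume_point(msisdn, now=now)
    if point is None:
        return "start", {}
    return point
```

A user who dials back within the window lands on the step they had reached instead of repeating the whole flow.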

High Concurrency

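An in-memory messaging layer such as Redis Pub/Sub can fan out thousands of concurrent events with minimal overhead, while backend consumers scale horizontally based on demand. The trade-off is durability: Redis Pub/Sub does not persist messages, so a subscriber that is disconnected when an event is published will not receive it.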

Fault Isolation

Failures in one downstream service (for example, appointment booking) do not collapse the entire flow. The flow orchestrator continues managing the session while backend services retry or flag failures asynchronously.
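The retry-or-flag behavior on the consumer side might look like the sketch below. It is a simplified illustration: `process_with_retry`, the dead-letter list, and the backoff parameters are assumptions, not part of any described system.

```python
import time


def process_with_retry(event, handler, dead_letters,
                       max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Try handler(event) up to max_attempts times; park the event on final failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(event)
            return True
        except Exception as exc:  # broad on purpose: isolate any handler failure
            if attempt == max_attempts:
                # Flag the failure for later inspection instead of crashing.
                dead_letters.append({"event": event, "error": repr(exc)})
                return False
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    return False
```

Because this runs in the consumer and not in the session path, a failing booking service delays only its own events; the user-facing flow is unaffected.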

Observability & Error Tracking

In distributed, high-volume systems, failures are inevitable. What matters is how quickly they are detected and diagnosed. Engineers therefore need visibility into latency trends, session drop rates, and asynchronous processing failures.

Structured logging helps trace issues across distributed services. Distributed tracing allows teams to follow a user interaction through multiple asynchronous boundaries. Real-time alerts provide early warnings when performance starts to degrade.
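As a small illustration of structured logging across an asynchronous boundary, the sketch below emits JSON log lines that share a correlation identifier. The function name `log_event` and the field names are hypothetical; the assumption is that the session identifier travels inside each published event so that consumers can log it too.

```python
import json
import logging
import time


def log_event(logger, name, correlation_id, **fields):
    """Emit one JSON log line; correlation_id ties lines from different services together."""
    record = {"event": name, "correlation_id": correlation_id, "ts": time.time(), **fields}
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line


# Usage: the session handler and a downstream consumer log the same identifier.
gateway_log = logging.getLogger("ussd.gateway")
booking_log = logging.getLogger("ussd.booking")

log_event(gateway_log, "menu_rendered", "session-123", step="start", latency_ms=42)
log_event(booking_log, "booking_failed", "session-123", reason="timeout")
```

Searching the logs for one `correlation_id` then returns the full path of an interaction, including the work that happened after the session ended.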

Together, these practices help teams identify systemic issues early and improve reliability over time.

Why This Architecture Scales

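This architecture scales by combining asynchronous, event-driven processing, stateless and horizontally scalable backend services, a flow orchestration layer decoupled from business logic, and fault isolation between downstream services.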

Rather than treating USSD as a simple menu-driven channel, this approach treats it as a distributed systems problem, subject to the same rigor as high-volume web or fintech platforms.

Closing Thoughts

USSD continues to play an important role in expanding access to healthcare services in low-connectivity regions. Building reliable platforms for this channel requires more than basic menu design. It involves thoughtful system architecture, careful handling of session state, and a realistic understanding of telecom infrastructure constraints.

The patterns described here are technology-agnostic and can be implemented with different tools depending on team preferences and operational context. What matters most is designing systems that remain responsive under load and continue working even when networks behave unpredictably.

In many ways, USSD reminds engineers that impactful technology is not always about cutting-edge tools. Sometimes it is about building dependable systems that meet people where they are and continue working under real-world conditions.