This article serves as a logical continuation of my previous piece, "The Internet of Things: Humanity’s New Nervous System".

Building upon the conceptual framework established earlier, I want to pivot from the top-level vision to a more granular, component-level deep dive into this transformative technology. The core objective this time is to address a critical, high-stakes decision faced by every software architect and product manager in the space:

"What is the optimal path for a new IoT project: building a proprietary solution from the ground up, or strategically leveraging the robust, existing offerings on the market?"

Main Idea and Architecture

To truly appreciate the gravity of the build-vs-buy decision, let's ground our discussion in a tangible, high-stakes scenario—the modern oil and gas industry.

Imagine us as Texas oil magnates. We operate two hundred pumpjacks scattered across the vast Texan landscape, steadily extracting crude oil day after day. Central to our operation are the flow meters at each wellhead, which provide critical data on volumetric output. Monitoring these readings is paramount for process regulation and maximizing yield.

Historically, this has been a labor-intensive effort: specialized field crews, driving pickups from site to site, manually logging readings and making on-the-spot operational decisions.

Now, consider a sudden, significant expansion—the successful acquisition of an additional 100 wells. While this is a financial boon, it presents an immediate logistical crisis. Our existing field crews are already operating at capacity; integrating another 100 wells into their manual rounds is impossible, and the overhead cost of hiring and training new, large teams significantly erodes the profit margin.

This is the precise moment when technology intervenes, making the business case for IIoT (Industrial Internet of Things) undeniable.

The solution is clear: Intelligently augment the flow meter. By enabling the wellhead device not just to measure, but to continuously and securely transmit its data stream back to a central command center, we completely transform our operating model.

The core benefit is the shift from routine, inefficient site visits to data-driven, predictive dispatch. Our field crews are no longer reactive surveyors; they become strategic troubleshooters, directed only to the sites where the telemetry indicates a true need for intervention. This small, yet incredibly powerful, deployment represents the essence of modern IIoT.

But before delving into the architecture, let's briefly recall the main components of a successful IoT infrastructure described in the previous article.

These four pillars form the essential pipeline that transforms raw field data into actionable business intelligence:

Well then, let's get started!

Sensing Level

Covering the Device Layer (Sensing Layer) - equipping our pumpjacks with smart flow meters - is relatively straightforward. The modern market is saturated with suitable Commercial Off-the-Shelf (COTS) solutions, ranging from hardened industrial-grade sensors to highly customizable, cost-effective options like a Raspberry Pi augmented with an integrated GSM/LTE module. We can refer to this edge node simply as the "Transmitter".
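To make the Transmitter concrete, here is a minimal sketch of how such an edge node might package a flow-meter reading before sending it upstream. The field names and device ID scheme are illustrative assumptions, not a standard:

```python
import json
import time
from typing import Optional

def build_telemetry(device_id: str, flow_bbl_d: float,
                    ts: Optional[int] = None) -> bytes:
    """Package one flow-meter reading as compact JSON, ready for the
    transmitter to publish over its GSM/LTE link. The schema here is
    purely illustrative."""
    record = {
        "device_id": device_id,
        "flow_bbl_d": round(flow_bbl_d, 2),
        "ts": ts if ts is not None else int(time.time()),
    }
    # MQTT payloads are raw bytes, so we encode the JSON up front.
    return json.dumps(record, separators=(",", ":")).encode("utf-8")

payload = build_telemetry("well-042", 118.377, ts=1700000000)
```

Keeping the payload compact matters on a metered cellular link, which is why the sketch strips whitespace from the JSON.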

What about the "Network" level? This is where we approach the core nervous system of the IoT, its synapses.

IoT Synapses (Network)

There is a vast array of methods for transmitting data from a smart sensor to the central storage and processing module. However, the most popular and robust solution, by far, remains message queuing. This preference is intrinsically linked to the nature of IoT: it is fundamentally a continuous stream of data, or telemetry. The event-driven architectural approach is ideally suited for this workload:

As you might already surmise, microservices are the second critical component in the modern IoT world. This architectural style adds essential flexibility and scalability to handle the varying loads and complex processing needs of millions of devices. Still, we'll dive into that topic in detail a bit later.

“So, which queuing solutions are best suited for these needs?”

The truth is, nearly any reliable message queue can fulfill this fundamental requirement. The most common and powerful contenders in the IoT space include:

However, the reality of the edge layer is that we are talking about smart, yet resource-constrained devices with limited processing power, memory, and bandwidth. Working directly with the "heavyweight" message queues we discussed earlier (like Kafka or standard RabbitMQ connections) can be overly challenging, draining the device's battery and resources.

This is where a special class of messaging protocols comes into play—protocols that, while perhaps not exclusively developed for IoT, have found their most widespread and impactful application within the technology.

The most prominent and successful example is MQTT (Message Queuing Telemetry Transport).

MQTT

MQTT is a lightweight, open-standard messaging protocol built on the Publish-Subscribe (Pub/Sub) pattern.

The initial version was published in 1999 by Andy Stanford-Clark from IBM and Arlen Nipper from Cirrus Link. They conceived MQTT as a method for maintaining reliable machine-to-machine (M2M) communication over networks with limited bandwidth or unpredictable connectivity.

This focus on constraint is critical: one of its very first use cases involved ensuring continuous contact between segments of an oil pipeline and central command links via satellite—a true testament to its reliability in hostile, low-resource environments.

The fundamental difference between MQTT and the standard web paradigm (like HTTP, which uses a direct Request/Response model) is architectural: devices do not communicate directly with one another.
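That indirection can be sketched in a few lines. The in-memory broker below is not MQTT, but it captures the Pub/Sub decoupling: publishers and subscribers only ever see the broker and a topic name, never each other.

```python
from collections import defaultdict

class TinyBroker:
    """An in-memory sketch of the Pub/Sub pattern MQTT is built on."""
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, topic, callback):
        # Register a subscriber's callback for a topic.
        self._subs[topic].append(callback)

    def publish(self, topic, payload):
        # Fan the message out to every subscriber of this topic.
        for cb in self._subs[topic]:
            cb(topic, payload)

broker = TinyBroker()
received = []
broker.subscribe("wells/flow", lambda t, p: received.append((t, p)))
broker.publish("wells/flow", b"118.4")
```

The publisher above never learns who (if anyone) is listening, which is exactly what lets thousands of field devices and backend consumers evolve independently.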

There are only three main elements in an MQTT system:

Any data published to or received from an MQTT Broker is encoded in a binary format, as MQTT is a binary protocol. This means that to access the original, readable content - whether it's JSON, XML, or plain text - the message must first be correctly interpreted (decoded) by the subscribing client.

Notably, MQTT's built-in security is minimal: it supports simple username/password authentication, but transmits those credentials in cleartext. To protect transmitted information from interception, brokers layer the protocol over TLS (SSL).

As a result, our scheme smoothly transforms into:

Or we can use one topic for all devices, simply by adding the device ID to the message:
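Either topic scheme is straightforward to handle in code. A small sketch (the "wells/<id>/flow" naming is a hypothetical convention, not part of MQTT) showing how a subscriber recovers the device ID from a per-device topic:

```python
def device_id_from_topic(topic: str) -> str:
    """Extract the device ID from a per-device topic such as
    'wells/well-042/flow' (hypothetical naming convention)."""
    parts = topic.split("/")
    if len(parts) != 3 or parts[0] != "wells":
        raise ValueError(f"unexpected topic: {topic}")
    return parts[1]

# With per-device topics, a single wildcard subscription 'wells/+/flow'
# matches every well, and the ID travels in the topic itself.
well = device_id_from_topic("wells/well-042/flow")
```

With the single-topic alternative, the subscriber instead reads the device ID out of the decoded message body; the trade-off is coarser access control and no per-device wildcard filtering.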

Data Processing and Storage

So, we have established our architecture:

As you correctly surmised, the Broker level is precisely where we cross the boundary from the constrained IoT sensor layer to the unconstrained data processing layer. In the processing layer, our architectural imagination is only limited by our financial budget and engineering skills.

It is now time to learn how to consume this data from the other side of the Broker.

The first problem we immediately encounter is a significant limitation of MQTT: it offers no long-term data persistence. Aside from a single optional "retained" message per topic and short-lived session queues for disconnected subscribers, the broker stores nothing. In simpler terms: if no Subscriber is actively listening when a Publisher sends a message, that message is sent into the void and lost forever.

To solve this critical gap, a common and highly effective pattern is employed: a dedicated, lightweight, and fast microservice is deployed as the primary Subscriber for all topics. This dedicated microservice acts as the crucial bridge between the low-power MQTT world and the high-throughput processing world.
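A minimal sketch of that bridge, using an in-process queue as a stand-in for the real durable queue (in production, a library like paho-mqtt would supply the MQTT client, and RabbitMQ or Kafka the durable queue; the wiring is omitted here):

```python
import json
import queue

# Stand-in for the downstream AMQP queue (RabbitMQ/Kafka in production).
downstream: "queue.Queue[dict]" = queue.Queue()

def on_mqtt_message(topic: str, payload: bytes) -> None:
    """Bridge callback: decode the sensor payload and hand it off to the
    durable queue so readings survive even if consumers lag behind.
    With paho-mqtt, this would be registered as the client's
    message handler."""
    record = json.loads(payload.decode("utf-8"))
    record["topic"] = topic  # keep provenance for downstream consumers
    downstream.put(record)

on_mqtt_message("wells/well-042/flow", b'{"flow_bbl_d": 118.4}')
```

The key design point is that this service does almost nothing: it only decodes and forwards, so it can keep up with the full firehose of device traffic while heavier processing happens behind the durable queue.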

“Do you remember how I mentioned the Event-Driven Design philosophy earlier? What drives our events now?”

“Another Message Queue!”

This time, we have zero constraints. The data is now off the constrained network and residing on robust cloud/server infrastructure. We can now select any robust AMQP (Advanced Message Queuing Protocol) solution, such as RabbitMQ or the more heavy-duty Apache Kafka, to handle persistence, complex routing, and scaling across multiple consumer microservices.

As a result, our architecture seamlessly acquires all the main components:

When it comes to the Total Cost of Ownership (TCO) of such a self-hosted solution, a machine with 4 vCPU, 16 GB RAM, and 100 GB of storage will be sufficient, costing us $100-200 per month. Of course, you could run your own on-premises hardware for greater savings in the long run.

Failure point analysis

Now, let's step back and critically review the resulting architecture to identify the most critical components that could bring the entire system down.

It's clear, even to the naked eye, that the MQTT Broker is the most vulnerable and critical component.

Not only does it serve as the essential bridge between the constrained sensor layer and the powerful data processing layer, but a self-hosted broker also introduces significant limitations:

When dealing with a few dozen connected devices, setting up your own broker and gradually patching these limitations might be feasible.

However, since we are major oil producers with, say, 300 wellheads and ambitious long-term digitalization plans, we need to consider more robust alternatives. The smart move is to look toward major cloud providers and the highly available, fully managed IoT services they offer.

Cloud IoT Solutions

So what solutions do today's leading cloud providers offer us?

AWS IoT Core

The AWS platform offers a formidable suite of solutions tailored specifically for the Internet of Things (IoT) domain, providing the necessary tools for architecting, deploying, and managing large-scale device fleets.

AWS breaks down the key IoT challenges into highly focused, managed services:

For our current task - architecting a robust and scalable IoT solution - we will primarily focus on AWS IoT Core. This service is a game-changer; it virtually eliminates much of the inherent boilerplate and complexity associated with secure device connectivity.

AWS IoT Core acts as a comprehensive Registry of "Things" with a vast array of surrounding functionality. When a new device (a "Thing") is added to the Registry, the service automatically takes care of critical security steps:

The "cherry on top" for developers is the streamlined onboarding process. Upon "Thing" creation, AWS IoT Core provides a nearly ready-to-use and fully configured package for our device's transmitter or client code. With support for a wide selection of programming languages and environments via the AWS SDKs, this capability significantly accelerates the initial development and integration process.

In addition to connectivity, every created "Thing" can have a virtual representation known as a "Device Shadow".

A Device Shadow enables your application to work with a virtual representation of the device even when the physical device is not connected to the cloud. You can retrieve the last reported state, fetch various data points, and create **delayed commands** that will be automatically delivered to the device once it reconnects. This is crucial for maintaining a responsive user experience and reliable device management, regardless of connectivity status.
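The core of the shadow mechanism is the "delta": the portion of the desired state the device has not yet confirmed. A simplified, flat-state sketch of that computation (real AWS shadow documents are nested and versioned):

```python
def shadow_delta(desired: dict, reported: dict) -> dict:
    """Compute the delta a device shadow resolves: desired settings
    the physical device has not yet reported back. Simplified,
    flat-state sketch of the shadow semantics."""
    return {k: v for k, v in desired.items()
            if reported.get(k) != v}

# An operator changed the sampling interval while the device was offline;
# on reconnect, only the unconfirmed setting is delivered.
desired = {"sample_interval_s": 60, "pump_enabled": True}
reported = {"sample_interval_s": 300, "pump_enabled": True}
delta = shadow_delta(desired, reported)
```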

By modifying our architecture, we get:

And if we go even deeper, then, in essence, the entire "Data Processing and Storage" level can be moved to cloud services and even made almost completely serverless. To do this, we'll do the following:

We ultimately achieve an architecture that is not only robust but also highly optimized for cost and operations:

While our serverless architecture offers immense benefits in scalability and operational efficiency, it is not without its challenges:

Overall, within the framework of our oil mission, we can safely accept all these risks, and careful service configuration can keep costs under control at high volumes.

Total Estimated Cost of Ownership (TCO)

“So, how much will our AWS solution cost us per month?”

Let's expect about 3 million messages per month from 300 wells, since we don't need to send telemetry very frequently.
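It is worth sanity-checking what that volume means per device; assuming roughly 30.44 days per month:

```python
# 3 million messages per month across 300 wells works out to one
# reading every few minutes per well - a very relaxed telemetry rate.
wells = 300
msgs_per_month = 3_000_000

per_well = msgs_per_month // wells           # messages per well per month
minutes_per_month = 30.44 * 24 * 60          # ~43,834 minutes
interval_min = minutes_per_month / per_well  # minutes between readings
```

So each well reports roughly every four to five minutes, which is more than enough resolution for flow monitoring and leaves ample headroom in every pricing tier below.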

AWS IoT Core

| Metric | Monthly Volume | Price per Unit | Free Tier | Calculation | Cost |
|---|---|---|---|---|---|
| Connectivity | 300 devices | $0.01 per device | - | $0.01 x 300 | $3.00 |
| Messaging (Inbound) | 3 million msgs | $2.30 per million | - | $2.30 x 3 | $6.90 |
| Rules (Triggered) | 3 million rules | $0.15 per million | - | $0.15 x 3 | $0.45 |
| Rules (Actioned) | 3 million actions | $0.15 per million | - | $0.15 x 3 | $0.45 |
| **Total** | | | | | **$10.80** |

AWS Lambda (Data Save Service)

| Metric | Monthly Volume | Price per Unit | Free Tier | Calculation | Cost |
|---|---|---|---|---|---|
| Requests (Invocations) | 3 million | $0.20 per million | 1M free | $0.20 x (3 - 1) | $0.40 |
| Duration (assumption: 128 MB, 100 ms per invocation) | 3M x 0.1 s x 0.125 GB ≈ 37.5k GB-s | $0.0000166667 per GB-s | 400k GB-s free | 0 | $0.00 |
| **Total** | | | | | **$0.40** |

AWS SQS

Total Requests: 3M (SendMessage) + 3M (ReceiveMessage) = 6M

| Metric | Monthly Volume | Price per Unit | Free Tier | Calculation | Cost |
|---|---|---|---|---|---|
| Requests (API Requests) | 6 million | $0.40 per million | 1M free | $0.40 x (6 - 1) | $2.00 |
| **Total** | | | | | **$2.00** |

AWS RDS

Assumption: You are using a small, entry-level instance for development or light workload: db.t4g.micro (PostgreSQL/MySQL, On-Demand) in US East (N. Virginia), 20 GB of GP3 storage with 3000 IOPS.

| Metric | Monthly Volume | Price per Unit | Free Tier | Calculation | Cost |
|---|---|---|---|---|---|
| db.t4g.micro Instance | 730 hours | $0.016 per hour | - | $0.016 x 730 | $11.68 |
| GP3 Storage | 20 GB | $0.115 per GB-mo | - | $0.115 x 20 | $2.30 |
| IOPS | 3,000 | Free (3,000 included with GP3) | - | 0 | $0.00 |
| **Total (1 instance)** | | | | | **$13.98** |

AWS ECS (Data Analytics Service)

Assumption: Using AWS Fargate for the analytics container with 1 vCPU and 2 GB of memory. A month has approximately 24 hours/day x 30.44 days/month ≈ 730 hours.

| Metric | Monthly Volume | Price per Unit | Calculation | Cost |
|---|---|---|---|---|
| vCPU Hours | 730 hours | $0.04048 per vCPU-hour | 1 x 730 x $0.04048 | $29.55 |
| Memory Hours | 2 GB x 730 hours | $0.004445 per GB-hour | 2 x 730 x $0.004445 | $6.50 |
| **Total** | | | | **$36.05** |

Total Estimated Cost of Ownership

| Component | Estimated Monthly Cost |
|---|---|
| AWS IoT Core | $10.80 |
| AWS Lambda | $0.40 |
| AWS SQS | $2.00 |
| AWS RDS (2 x db.t4g.micro) | $27.96 |
| AWS ECS (Fargate) | $36.05 |
| **Total** | **$77.21** |

Microsoft Azure IoT Hub

Azure IoT Hub is a managed service that acts as a central message hub in your cloud-based Internet of Things (IoT) solution. It provides reliable, secure, and scalable communication between your IoT application and the devices connected to it. Virtually any device can be connected to IoT Hub.

The service supports several messaging patterns, including device-to-cloud telemetry, file uploads from devices, and request-reply methods for device management. IoT Hub also supports monitoring capabilities, which help you track device creation, connectivity, and failures.

IoT Hub is built to scale up to millions of simultaneously connected devices and millions of events per second to support demanding IoT workloads.

Similar to AWS, Microsoft provides support for digital twins (or "shadows"), allowing for a digital representation of your physical device in the cloud. Just like AWS IoT Core, Azure supports multiple communication protocols, such as MQTT, AMQP, and HTTPS.

In our specific case, we are primarily interested in MQTT.

It is worth noting that while the official documentation states that IoT Hub is not a full-fledged MQTT broker and recommends using Azure Event Grid for more comprehensive MQTT features, the functionality that IoT Hub provides is perfectly sufficient for the scope of our current project.

So, let’s replace the MQTT Broker in our architecture with Azure IoT Hub:

A particularly appealing feature of Azure IoT Hub is the flexibility it offers in device authentication. While it supports the standard practice of using X.509 Certificates (which, like AWS, it can help manage and generate), it also offers a pragmatic alternative: Shared Access Signature (SAS) tokens. This gives developers more options for managing device security, especially in resource-constrained or heterogeneous device environments.
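For illustration, here is a sketch of generating such a SAS token following the documented format (an HMAC-SHA256 signature over the URL-encoded resource URI plus an expiry timestamp, signed with the device's base64 key); the hub and device names below are hypothetical:

```python
import base64
import hashlib
import hmac
import time
import urllib.parse

def generate_sas_token(resource_uri: str, device_key_b64: str,
                       ttl_s: int = 3600) -> str:
    """Sketch of an Azure IoT Hub SAS token per the documented format:
    sign '<url-encoded uri>\n<expiry>' with the base64-decoded device key."""
    expiry = int(time.time()) + ttl_s
    uri = urllib.parse.quote(resource_uri, safe="")
    to_sign = f"{uri}\n{expiry}".encode("utf-8")
    key = base64.b64decode(device_key_b64)
    sig = base64.b64encode(hmac.new(key, to_sign, hashlib.sha256).digest())
    return (f"SharedAccessSignature sr={uri}"
            f"&sig={urllib.parse.quote(sig, safe='')}&se={expiry}")

token = generate_sas_token(
    "myhub.azure-devices.net/devices/well-042",  # hypothetical hub/device
    base64.b64encode(b"dummy-device-key").decode())
```

Because the token is derived and expiring, a compromised transmitter leaks at most a short-lived credential rather than a long-term secret, which is the pragmatic appeal over managing per-device X.509 certificates.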

And just like AWS, we can make our entire cloud component almost completely serverless:

Total Estimated Cost of Ownership

Just like in the previous example, we'll take 300 wells that generate about 3 million messages per month.

Azure IoT Hub

Azure pricing is tiered. For our purposes, Standard S1 (400,000 messages/day per unit and up to 500,000 devices) is sufficient: our 3 million messages per month average roughly 100,000 per day, well within a single unit's quota. The monthly cost is $25.00.

Azure Functions (Data Save Service)

| Metric | Monthly Volume | Price per Unit | Free Tier | Calculation | Cost |
|---|---|---|---|---|---|
| Requests (Invocations) | 3 million | $0.40 per million | 250k free | $0.40 x (3 - 0.25) | $1.10 |
| Duration (assumption: 128 MB, 100 ms per invocation) | 3M x 0.1 s x 0.125 GB ≈ 37.5k GB-s | $0.000026 per GB-s | 100k GB-s free | 0 | $0.00 |
| **Total** | | | | | **$1.10** |

Azure Service Bus (ASB)

Like IoT Hub, Service Bus is priced in tiers. The Standard tier, at $5.00 per month, suits us perfectly, as it includes 12.5 million operations.

Azure SQL Database (Data Storage)

| Metric | Monthly Volume | Price per Unit | Free Tier | Calculation | Cost |
|---|---|---|---|---|---|
| vCore | 730 hours | $0.05825 per hour | - | $0.05825 x 1 x 730 | $42.52 |
| Storage | 20 GB | $0.115 per GB-month | - | $0.115 x 20 | $2.30 |
| **Total (1 instance)** | | | | | **$44.82** |

Azure Container Apps (Data Analytics Service)

| Metric | Monthly Volume | Price per Unit | Free Tier | Calculation | Cost |
|---|---|---|---|---|---|
| vCPU | 730 hours | $0.0571 per vCPU-hour | - | 1 x 730 x $0.0571 | $41.68 |
| Memory | 2 GiB x 730 hours | $0.0050 per GiB-hour | - | 2 x 730 x $0.0050 | $7.30 |
| **Total** | | | | | **$48.98** |

Total Estimated Cost of Ownership

| Component | Estimated Monthly Cost |
|---|---|
| Azure IoT Hub | $25.00 |
| Azure Functions | $1.10 |
| Azure Service Bus | $5.00 |
| Azure SQL Database (2 instances) | $89.64 |
| Azure Container Apps | $48.98 |
| **Total** | **$169.72** |

Conclusions

So what conclusions can we draw from this article? Of course, any architectural solution is limited only by our imagination and experience: we can endlessly optimize and improve, replacing some components with others. What matters is that the result keeps us happy oil producers who strive for operational excellence.

Our analysis showed that a managed cloud solution is the best fit for the mission-critical, scalable Industrial Internet of Things (IIoT).

We clearly saw that the perceived savings and low initial costs of in-house development using a standard MQTT broker lead to huge and unjustified risks:

For any serious IIoT project that must operate 24/7, scale, and deliver business value, using managed cloud platforms is the path to success.

By choosing this, you're not just buying a service; you're buying time, reliability, and security. Instead of spending resources on reinventing the wheel, we can focus on what is most important: developing predictive models and applications that directly generate profit for the company.

The choice of cloud solutions is a strategic decision that allows a business to remain focused on its core competencies, rather than on infrastructure management.

It is important to understand that a cloud solution is not a silver bullet. The recent discontinuation of Google Cloud IoT Core in 2023 serves as additional evidence of the complexity in managing an IoT hub. Even a giant like Google acknowledged the expediency of delegating this task to specialized partners (HiveMQ, EMQX, ClearBlade), which further confirms the risks associated with an unproven or self-developed solution.

But that's another story entirely.