In my last article, we discussed the exponential growth of events in today’s data-driven world. With so many apps, smart devices, and machines all around us, the amount of data created is massive. We also explored how an orchestration platform can help deliver these events to the right applications. However, delivering events may not be enough for businesses to make an impact.

By analyzing these events to understand user behavior, businesses can make smarter decisions and serve their customers better. A real-time analytics platform can help convert event data into meaningful intelligence.

This article explores how to build a real-time analytics platform on AWS, evaluates possible solutions, and provides a step-by-step guide to implementing a scalable and reliable platform. Building this platform involves three steps: ingesting, processing, and querying data. Real-time analytics often focuses on trends and patterns over time, whether in user behavior or system performance.

Time-series data naturally organizes events in sequence, making it easy to analyze how the data changes from moment to moment. Time-series storage fits this need well, allowing applications to compute metrics over time windows. AWS offers services such as SQS, Lambda, Timestream, and QuickSight that work seamlessly together to build this platform.
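To make the idea of computing metrics over time concrete, here is a minimal sketch in plain Python that groups events into one-minute buckets and counts them per event type. The input shape (tuples of epoch seconds and event type) and the `order_failed` event type are assumptions made for illustration; a time-series database performs this kind of bucketing for you at query time.

```python
from collections import Counter


def count_per_minute(events):
    """Count events per (minute_start, event_type) bucket.

    `events` is an iterable of (epoch_seconds, event_type) tuples.
    This shape is an assumption made for the example.
    """
    counts = Counter()
    for epoch_seconds, event_type in events:
        # Truncate the timestamp to the start of its minute.
        minute_start = epoch_seconds - (epoch_seconds % 60)
        counts[(minute_start, event_type)] += 1
    return dict(counts)


# Hypothetical sample: two successes in one minute, one failure in the next.
sample = [
    (1700000000, "order_success"),
    (1700000030, "order_success"),
    (1700000065, "order_failed"),
]
print(count_per_minute(sample))
```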

There are three major parts involved in building a real-time analytics platform: data ingestion, processing, and querying.

Timestream

Amazon Timestream, AWS’s time-series database, is designed to meet the challenges of processing and analyzing vast amounts of data efficiently. Timestream is serverless, scalable, and ideal for applications requiring real-time data analytics. Its key features include:

Implementation

The CloudFormation (CFN) template for all the required resources can be found in the GitHub repo.

Below is the snippet from the CFN template that creates the Timestream database and table:

  # Timestream database to store event data
  EventsDatabase:
    Type: 'AWS::Timestream::Database'
    Properties:
      DatabaseName: !Ref EventsDatabaseName
      KmsKeyId: alias/aws/timestream

  # Timestream table that stores event metrics
  EventsTable:
    Type: 'AWS::Timestream::Table'
    DependsOn: EventsDatabase
    Properties:
      DatabaseName: !Ref EventsDatabase
      TableName: !Ref EventsTableName
      RetentionProperties:
        # Recent data stays in the memory store for 72 hours
        MemoryStoreRetentionPeriodInHours: 72
        # Older data is kept in the magnetic store for one year
        MagneticStoreRetentionPeriodInDays: 365
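Once the table exists, it can be queried with Timestream's SQL dialect. The sketch below builds one possible query that sums the metric per event type in five-minute bins over the last hour. It assumes that `event_type` is stored as a dimension and that the metric is written as a single DOUBLE measure (so it appears in the `measure_value::double` column); the database and table names in the usage line are placeholders, not values from the template.

```python
def build_event_count_query(database, table, window="1h", bin_size="5m"):
    """Return a Timestream SQL string that sums the measure per event type and time bin.

    Assumes `event_type` is a dimension and the measure type is DOUBLE.
    """
    return (
        f"SELECT bin(time, {bin_size}) AS binned_time, event_type, "
        f"SUM(measure_value::double) AS total "
        f'FROM "{database}"."{table}" '
        f"WHERE time >= ago({window}) "
        f"GROUP BY bin(time, {bin_size}), event_type "
        f"ORDER BY binned_time"
    )


# Placeholder names; substitute the values passed to the CFN template.
print(build_event_count_query("events_db", "events_table"))
```

The resulting string could be passed as `QueryString` to the `query` call of the boto3 `timestream-query` client, or adapted for use in a dashboard.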

Testing

Services can publish events to SQS in the following format, which triggers the whole processing flow:

{
  "order_id": "test-order-1",
  "customer_id": "test-customer-1",
  "event_type": "order_success",
  "metric_value": 1
}
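As an illustration of the processing step, here is a minimal sketch of how a Lambda function might map the SQS batch it receives to Timestream records. This is not the code from the repository. The choice of dimensions, the DOUBLE measure type, and the use of processing time as the record timestamp are all assumptions, and the function names are hypothetical.

```python
import json
import time


def to_timestream_record(event, now_ms=None):
    """Map one event payload to a single-measure Timestream record dict."""
    if now_ms is None:
        # Assumption: use processing time, since the payload has no timestamp.
        now_ms = int(time.time() * 1000)
    return {
        # Assumption: every identifying field is stored as a dimension.
        "Dimensions": [
            {"Name": "order_id", "Value": str(event["order_id"])},
            {"Name": "customer_id", "Value": str(event["customer_id"])},
            {"Name": "event_type", "Value": str(event["event_type"])},
        ],
        "MeasureName": "metric_value",
        "MeasureValue": str(event["metric_value"]),  # Timestream expects strings
        "MeasureValueType": "DOUBLE",
        "Time": str(now_ms),
        "TimeUnit": "MILLISECONDS",
    }


def records_from_sqs(sqs_event, now_ms=None):
    """Build Timestream records from the SQS batch that Lambda receives."""
    return [
        to_timestream_record(json.loads(message["body"]), now_ms)
        for message in sqs_event.get("Records", [])
    ]


# Simulate the Lambda input for the sample event above.
sample_event = {
    "order_id": "test-order-1",
    "customer_id": "test-customer-1",
    "event_type": "order_success",
    "metric_value": 1,
}
batch = {"Records": [{"body": json.dumps(sample_event)}]}
print(records_from_sqs(batch, now_ms=1700000000000))
```

In a real handler, the resulting list could be passed as `Records` to the `write_records` call of the boto3 `timestream-write` client, together with the database and table names. Keeping the mapping separate from the AWS call makes it easy to test locally.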

Conclusion

This architecture offers a simple and efficient way to build a scalable and reliable analytics platform. Depending on specific needs, there are alternatives, including Amazon Kinesis Data Streams for event processing, Prometheus as the data store, and S3 with Athena for batch processing and analytics.