This is a short series that I’ve wanted to share for a long time about the basics of “Cost Optimization” on AWS.

Let’s start this journey with DocumentDB!

Don’t hesitate to 👏 if you liked this post ;)

Okay, to be really honest, this title is clickbait.

I could definitely write something like “how I optimized the costs of our AWS infrastructure by following some common guidelines provided in the documentation” but it’s way less catchy, no?

Maybe some of you guys will already know these tricks and good practices.

If you’re looking straight for the checklist that I’m suggesting, scroll down to the Checklist section.

Understanding the tricky I/O cost of the DocumentDB service

If you look at their pricing page, it’s divided into 4 cost dimensions, which I summarize here:

So, what’s behind I/Os?

AWS explains that with the DocumentDB service, you don’t have to provision I/O resources in advance, which is kind of interesting, because you don’t have storage limitations and you can easily handle a peak of I/O operations. It seems fair, as you’re billed for your usage.

AWS describes in their documentation what counts as an I/O operation: mainly all operations like find, insert, update, and delete, plus some features like change streams and TTL (time to live) indexes.

Well, everything that will hit the storage volume will be billed to you.

Wait, what, $0.20 per million I/Os?

Let’s make AWS lose money, right now!

There’s a sentence in the AWS DocumentDB documentation that will catch your eye (and wallet 💸):

Once the data has been read from the storage volume and continues to reside in memory, subsequent reads of the same data do not incur additional I/Os.

This sentence is key to understanding what’s behind I/Os.
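To get a feel for what that means for the bill, here is a back-of-the-envelope sketch in Python. It uses the $0.20 per million I/Os figure quoted in this post, and it assumes (a big simplification of mine, not an AWS formula) that each read that misses memory costs exactly one billed I/O while each read served from memory costs none. The function name and parameters are hypothetical.

```python
# Rate quoted in this post; check the pricing page for your own region.
PRICE_PER_MILLION_IOS = 0.20  # USD


def estimate_monthly_io_cost(reads_per_second: float,
                             cache_hit_ratio: float,
                             ios_per_miss: float = 1.0,
                             seconds_per_month: int = 30 * 24 * 3600) -> float:
    """Rough monthly read I/O cost in USD.

    Assumption: a cache miss costs `ios_per_miss` billed I/Os and a cache
    hit costs none. Real queries can touch many pages per request.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    misses = reads_per_second * seconds_per_month * (1.0 - cache_hit_ratio)
    return misses * ios_per_miss / 1_000_000 * PRICE_PER_MILLION_IOS


if __name__ == "__main__":
    # Same workload, three different shares of reads served from memory.
    for ratio in (0.50, 0.90, 0.99):
        cost = estimate_monthly_io_cost(reads_per_second=1000,
                                        cache_hit_ratio=ratio)
        print(f"hit ratio {ratio:.0%}: ~${cost:,.2f} per month")
```

The absolute numbers are only as good as the one-I/O-per-miss assumption, but the trend is the point: the more of your working set stays in memory, the fewer I/Os you pay for.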

Which operations use fewer I/Os?

Queries that use an index will likely use fewer I/Os, as you’re not scanning the whole storage of your collection. They’ll still consume I/Os, but far fewer than scanning an entire collection.

Furthermore, the RAM of your instance needs to cover your index size. If the indexes fit in memory, reading them again won’t incur additional I/Os.

Please keep in mind that you need to respect some principles of index usage.
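One way to check which side of that line a query falls on is to look at its explain plan for a `COLLSCAN` (collection scan) stage versus an `IXSCAN` (index scan) stage. The Python sketch below only walks an explain document that you already have. The two sample plans are hand-written for illustration, not captured from a real cluster, and the helper names are my own.

```python
from typing import Any, Dict, List


def plan_stages(plan: Any) -> List[str]:
    """Collect stage names from a (nested) winning plan, outermost first."""
    stages: List[str] = []
    if not isinstance(plan, dict):
        return stages
    if "stage" in plan:
        stages.append(plan["stage"])
    # Child plans may hang under "inputStage" (one) or "inputStages" (many).
    children = [plan.get("inputStage")] + list(plan.get("inputStages", []))
    for child in children:
        stages.extend(plan_stages(child))
    return stages


def uses_collection_scan(explain_output: Dict[str, Any]) -> bool:
    """True if the winning plan contains a full collection scan."""
    winning = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    return "COLLSCAN" in plan_stages(winning)


# Hand-written sample plans, for illustration only.
scan_plan = {"queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}}}
index_plan = {"queryPlanner": {"winningPlan": {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "email_1"},
}}}

if __name__ == "__main__":
    print(uses_collection_scan(scan_plan))   # full scan of the collection
    print(uses_collection_scan(index_plan))  # served through an index
```

With pymongo, something like `collection.find({"email": "a@b.c"}).explain()` should give you a document to feed into `uses_collection_scan`; the exact shape of the output can vary between engine versions, so treat the key names as an assumption to verify.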
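As a rough sanity check of that rule, you can compare the total size of your indexes with the memory of your instance. The sketch below assumes you have already collected per-index sizes in bytes (for example from the collection stats of your cluster). The `usable_fraction` parameter is a hypothetical safety margin of mine, not an AWS figure, since not all of the instance RAM is available for caching.

```python
from typing import Dict

GIB = 1024 ** 3


def indexes_fit_in_memory(index_sizes_bytes: Dict[str, int],
                          instance_memory_gib: float,
                          usable_fraction: float = 0.5) -> bool:
    """Check whether all indexes fit in the share of RAM we allow for them.

    `usable_fraction` is an arbitrary margin chosen for illustration.
    """
    total = sum(index_sizes_bytes.values())
    budget = instance_memory_gib * GIB * usable_fraction
    return total <= budget


if __name__ == "__main__":
    # Made-up index sizes for a single collection.
    sizes = {"_id_": 2 * GIB, "email_1": 3 * GIB}
    for memory_gib in (16, 8):
        fits = indexes_fit_in_memory(sizes, instance_memory_gib=memory_gib)
        print(f"{memory_gib} GiB instance: indexes fit = {fits}")
```

If the answer is no, the options are the usual ones: drop indexes you don’t need, or move to an instance class with more memory.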

Checklist ✅

Here’s my advice/checklist for when you want to optimize your I/O usage, reduce your costs, and improve performance.

You’ll see that I’m not a genius, as I just aggregate information from the AWS DocumentDB documentation with some common best practices that are not specific to DocumentDB.

It’s always good to refresh our memory of the principles.

Index Stats query

The query will output the field ops, which corresponds to the number of times that your index has been hit. Depending on the load of your application, please consider removing the unused indexes.
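Here is a minimal Python sketch of how the ops field could be used. It assumes each stats document has a `name` and an `accesses.ops` counter, as returned by the `$indexStats` aggregation stage; the sample documents are hand-written, and `unused_indexes` is a hypothetical helper, not part of any driver.

```python
from typing import Any, Dict, Iterable, List


def unused_indexes(index_stats: Iterable[Dict[str, Any]],
                   max_ops: int = 0) -> List[str]:
    """Return names of indexes whose ops counter is at or below `max_ops`.

    The default `_id_` index is skipped because it cannot be dropped.
    """
    unused = []
    for doc in index_stats:
        name = doc.get("name")
        ops = int(doc.get("accesses", {}).get("ops", 0))
        if name != "_id_" and ops <= max_ops:
            unused.append(name)
    return unused


# Hand-written sample documents, for illustration only.
sample_stats = [
    {"name": "_id_", "accesses": {"ops": 0}},
    {"name": "email_1", "accesses": {"ops": 48213}},
    {"name": "legacy_flag_1", "accesses": {"ops": 0}},
]

if __name__ == "__main__":
    print(unused_indexes(sample_stats))
    # With pymongo, the stats could come from something like:
    #   stats = db["users"].aggregate([{"$indexStats": {}}])
    #   print(unused_indexes(stats))
```

Before dropping anything, make sure the counters cover a representative period of traffic, and check every instance that serves queries, since an index that looks idle on one instance may be busy on another.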

Hope you’ll appreciate these tricks that I learned while working on AWS Cost Optimization for my company.

Stay tuned for another post!


PS: if something seems wrong or unclear, don’t hesitate to DM me.

