This is a continuation of a series of articles in which I briefly cover the main points of a specific topic in system architecture design. You can read the previous article here, and you can find the complete guide on my GitHub.

A cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location that is faster to access than the data's primary store. Caching is used to reduce latency and improve the efficiency of data retrieval across a distributed system.

Terms of Caching

  • Cache hit: the requested data is found in the cache.
  • Cache miss: the requested data is absent and must be fetched from primary storage.
  • Hit ratio: the share of requests served from the cache; the key measure of cache effectiveness.
  • Eviction: removing entries to free space when the cache is full.
  • Invalidation: removing or marking entries whose source data has changed.
  • TTL (time to live): how long an entry remains valid in the cache.

Benefits of using Caching

  • Lower latency: data is served from fast storage located close to the consumer.
  • Reduced backend load: repeated reads no longer hit the database or origin servers.
  • Higher throughput and better scalability for read-heavy workloads.
  • Resilience: cached data can still be served during brief backend outages.

Challenges of using Caching

  • Consistency: cached copies can diverge from the source of truth.
  • Invalidation: deciding when and how to expire entries is notoriously hard.
  • Cold starts: an empty cache sends a burst of traffic to the backend.
  • Cost and complexity: more memory, more infrastructure, and more failure modes.

Types of Caching

  1. Client Side: Caching web content in a browser or device to accelerate content retrieval.

  2. CDN (Content Delivery Network): Distributing content across multiple geographic locations to improve access speed.

  3. Load Balancer / API Gateway: Balancing incoming network traffic and requests across multiple servers, and potentially caching responses to repeated requests.

  4. Application: Caching within an application to improve performance and data access.

    1. CPU Cache: Stores frequently accessed instructions and data to reduce the CPU's memory access time.
      • L1: Smallest and fastest level, with separate instruction and data caches per core
      • L2: Larger cache, either per-core or shared
      • L3: Largest level, shared among multiple CPU cores
    2. In-memory Cache: Caching data within a single application process.
    3. Shared Memory Cache: Sharing cached data across different processes in the same system.
    4. Disk Cache: Caching read operations from a physical disk.
      • File System Caching: The file system may cache frequently accessed data and metadata.
      • Operating System-Level Disk Caches: Modern operating systems often employ disk caching to improve I/O performance system-wide.
      • Application-Specific Caches: Some applications implement their own caching mechanisms to store and manage frequently used data.
      • Third-Party Caching Solutions: Some third-party caching solutions and libraries can be integrated into applications to provide caching capabilities.
  5. Distributed Cache: Sharing cache across multiple systems or services. Sharding techniques:

    • Key-Based Sharding: Data is partitioned and distributed across cache nodes based on the keys of cached items.
    • Range-Based Sharding: Data is distributed based on ranges of key values.
    • Hash-Based Sharding: Data is partitioned using a hash function that distributes keys evenly across multiple shards.
    • Consistent Hashing: A refinement of hash-based sharding that maps both keys and cache nodes onto the same hash ring, so adding or removing a node remaps only a small fraction of keys. This allows dynamic scaling without significant data redistribution (a minimal sketch follows this list).
  6. Full-text Search: Indexing and searching through documents or logs.

  7. Database: Storing frequently accessed database queries and results to speed up future requests.
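
To make consistent hashing concrete, here is a minimal sketch in Python. The node names, the number of virtual nodes (`replicas`), and the MD5-based hash are illustrative assumptions, not a prescription:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps keys to cache nodes on a hash ring; adding or removing a
    node remaps only the keys that fall on that node's ring segments."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual nodes per physical node
        self.ring = []             # sorted hash positions on the ring
        self.node_at = {}          # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}:{i}")
            bisect.insort(self.ring, pos)
            self.node_at[pos] = node

    def remove_node(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}:{i}")
            self.ring.remove(pos)
            del self.node_at[pos]

    def get_node(self, key):
        if not self.ring:
            raise LookupError("hash ring is empty")
        # first ring position clockwise from the key's hash
        idx = bisect.bisect(self.ring, self._hash(key)) % len(self.ring)
        return self.node_at[self.ring[idx]]

ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
print(ring.get_node("user:42"))  # one of the three nodes, stable across calls
```

Because each physical node owns many small ring segments, removing one node spreads its keys roughly evenly over the remaining nodes.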

Top Caching Strategies

Cache Aside

Also known as "Lazy Loading," this strategy involves loading data into the cache on demand. When an application requests data, it first checks the cache. If the data is not found (cache miss), it is fetched from the database and stored in the cache for future requests.

Pros:

  • Only requested data is cached, which keeps memory usage low.
  • A cache failure is not fatal; the application can still read from the database.

Cons:

  • Each cache miss costs three trips: a cache lookup, a database query, and a cache write.
  • Data can go stale if the database is updated without invalidating the cache.
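
A minimal cache-aside read path might look like the sketch below; `cache.get`/`cache.set` and `db.query_user` are hypothetical stand-ins for whatever cache client and data-access layer are in use:

```python
def get_user(user_id, cache, db):
    """Cache-aside read: check the cache first, fall back to the
    database on a miss, then populate the cache for future reads."""
    key = f"user:{user_id}"
    user = cache.get(key)                # 1. look in the cache
    if user is not None:                 # cache hit: done
        return user
    user = db.query_user(user_id)        # 2. cache miss: read the database
    if user is not None:
        cache.set(key, user, ttl=300)    # 3. store for future requests
    return user
```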

Read Through

In this approach, data is automatically loaded into the cache from the database when there is a cache miss. The application only interacts with the cache and not directly with the database for read operations.

Pros:

  • Application code stays simple, since it talks only to the cache.
  • Fits read-heavy workloads with repeated access patterns.

Cons:

  • The first request for any item is always a miss unless the cache is pre-warmed.
  • The cache becomes a critical dependency on the read path.
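
The sketch below shows the read-through idea: the application calls only the cache, and the cache owns the loading logic. The in-process dict and the `loader` callable are simplifying assumptions:

```python
class ReadThroughCache:
    """On a miss, the cache itself loads the value from the backing
    store; the application never talks to the database directly."""

    def __init__(self, loader):
        self._store = {}       # an in-process dict stands in for the cache
        self._loader = loader  # callable that reads from primary storage

    def get(self, key):
        if key not in self._store:               # miss: load transparently
            self._store[key] = self._loader(key)
        return self._store[key]

# usage: the loader is any callable that fetches from the database
cache = ReadThroughCache(loader=lambda key: f"row-for-{key}")
print(cache.get("user:42"))  # first call loads; later calls hit the cache
```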

Write Around

When data is written, it is written directly to the database and not to the cache. The cache is only updated when data is read.

Pros:

  • Data that is written once and rarely read does not crowd out hot items in the cache.
  • Writes stay simple and touch only the primary storage.

Cons:

  • A recently written item always misses on its first read, raising read latency for fresh data.
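
A write-around sketch, again with hypothetical `cache` and `db` interfaces; note that the write path never touches the cache except to drop a stale copy:

```python
def save_order(order_id, order, cache, db):
    """Write-around: write only to the database, and invalidate any
    cached copy so the next read repopulates the cache."""
    db.save(order_id, order)            # write goes straight to storage
    cache.delete(f"order:{order_id}")   # drop the stale entry, if any

def get_order(order_id, cache, db):
    """Reads follow the usual cache-aside path."""
    key = f"order:{order_id}"
    order = cache.get(key)
    if order is not None:
        return order
    order = db.load(order_id)
    cache.set(key, order)               # the cache is updated only on read
    return order
```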

Write Back (Write Behind)

Data is first written to the cache and then, after a certain amount of time or under certain conditions, written back to the database. This allows for batch updates.

Pros:

  • Very low write latency, since writes complete in the cache.
  • Batching reduces database load under write-heavy traffic.

Cons:

  • Data can be lost if the cache fails before pending writes are flushed.
  • Noticeably more complex to implement and operate.
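
A write-behind cache can be sketched with a background timer that flushes dirty entries in batches. `db.batch_save` is an assumed batch-write method; a production version would also need durability guarantees that this sketch deliberately omits:

```python
import threading

class WriteBehindCache:
    """Writes land in the cache immediately; a background timer
    periodically flushes accumulated dirty entries to the database."""

    def __init__(self, db, flush_interval=5.0):
        self._db = db
        self._store = {}    # cached values
        self._dirty = {}    # keys written since the last flush
        self._lock = threading.Lock()
        self._interval = flush_interval
        self._schedule_flush()

    def set(self, key, value):
        with self._lock:
            self._store[key] = value
            self._dirty[key] = value    # mark for the next batch

    def get(self, key):
        with self._lock:
            return self._store.get(key)

    def _schedule_flush(self):
        timer = threading.Timer(self._interval, self._flush)
        timer.daemon = True
        timer.start()

    def _flush(self):
        with self._lock:
            batch, self._dirty = self._dirty, {}
        if batch:
            self._db.batch_save(batch)  # one round trip for many writes
        self._schedule_flush()          # keep flushing on a fixed interval
```

Anything sitting in `_dirty` when the process dies is lost, which is exactly the data-loss risk noted above.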

Write Through

Data is written simultaneously to the cache and the database. This ensures data consistency between the cache and database.

Pros:

  • The cache and the database stay consistent on every write.
  • A cache failure loses no data, because the database is always written.

Cons:

  • Higher write latency, as every write costs two operations.
  • The cache may hold data that is never read unless paired with a TTL.
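
Write-through is the simplest to sketch: the same write is applied to both stores before the call returns. The `cache` and `db` interfaces are again hypothetical:

```python
def update_profile(user_id, profile, cache, db):
    """Write-through: persist to the database and update the cache in
    the same operation, keeping the two in sync."""
    key = f"profile:{user_id}"
    db.save(user_id, profile)   # persist first, so a cache failure
    cache.set(key, profile)     # never leaves the database behind
```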

Cache Eviction Policies

Because cache space is limited, eviction policies determine which data to retain and which to discard. A good policy keeps the most relevant data accessible, and thus the hit ratio high, while clearing out outdated or rarely used entries.

The most well-known strategies include the following:

  1. First In, First Out (FIFO): Evicts the oldest items in the cache first, regardless of their usage frequency.
  2. Least Recently Used (LRU): Evicts the least recently accessed items first, assuming that items not accessed recently are less likely to be accessed in the future (a minimal implementation follows this list).
  3. Most Recently Used (MRU): Opposite of LRU, it evicts the most recently used items first. This can be useful when the most recent items are less likely to be reaccessed.
  4. Least Frequently Used (LFU): Prioritizes eviction of least frequently accessed items, assuming frequent access implies future relevance.
  5. Most Frequently Used (MFU): Opposite of LFU, it evicts the most frequently accessed items first, on the assumption that heavily used data may no longer be needed.
  6. Random Replacement (RR): Randomly selects a cache item to evict, which can be simpler to implement and effective in specific scenarios.
  7. Size-Based Eviction: Evicts items based on their size to manage the memory footprint, often used in combination with other policies.
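
LRU is the most commonly implemented of these policies, and Python's OrderedDict makes a minimal version short:

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # key order doubles as recency order

    def get(self, key):
        if key not in self._data:
            return None                 # miss
        self._data.move_to_end(key)     # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the LRU entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" becomes the most recently used
cache.put("c", 3)    # evicts "b", the least recently used
```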

Cache Invalidation

In addition to removing infrequently accessed items, caches often contain data that becomes obsolete or stale. These outdated cache entries need to be identified and slated for removal.

The most well-known strategies include the following:

  1. Time to Live (TTL): Data is invalidated after a specified duration. When the TTL expires, the cached data is either automatically removed or marked as invalid. There are two approaches:
    • Active expiration: A background process or thread periodically scans the cache to check the TTL of cache entries.
    • Passive expiration: Checks the TTL of a cache entry when the entry is accessed (a sketch of this variant follows the list).
  2. Write-Invalidate: When data is updated in the primary storage, corresponding cache entries are invalidated. This ensures consistency between the cache and the source.
  3. Change Notification: The cache listens for notifications from the data source about changes. When notified, the cache invalidates the relevant entries.
  4. Polling: The cache periodically checks the validity of its entries by comparing them with the source data.
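
Here is a minimal passive-expiration TTL cache: entries carry an expiry timestamp that is checked only when the entry is read:

```python
import time

class TTLCache:
    """Cache with passive expiration: an entry's TTL is checked at
    access time, and expired entries are dropped on the spot."""

    def __init__(self, default_ttl=60.0):
        self._default_ttl = default_ttl
        self._data = {}  # key -> (value, expiry timestamp)

    def set(self, key, value, ttl=None):
        lifetime = ttl if ttl is not None else self._default_ttl
        self._data[key] = (value, time.monotonic() + lifetime)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:  # stale: invalidate on access
            del self._data[key]
            return None
        return value
```

Active expiration would add a background thread that scans the entries periodically; the trade-off is extra CPU work in exchange for reclaiming memory sooner.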

Popular Solutions

Some popular caching solutions widely used across applications and systems:

  1. Redis: An in-memory data structure store used as a database, cache, and message broker, known for its performance and flexibility (a brief usage sketch follows this list).
  2. Memcached: A high-performance, distributed memory object caching system primarily used for speeding up dynamic web applications by alleviating database load.
  3. Ehcache: An open-source, Java-based cache that provides fast, off-heap storage. It’s widely used in Java applications for caching.
  4. Apache Ignite: A distributed database, caching, and processing platform designed for transactional, analytical, and streaming workloads at a large scale.
  5. Hazelcast: An in-memory computing platform that provides distributed caching, messaging, and computing. Often used for performance-critical applications.
  6. Squid: A caching and forwarding HTTP web proxy. It can cache web, DNS, and other computer network lookups for people sharing network resources.
  7. CDN Solutions (like Akamai, Cloudflare): These are not traditional caching solutions but are often used for caching static and dynamic content closer to the end users in distributed networks.
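
As a taste of how little code a dedicated cache requires, here is basic Redis usage via the redis-py client. It assumes `pip install redis` and a Redis server reachable on localhost:6379:

```python
import redis

# decode_responses=True returns str instead of bytes
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.set("session:42", "alice", ex=300)  # cache with a 300-second TTL
print(r.get("session:42"))            # "alice" (or None after expiry)
r.delete("session:42")                # explicit invalidation
```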

In conclusion, caching is a critical component in modern computing, offering a powerful solution to enhance performance, reduce latency, and manage data efficiently across various systems and applications. From accelerating web page loading times to optimizing database queries, caching is pivotal in improving user experiences and system responsiveness. Popular solutions like Redis, Memcached, and CDN services demonstrate the versatility and adaptability of caching strategies to different needs, from small-scale applications to large, distributed architectures.