Write-Intensive Systems: Key Challenges in Distributed Systems
DEV Community

Disk IO and Write Amplification

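Writing to the database's disk on every request adds significant latency, especially when the writes are random. Every disk has a finite IOPS budget, and even SSDs saturate under extreme write loads. Write amplification makes this worse: one logical write can turn into several physical writes once logs, index updates, and page rewrites are counted.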

A common solution is a log-structured merge-tree (LSM tree). Each write is appended to a sequential log and applied to an in-memory table, which is later flushed to disk in large sequential batches. Storage engines such as Apache Cassandra and RocksDB are built on LSM trees.
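The write path described above can be sketched in a few lines. This is a toy model, not how Cassandra or RocksDB are actually implemented: the class name `TinyLSM`, the `memtable_limit` parameter, and the JSON file layout are all assumptions made for illustration, and crash recovery (replaying the log on startup) and compaction are left out.

```python
import bisect
import json
import os
import tempfile


class TinyLSM:
    """Toy LSM store: append-only log + in-memory memtable + sorted segments."""

    def __init__(self, directory, memtable_limit=3):
        self.directory = directory
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}
        self.segments = []  # paths of flushed segment files, oldest first
        self.wal = open(os.path.join(directory, "wal.log"), "a")

    def put(self, key, value):
        # 1. Sequential append to the log, so the write survives a crash.
        self.wal.write(json.dumps([key, value]) + "\n")
        self.wal.flush()
        # 2. Cheap in-memory update; no random disk write on the request path.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the whole memtable as one sorted, sequential segment file.
        path = os.path.join(self.directory, f"segment-{len(self.segments):04d}.json")
        with open(path, "w") as f:
            json.dump(sorted(self.memtable.items()), f)
        self.segments.append(path)
        self.memtable = {}
        self.wal.truncate(0)  # logged writes are now covered by the segment

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest segment first, so the latest value for a key wins.
        for path in reversed(self.segments):
            with open(path) as f:
                rows = json.load(f)
            keys = [k for k, _ in rows]
            i = bisect.bisect_left(keys, key)
            if i < len(rows) and rows[i][0] == key:
                return rows[i][1]
        return None

    def close(self):
        self.wal.close()


with tempfile.TemporaryDirectory() as directory:
    db = TinyLSM(directory, memtable_limit=2)
    db.put("a", 1)
    db.put("b", 2)  # second write reaches the limit and triggers a flush
    db.put("a", 3)  # newer value lives in the memtable
    print(db.get("a"), db.get("b"))
    db.close()
```

Reads get more expensive as segments pile up, which is why real LSM engines merge segments in the background.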

Replication Lag Problem

In highly scalable systems, we don't rely on a single node or server. Data is replicated across several nodes - for example, a Primary Node (the main node for writing) and Secondary Nodes (replicas that serve reads and act as backups).

Some systems need data to be consistent across all nodes. Under heavy load, replicating changes from the Primary Node to the secondary nodes takes time, and replicas that fall behind serve stale reads and cause consistency issues. The replication strategy sets the trade-off: synchronous replication avoids stale reads at the cost of write latency, asynchronous replication keeps writes fast but allows lag, and multi-leader replication increases write throughput but also increases the risk of update conflicts.
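The difference between synchronous and asynchronous replication can be illustrated with a small in-process simulation. The `Primary` and `Replica` classes below are hypothetical and ignore networking, failures, and ordering across multiple primaries; they only show where a stale read comes from.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}
        self.pending = deque()  # replicated writes received but not yet applied

    def receive(self, key, value):
        self.pending.append((key, value))

    def apply_pending(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value

    def lag(self):
        # Replication lag measured as the number of unapplied writes.
        return len(self.pending)

    def read(self, key):
        return self.data.get(key)


class Primary:
    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:
            replica.receive(key, value)
            if self.synchronous:
                # Wait for the replica to apply the write before acknowledging.
                replica.apply_pending()


async_replica = Replica()
Primary([async_replica], synchronous=False).write("balance", 100)
stale = async_replica.read("balance")   # replica has not caught up yet
async_replica.apply_pending()
fresh = async_replica.read("balance")   # replica has now applied the write

sync_replica = Replica()
Primary([sync_replica], synchronous=True).write("balance", 100)
immediate = sync_replica.read("balance")
```

In the asynchronous case the write returns before the replica applies it, so a read in between sees the old state. The synchronous case removes that window, but every write now waits on the slowest replica.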

Hot Spots / Skewed Writes

A hot spot occurs when one part of the system (a database node, server, or partition) receives a disproportionate share of the writes, leading to performance bottlenecks, reduced throughput, and even service failures. Typical triggers are a viral social media post or a trending stock ticker.

To reduce hot spots, we need a sharding strategy that spreads writes evenly across nodes, such as consistent hashing on a high-cardinality key.
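As a rough illustration of consistent hashing, here is a minimal hash ring with virtual nodes. The `HashRing` class, the `vnodes` parameter, and the node names are made up for this sketch, and MD5 is used only as a convenient stable hash, not for security.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    # Python's built-in hash() is randomized per process, so use a digest.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes  # virtual nodes per physical node
        self.ring = []        # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (stable_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def node_for(self, key):
        # First ring entry clockwise from the key's hash, wrapping around.
        i = bisect.bisect_left(self.ring, (stable_hash(key),))
        return self.ring[i % len(self.ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"post-{i}" for i in range(1000)]
before = {key: ring.node_for(key) for key in keys}

ring.add_node("node-d")
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Every key in `moved` now belongs to `node-d`; keys never shuffle between the existing nodes, which is what keeps rebalancing cheap. Note that a ring spreads different keys across nodes, but all writes to one very hot key still land on a single node.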

Consistency in Distributed Writes (CAP Theorem)

A highly write-intensive system is usually distributed. Writing to multiple replicas improves durability, but during a network partition we must choose between consistency and availability: either reject writes that cannot reach enough replicas, or accept them and reconcile later. In this case, we need to choose a replication strategy with either strong or eventual consistency, whichever is better suited to our non-functional requirements.
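One common way to express this choice is with quorum sizes, as in systems with tunable consistency. The sketch below assumes N replicas, a write acknowledged by W of them, and a read that consults R of them; the function names are hypothetical. When W + R > N, every read set overlaps every write set in at least one replica.

```python
def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True if any R replicas must include at least one of the W written ones."""
    return w + r > n


def write_accepted(reachable: int, w: int) -> bool:
    """During a partition, a write succeeds only if W replicas are reachable."""
    return reachable >= w


N = 3
quorum = {"w": 2, "r": 2}     # consistency-leaning configuration
fast = {"w": 1, "r": 1}       # availability-leaning configuration

quorum_consistent = read_overlaps_write(N, **quorum)
fast_consistent = read_overlaps_write(N, **fast)

# A partition leaves only one of the three replicas reachable.
quorum_available = write_accepted(reachable=1, w=quorum["w"])
fast_available = write_accepted(reachable=1, w=fast["w"])
```

With one reachable replica, the quorum configuration rejects the write and stays consistent, while the W=1 configuration accepts it and may serve stale reads until the replicas converge.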

Concurrency and Locking

In write-intensive systems, many concurrent write requests may try to update the same record (e.g., an inventory record or a rate limit counter), which causes lock contention on that row or partition.

There are multiple solutions to handle this, depending on the use case:

  • Multi Version Concurrency Control (MVCC) - a database technique that allows different transactions to read and update without blocking each other. Instead of overwriting a record when it is updated, the database creates a new version of that record. Readers keep seeing the last committed version while a writer prepares the new one.
  • Optimistic locking - also known as optimistic concurrency control (OCC). It does not take a database-level lock. Each transaction reads the record together with a version, and its update succeeds only if the version is still unchanged at write time; otherwise the transaction retries or fails.
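Optimistic locking is often implemented with a version column and a conditional update. The sketch below uses Python's built-in `sqlite3` module; the `inventory` table, its columns, and the helper names are assumptions made for this example.

```python
import sqlite3


def read_item(conn, item_id):
    """Return (stock, version) for the item, or None if it does not exist."""
    return conn.execute(
        "SELECT stock, version FROM inventory WHERE id = ?", (item_id,)
    ).fetchone()


def write_if_unchanged(conn, item_id, new_stock, expected_version):
    """Apply the update only if nobody changed the row since it was read."""
    cur = conn.execute(
        "UPDATE inventory SET stock = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_stock, item_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows updated means the version check failed


def reserve_one(conn, item_id, max_retries=3):
    """Decrement stock by one, re-reading and retrying on a version conflict."""
    for _ in range(max_retries):
        row = read_item(conn, item_id)
        if row is None or row[0] <= 0:
            return False
        stock, version = row
        if write_if_unchanged(conn, item_id, stock - 1, version):
            return True
    return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (id TEXT PRIMARY KEY, stock INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO inventory VALUES ('sku-1', 5, 0)")
conn.commit()

# Two writers read the same version, then both try to write.
stock_a, version_a = read_item(conn, "sku-1")
stock_b, version_b = read_item(conn, "sku-1")
first = write_if_unchanged(conn, "sku-1", stock_a - 1, version_a)
second = write_if_unchanged(conn, "sku-1", stock_b - 1, version_b)
```

The first write succeeds and bumps the version, so the second writer's stale version no longer matches and its update touches zero rows. No row is held locked between the read and the write, which is why this approach works best when conflicts are rare.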

Indexing and Data Partitioning

Every write operation must also update each secondary index that covers the changed data. The more indexes, the higher the write overhead and storage. Also, a single-node database won't work as the data grows, because the node will eventually run out of storage. Thus, we need to distribute the data across different database nodes, and the index storage grows along with it.

The solution is horizontal sharding: distributing data across multiple databases. This requires a carefully chosen partition key so that the write workload is spread evenly across nodes, along with write-optimized index structures.
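The effect of the partition key can be seen with a simple hash-based shard router. The `shard_for` function, the key formats, and the 90/10 country split below are invented for illustration; the point is only that a high-cardinality key spreads writes while a low-cardinality key concentrates them.

```python
import hashlib
from collections import Counter


def shard_for(partition_key: str, num_shards: int) -> int:
    """Map a partition key to a shard using a stable hash."""
    digest = hashlib.sha256(partition_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def writes_per_shard(partition_keys, num_shards):
    """Count how many writes each shard would receive."""
    return Counter(shard_for(key, num_shards) for key in partition_keys)


NUM_SHARDS = 4

# High-cardinality key: one key per user, 10,000 writes in total.
by_user = writes_per_shard((f"user-{i}" for i in range(10_000)), NUM_SHARDS)

# Low-cardinality key: only two distinct values, 90% of writes share one.
by_country = writes_per_shard(
    ("US" if i % 10 else "DE" for i in range(10_000)), NUM_SHARDS
)

print("by user id:", sorted(by_user.items()))
print("by country:", sorted(by_country.items()))
```

Partitioning by country sends at least 90% of the writes to one shard no matter how many shards exist. Plain modulo routing like this also remaps most keys when `num_shards` changes, which is the problem consistent hashing is meant to reduce.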

Systems Worth Studying

  • Cassandra - LSM tree, tunable consistency, excellent write throughput
  • RocksDB - Embeddable LSM engine used by many write-heavy systems
  • Kafka - Sequential log writes, used as a write buffer in many architectures
  • ScyllaDB - Cassandra-compatible, designed to reduce write latency further
  • ClickHouse - Write-optimized columnar store for analytics
