Notes on Book: ‘Designing Data-Intensive Applications’

My Book Club notes

Part I: Foundations of Data Systems

Ch 1: Reliable, Scalable and Maintainable Apps

Data-intensive vs Compute-Intensive

CPU is rarely the limit these days. Usually it is:
- amount of data
- complexity of data
- speed at which it is changing

Thinking about Data Systems

Boundaries between data systems (DBs, caches, queues) are becoming blurred

—

Role of Software Engineer now also includes data system designer (architect?)
- We have to address:
  - keeping data correct and complete during a storm
  - providing consistent performance to clients, even when the system is degraded
  - how to scale to handle increased load
  - what is a good API for this service?
- factors:
  - team skill / exp
  - legacy sys
  - time-pressure
  - risk appetite
  - regulatory
  - etc etc
- Note: what is legacy? assumptions/conventions w/o data

Reliability

fault-tolerant (aka resilient)
fault vs failure: a fault is one component deviating from its spec; a failure is the whole system no longer providing the required service
Define the scope of faults - we can’t tackle them all (e.g. a different region? alien invasion?)
Essentially we build reliable systems from unreliable parts (e.g. my Meccano kit)
we need to deliberately trigger faults (e.g. kill processes w/o warning). Many bugs are due to poor error handling
Netflix Chaos Monkey
we prefer tolerating faults over preventing faults
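
To make the fault vs failure idea concrete for myself, here is a minimal sketch of deliberate fault injection plus a retry that tolerates the faults. Everything in it is made up for illustration: `FaultInjector`, `call_with_retries` and `lookup_price` are hypothetical names, not from the book or from Chaos Monkey, and the injector faults on a fixed schedule (every Nth call) rather than randomly so the behaviour is repeatable.

```python
class InjectedFault(Exception):
    """Simulates one component misbehaving (a fault, not yet a failure)."""


class FaultInjector:
    """Hypothetical wrapper: makes every `fail_every`-th call raise a fault."""

    def __init__(self, func, fail_every):
        self.func = func
        self.fail_every = fail_every
        self.calls = 0
        self.faults = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        if self.calls % self.fail_every == 0:
            self.faults += 1
            raise InjectedFault(f"injected fault on call {self.calls}")
        return self.func(*args, **kwargs)


def call_with_retries(func, attempts, *args, **kwargs):
    """Tolerate faults by retrying; only when every attempt faults
    does the caller see a failure."""
    last_error = None
    for _ in range(attempts):
        try:
            return func(*args, **kwargs)
        except InjectedFault as err:
            last_error = err
    raise RuntimeError("failure: all attempts faulted") from last_error


def lookup_price(item):
    # Stand-in for a real component (assumption for the example).
    return {"apple": 3, "pear": 5}[item]


# Every 3rd call faults, but one retry is enough to hide it from the client.
flaky_lookup = FaultInjector(lookup_price, fail_every=3)
results = [call_with_retries(flaky_lookup, 2, "apple") for _ in range(10)]
print(results, "faults injected:", flaky_lookup.faults)
```

The point of the sketch: faults still happen (the injector counts them), but the client never sees one. If the injector is set to fault on every call, the retries run out and the fault becomes a failure.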

Hardware Faults

- 1st response: add redundancy, as it is well understood
- until recently hardware redundancy was sufficient, but that is changing as flexibility and elasticity take priority over single-machine reliability
- hence the move towards systems that can tolerate the loss of whole machines, by using software fault-tolerance techniques (in preference to OR in addition to hardware redundancy)
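
A rough sketch of what tolerating the loss of a machine in software could look like: the same data is kept on several nodes and a read simply moves on to the next replica when one is down. `Node`, `ReplicatedStore` and the `alive` flag are hypothetical names for this example only; real systems also have to deal with replication lag, writes and failure detection, which are ignored here.

```python
class NodeDown(Exception):
    """Raised when a simulated machine is unavailable."""


class Node:
    """A simulated machine holding a full copy of the data."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.alive = True

    def get(self, key):
        if not self.alive:
            raise NodeDown(self.name)
        return self.data[key]


class ReplicatedStore:
    """Reads from the first replica that answers (failover in software)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def get(self, key):
        for node in self.nodes:
            try:
                return node.get(key)
            except NodeDown:
                continue  # this machine is lost, try the next replica
        raise RuntimeError("all replicas are down")


nodes = [Node(f"node-{i}", {"user:1": "alice"}) for i in range(3)]
store = ReplicatedStore(nodes)

# Lose two of the three machines; the read still succeeds.
nodes[0].alive = False
nodes[1].alive = False
value = store.get("user:1")
print(value)
```

With three replicas the store survives the loss of any two machines for reads; only when the last node goes down does the fault turn into a failure.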