Companies choose different databases not because one is universally better, but because their access patterns differ (read/write behavior, query needs, latency requirements).
Key idea:
“Right database = Right access pattern + Right trade-offs”
Netflix Architecture (Cassandra)
Problem
- Massive scale: 3+ million writes per second
- Tracks:
  - Viewing activity
  - Search
  - User interactions
  - Device data
Solution: Apache Cassandra
- Designed for high write throughput
- Works like a distributed hash map
- Writes:
  - Routed to the owning node via the partition key
  - Appended to the commit log and stored in memory (memtable)
  - Fast acknowledgment
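The write path above can be sketched in a few lines of Python. This is a toy model, not Cassandra's implementation: `ToyRing` and its methods are hypothetical names, MD5 stands in for Cassandra's partitioner, and each node gets a single token. It only illustrates how hashing the partition key picks a node, and how a write is logged and placed in that node's memtable before being acknowledged.

```python
import hashlib
from bisect import bisect_right


class ToyRing:
    """Toy stand-in for a Cassandra-style cluster: nodes placed on a hash ring."""

    def __init__(self, node_names):
        # One token per node (an assumption to keep the sketch small).
        self.tokens = sorted((self._token(name), name) for name in node_names)
        self.commit_logs = {name: [] for name in node_names}  # append-only log per node
        self.memtables = {name: {} for name in node_names}    # in-memory table per node

    @staticmethod
    def _token(key):
        # MD5 is used here only as a convenient, stable hash for the demo.
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def owner(self, partition_key):
        """Return the node whose token range covers this partition key."""
        token = self._token(partition_key)
        ring_tokens = [t for t, _ in self.tokens]
        index = bisect_right(ring_tokens, token) % len(self.tokens)  # wrap around the ring
        return self.tokens[index][1]

    def write(self, partition_key, column, value):
        """Log the write, store it in the memtable, and acknowledge with the node name."""
        node = self.owner(partition_key)
        self.commit_logs[node].append((partition_key, column, value))
        self.memtables[node].setdefault(partition_key, {})[column] = value
        return node

    def read(self, partition_key):
        """Simple key-based read: go straight to the owning node."""
        return self.memtables[self.owner(partition_key)].get(partition_key, {})
```

Because the same hash decides both where a write lands and where a read looks, key-based reads never need to search other nodes.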
Strengths
- Horizontal scalability
- Extremely high write performance
- Simple key-based reads
Trade-offs
- ❌ No joins
- ❌ No flexible queries
- ❌ No ad-hoc analytics
Design Strategy
- Query-based data modeling (not entity-based)
- Heavy denormalization
- Same data stored in multiple tables
👉 Example:
- Table 1: User viewing history
- Table 2: Trending by region
(Same data, different structure)
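A minimal sketch of "one table per query", using plain Python dictionaries as stand-ins for two Cassandra tables. The class name, field names, and sample shape are assumptions made for illustration. Each viewing event is written twice, once in the shape each query needs.

```python
from collections import defaultdict


class ViewStore:
    """Two query-shaped 'tables' fed by the same viewing events."""

    def __init__(self):
        # Table 1: partitioned by user, answers "what did this user watch?"
        self.viewing_history_by_user = defaultdict(list)
        # Table 2: partitioned by region, answers "what is trending here?"
        self.view_counts_by_region = defaultdict(lambda: defaultdict(int))

    def record_view(self, user_id, region, title, watched_at):
        # Denormalization: the same event is written to both tables.
        self.viewing_history_by_user[user_id].append((watched_at, title))
        self.view_counts_by_region[region][title] += 1

    def history_for(self, user_id):
        """Most recent first; a single-partition lookup, no join needed."""
        return sorted(self.viewing_history_by_user[user_id], reverse=True)

    def trending_in(self, region, limit=3):
        """Most-viewed titles in a region, ties broken alphabetically."""
        counts = self.view_counts_by_region[region]
        return sorted(counts, key=lambda title: (-counts[title], title))[:limit]
```

The cost is visible in `record_view`: every new query shape means one more write per event and one more copy of the data to keep consistent.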
Key Insight
👉 Cassandra = “Write fast, query simple”
👉 You design tables per query
Instagram Architecture (PostgreSQL)
Problem
- Complex queries:
  - Feed generation
  - Followers & relationships
  - Comments sorting
  - Aggregations (likes, counts)
- Read-heavy system
Solution: PostgreSQL
Strengths
- Joins
- Aggregations
- Flexible queries
- Easy feature changes
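To make "joins + aggregations" concrete, here is a small runnable sketch. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so it runs without a server; the schema and sample rows are invented for illustration and are not Instagram's. One query builds a feed by joining follows, posts, and users, and counts likes in the same pass.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a tiny, hypothetical social schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users   (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
        CREATE TABLE posts   (id INTEGER PRIMARY KEY, author_id INTEGER, caption TEXT);
        CREATE TABLE likes   (post_id INTEGER, user_id INTEGER);
    """)
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "ana"), (2, "ben"), (3, "cy")])
    conn.executemany("INSERT INTO follows VALUES (?, ?)", [(1, 2), (1, 3)])
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                     [(10, 2, "sunset"), (11, 3, "coffee"), (12, 1, "own post")])
    conn.executemany("INSERT INTO likes VALUES (?, ?)", [(10, 1), (10, 3), (11, 1)])
    return conn


def feed_for(conn, user_id):
    """Posts from accounts the user follows, with like counts, most liked first."""
    return conn.execute("""
        SELECT p.id, u.name, p.caption, COUNT(l.user_id) AS like_count
        FROM follows f
        JOIN posts p ON p.author_id = f.followee_id
        JOIN users u ON u.id = p.author_id
        LEFT JOIN likes l ON l.post_id = p.id
        WHERE f.follower_id = ?
        GROUP BY p.id, u.name, p.caption
        ORDER BY like_count DESC, p.id
    """, (user_id,)).fetchall()
```

Changing the feed (say, ordering by recency instead of likes) is a query change, not a data-model change, which is the flexibility this section describes.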
Scaling Techniques
- Connection pooling (PgBouncer)
- Read replicas
- Table partitioning
- Smart indexing
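As one illustration of the read-replica technique, the sketch below routes statements by type: writes go to the primary, reads rotate across replicas. `ReplicaRouter` is a hypothetical helper, not part of PostgreSQL or PgBouncer, and the connection targets are plain strings standing in for real connections.

```python
import itertools


class ReplicaRouter:
    """Send SELECTs to replicas (round-robin) and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() yields replicas forever in order; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

A real deployment also has to account for replication lag: a read sent to a replica right after a write may not see that write yet, so reads that must be fresh are usually sent to the primary.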
Trade-offs
- ❌ Harder to scale for massive writes
- ❌ Requires careful optimization
Key Insight
👉 Instagram chose flexibility over raw scale
👉 “If your system is relational, use a relational DB”
Twitter Architecture (Redis + Hybrid)
Problem
- Timeline generation:
  - Merge tweets from thousands of users
  - Serve instantly (milliseconds)
- Ultra-low latency requirement
Solution: Redis (Cache Layer)
Strategy: Fan-out on Write
- When a user tweets:
  - Push the tweet to every follower’s timeline
- Timeline is precomputed and stored in memory
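Fan-out on write can be sketched with in-memory structures standing in for Redis lists. The class name and the 800-entry cap are assumptions for the demo, not Twitter's actual values. The work happens at write time so that a read is a single lookup.

```python
from collections import defaultdict, deque
from itertools import islice


class FanOutOnWrite:
    """Precompute each follower's timeline when a tweet is posted."""

    def __init__(self, timeline_limit=800):  # cap is a hypothetical value
        self.followers = defaultdict(set)    # author -> set of followers
        # Bounded deque: once full, the oldest entries fall off the far end.
        self.timelines = defaultdict(lambda: deque(maxlen=timeline_limit))

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post_tweet(self, author, tweet_id):
        # Write cost grows with the number of followers.
        for follower in self.followers[author]:
            self.timelines[follower].appendleft(tweet_id)

    def read_timeline(self, user, count=20):
        # Read cost is one lookup: no merging, no sorting.
        return list(islice(self.timelines[user], count))
```

The loop in `post_tweet` is why this strategy becomes expensive for accounts with millions of followers.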
Strengths
- Extremely fast: sub-millisecond reads served from memory
Trade-offs
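- ❌ Weak durability: data lives in memory, so a crash can lose recent writes unless persistence is configured carefully
- ❌ Limited storage (bounded by RAM)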
Critical Architecture Rule
👉 Redis is NOT the source of truth
- Write → Database
- Replicate → Redis
- Read → Redis
- Fallback → DB
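The four steps above can be sketched as follows. Two dictionaries stand in for the database and for Redis; `TimelineStore` and its method names are hypothetical. The point is that losing the cache costs speed, not data.

```python
class TimelineStore:
    """Database is the source of truth; the cache is a disposable copy."""

    def __init__(self):
        self.db = {}     # stand-in for the durable database
        self.cache = {}  # stand-in for Redis

    def write(self, key, value):
        self.db[key] = value     # 1. Write -> database (source of truth)
        self.cache[key] = value  # 2. Replicate -> cache

    def read(self, key):
        if key in self.cache:    # 3. Read -> cache
            return self.cache[key]
        value = self.db.get(key)  # 4. Fallback -> database
        if value is not None:
            self.cache[key] = value  # repopulate so the next read is fast
        return value

    def lose_cache(self):
        """Simulate a cache restart or eviction."""
        self.cache.clear()
```

In this sketch replication is a synchronous second write; a production system might replicate asynchronously, which trades a short window of stale reads for faster writes.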
Edge Case Handling
- For celebrity users (millions of followers):
  - Use fan-out on read
  - Merge their tweets into each follower’s timeline at read time
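A sketch of the hybrid approach: ordinary accounts fan out on write, while celebrity tweets are pulled and merged when a timeline is read. The class, the tiny threshold of 3 followers, and the `(timestamp, tweet_id)` tuples are all assumptions chosen to keep the demo small.

```python
import heapq
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical; real systems would use a far larger number


class HybridTimelines:
    def __init__(self):
        self.followers = defaultdict(set)  # author -> followers
        self.following = defaultdict(set)  # user -> authors they follow
        self.pushed = defaultdict(list)    # user -> precomputed (timestamp, tweet_id)
        self.authored = defaultdict(list)  # author -> their own (timestamp, tweet_id)

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author, timestamp, tweet_id):
        self.authored[author].append((timestamp, tweet_id))
        if not self.is_celebrity(author):
            # Fan-out on write for ordinary accounts.
            for user in self.followers[author]:
                self.pushed[user].append((timestamp, tweet_id))

    def timeline(self, user, count=20):
        # Start from the precomputed entries; a set avoids duplicates if an
        # author crossed the threshold after some tweets were already pushed.
        candidates = set(self.pushed[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                # Fan-out on read: merge celebrity tweets at request time.
                candidates.update(self.authored[author])
        newest_first = heapq.nlargest(count, candidates)
        return [tweet_id for _, tweet_id in newest_first]
```

Posting as a celebrity is now a single append, and the extra merge cost is paid only by readers who follow one.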
Key Insight
👉 Redis = “Speed over everything (with backup)”
Comparison Summary
| Platform | Database | Why |
|---|---|---|
| Netflix | Cassandra | Massive writes |
| Instagram | PostgreSQL | Complex relational queries |
| Twitter | Redis | Ultra-fast reads |
Key System Design Learnings
1. Access Pattern Drives Design
- Write-heavy → Cassandra
- Read-heavy + relational → PostgreSQL
- Ultra-fast reads → Redis
2. Every Database Has Trade-offs
| DB | Strength | Weakness |
|---|---|---|
| PostgreSQL | Flexibility | Write scaling |
| Cassandra | Write scale | Query flexibility |
| Redis | Speed | Durability |
👉 You always sacrifice something
3. Denormalization vs Normalization
- SQL → typically normalized
- NoSQL → typically denormalized
👉 Cassandra relies on data duplication as a deliberate strategy
4. Precomputation is Powerful
- Twitter pre-builds timelines
- Avoids real-time computation
👉 Trade compute time for read speed
5. Hybrid Architectures Win
- No single DB is enough
- Combine:
  - DB (source of truth)
  - Cache (speed)
  - Analytics systems
6. Scale ≠ Complexity (Important Insight)
👉 Instagram scaled to billions using PostgreSQL
👉 Lesson:
- Don’t over-engineer early
- Use simple systems until needed
Decision Framework (Most Important)
Ask 3 Questions:
1. What is your access pattern?
   - Writes? Reads? Queries?
2. What trade-offs can you accept?
   - Speed vs flexibility vs durability
3. Do you really need it?
   - 95% of apps work fine with PostgreSQL
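The three questions can be written down as a deliberately simplistic rule of thumb. The function name, its boolean parameters, and the exact rules are illustrative assumptions drawn from the mappings in these notes, not a substitute for measuring a real workload.

```python
def suggest_stores(write_heavy, needs_relational_queries, needs_ultra_fast_reads):
    """Map rough answers to the three questions onto the stores discussed above."""
    # Question 3 first: default to PostgreSQL unless there is a clear reason not to.
    stores = ["PostgreSQL"]
    # Question 1: a write-heavy workload with simple key-based reads points to Cassandra.
    if write_heavy and not needs_relational_queries:
        stores = ["Cassandra"]
    # Question 2: if read latency matters most, add a cache in front of the source of truth.
    if needs_ultra_fast_reads:
        stores.append("Redis cache")
    return stores
```

For example, `suggest_stores(False, True, True)` returns `["PostgreSQL", "Redis cache"]`: a relational source of truth with a cache in front of it.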
Final Insight
👉 There is no “best database”
👉 The real skill is:
- Understanding data flow
- Understanding user behavior
- Mapping it to correct architecture