System Design Series (Part 2) - Crash Course: Architecture, Networking, Databases & Scalability
Article Goals
In Part 1 we covered the mindset: solve business problems, keep it simple, scale when needed. This article moves on to the technical fundamentals: the building blocks that every system design interview touches on.
A system design interview is not about writing code. Interviewers want to see that you understand architecture, scalability, and tradeoffs.
Series Navigation
Part 1 → Engineering Mindset: Beyond Beautiful Code
Part 2 → (this article) System Design Crash Course: Fundamentals
1. System Design Interview: Where to Focus?
System Design Interview
    Problem Statement
            │
            ▼
    Architecture Design
            │
            ▼
   ┌─────────────────┐
   │ Load Balancer   │
   │ App Servers     │
   │ Database        │
   │ Cache           │
   │ CDN             │
   │ Message Queue   │
   └─────────────────┘
            │
            ▼
    Tradeoff Discussion
    (Consistency vs Availability,
     Latency vs Throughput,
     Cost vs Performance)
You need to understand each component and know when to use which one.
2. Computer Architecture: The Foundation
Before designing large systems, you need to understand how a computer runs code.
Data Hierarchy
Bit  → 0 or 1
Byte → 8 bits
KB   → 1,024 bytes
MB   → 1,024 KB
GB   → 1,024 MB
TB   → 1,024 GB
Memory Hierarchy (Access Speed)
Fastest ─────────────────────────────► Slowest
CPU Cache      RAM        SSD        HDD
(L1/L2/L3)
~1ns           ~100ns     ~100μs     ~10ms
Diagram: Computer Architecture
CPU
 │
 ▼
CPU Cache (L1/L2/L3) → nanoseconds
 │
 ▼
RAM → ~100 nanoseconds
 │
 ▼
SSD → ~100 microseconds
 │
 ▼
HDD → ~10 milliseconds
| Component | Role | Characteristics |
|---|---|---|
| CPU | Execute instructions | Processes logic |
| Cache | Ultra-fast memory (closest to the CPU) | KB to tens of MB |
| RAM | Active program memory | GB; contents are lost on power-off |
| SSD/HDD | Persistent storage | TB; long-term storage |
Tip for frontend engineers: once you understand the memory hierarchy, you understand why caching matters: it moves data up to a faster tier.
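To make the gaps between tiers concrete, the approximate latencies quoted above can be turned into ratios. This is a minimal sketch: the figures are the rough orders of magnitude from this section, not measurements, and the name `slowdown_vs_cache` is purely illustrative.

```python
# Approximate access latencies from the hierarchy above, in nanoseconds.
# These are order-of-magnitude figures, not benchmarks.
LATENCY_NS = {
    "CPU cache": 1,
    "RAM": 100,
    "SSD": 100_000,      # ~100 microseconds
    "HDD": 10_000_000,   # ~10 milliseconds
}


def slowdown_vs_cache(tier: str) -> float:
    """How many times slower a tier is than the CPU cache."""
    return LATENCY_NS[tier] / LATENCY_NS["CPU cache"]


for tier in LATENCY_NS:
    print(f"{tier:<10} ~{slowdown_vs_cache(tier):>12,.0f}x the cache latency")
```

With these numbers, a read served from RAM instead of an HDD is roughly five orders of magnitude faster, which is the intuition behind every caching layer discussed later.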
3. Production Application Architecture
A real production app is more than just a server.
Full Production System
Developer
    │
    ▼
Git Repository (GitHub / GitLab)
    │
    ▼
CI/CD Pipeline (GitHub Actions / Jenkins)
    │
    ▼
Build & Test
    │
    ▼
Deployment
    │
    ▼
┌──────────────────────────────┐
│          PRODUCTION          │
│                              │
│        Load Balancer         │
│              │               │
│        ┌─────┴─────┐         │
│        ▼           ▼         │
│   App Server   App Server    │
│        │           │         │
│        └─────┬─────┘         │
│              ▼               │
│       Database Server        │
│              │               │
│    External Storage (S3)     │
│                              │
└──────────────────────────────┘
Observability Layer
A production system always needs monitoring:
Application
    │
    ├──► Logging (ELK / CloudWatch)
    │
    ├──► Monitoring (Grafana / Datadog)
    │
    ├──► Alerting (PagerDuty / Slack)
    │
    └──► Tracing (Jaeger / Zipkin)
Debugging a Production Issue: The Standard Process
Step 1: Detect
    User report / Alert fires
    │
    ▼
Step 2: Analyze
    Check logs & monitoring
    │
    ▼
Step 3: Reproduce
    Reproduce in staging environment
    │
    ▼
Step 4: Debug
    Root cause analysis
    │
    ▼
Step 5: Fix
    Hotfix → Test → Deploy
    │
    ▼
Step 6: Post-mortem
    Document what happened & prevent recurrence
4. Core Pillars of System Design
A good system must deliver on four pillars:
Good System Design
┌────────────────────────┐
│ 1. Scalability         │ ← Handle growth
│ 2. Maintainability     │ ← Easy to change
│ 3. Efficiency          │ ← Fast & cost-effective
│ 4. Reliability         │ ← Works when needed
└────────────────────────┘
The 3 Core Activities of Every System
User Request
    │
    ▼
Move Data → Network (transfer data between components)
    │
    ▼
Store Data → Database (persistent storage)
    │
    ▼
Transform Data → Application Logic (business logic processing)
    │
    ▼
Response
Every system design revolves around moving, storing, and transforming data efficiently.
5. CAP Theorem
A distributed system can guarantee only 2 of these 3 properties. In practice network partitions cannot be ruled out, so the real choice is between consistency and availability while a partition lasts.
            Consistency (C)
                  ▲
                 / \
                /   \
               /     \
              / pick  \
             /  only   \
            /     2     \
           /             \
          ▼               ▼
Availability (A) ◄──► Partition Tolerance (P)
| Property | Meaning |
|---|---|
| Consistency | Every node sees the same data at the same time |
| Availability | The system always responds (though the data may not be the latest) |
| Partition Tolerance | The system keeps working when the network is partitioned |
Real-World Examples
Banking System (CP)
├─ Consistency + Partition Tolerance
├─ Accepts reduced availability
└─ Reason: account balances MUST be accurate
Social Media (AP)
├─ Availability + Partition Tolerance
├─ Accepts eventual consistency
└─ Reason: a like count that lags by 1-2 seconds is fine
6. Availability & SLA
Availability Levels
Availability   Downtime/year    Typical use
─────────────────────────────────────────────────
99%            3.65 days        Internal tools
99.9%          8.76 hours       Business apps
99.99%         52.6 minutes     E-commerce
99.999%        5.26 minutes     Banking, Healthcare
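The downtime column follows directly from the availability percentage. The sketch below reproduces the arithmetic, assuming a 365-day year; the function name `downtime_per_year_hours` is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored for simplicity


def downtime_per_year_hours(availability_percent: float) -> float:
    """Maximum downtime per year, in hours, allowed by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return (100 - availability_percent) / 100 * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_per_year_hours(target)
    print(f"{target}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is why moving from 99.9% to 99.99% is usually far more expensive than it looks.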
SLO vs SLA
SLO (Service Level Objective)       SLA (Service Level Agreement)
= Internal goal                     = Contract with customers
"Response time < 300ms"             "99.99% uptime"
"99.9% uptime"                      "Otherwise refund 10%"
"Error rate < 0.1%"                 "Penalty if breached"
Reliability Concepts
Primary Server
    │
    │ ← health check
    │
Backup Server (standby)
    │
    ▼
Failover System
(auto-switch when the primary fails)
Reliability = the system works correctly
Fault tolerance = the system withstands failures
Redundancy = backup components exist
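The failover idea in the diagram can be sketched in a few lines. This is a simplified illustration, not a production failover mechanism: the `Server` class and `pick_active` function are hypothetical, and the `healthy` flag stands in for the result of a real health check.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    healthy: bool = True  # stand-in for the outcome of a real health check


def pick_active(primary: Server, backup: Server) -> Server:
    """Route to the primary while it is healthy, otherwise fail over."""
    if primary.healthy:
        return primary
    if backup.healthy:
        return backup
    raise RuntimeError("no healthy server available")


primary = Server("primary")
backup = Server("backup")
print(pick_active(primary, backup).name)  # primary is healthy, so it is used

primary.healthy = False                   # simulate a failed health check
print(pick_active(primary, backup).name)  # traffic switches to the backup
```

The redundancy (a standby that exists) is what makes fault tolerance possible; the failover logic is only the switch.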
7. Throughput vs Latency
The two most important metrics
Throughput                          Latency
= Requests processed per unit time  = Time to process one request
Units:                              Units:
├─ Requests/second (RPS)            ├─ Milliseconds (ms)
├─ Queries/second (QPS)             └─ Seconds (s)
└─ Bytes/second
Example:                            Example:
"Server handles 10,000 RPS"         "API responds in 50 ms"
Client ──── Request ────► Server
                             │
                          Process
                             │
Client ◄─── Response ────────┘
Latency = total time from request to response
Tradeoff: pushing throughput higher (handling more requests) can increase latency (each request gets slower) once the server becomes overloaded.
8. Networking Basics
IP Address
IPv4 = 32 bits → about 4.3 billion addresses
Example: 192.168.1.1
IPv6 = 128 bits → practically unlimited
Example: 2001:0db8:85a3:0000:0000:8a2e:0370:7334
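The address counts come straight from the bit widths. A short sketch using Python's standard `ipaddress` module, with the two example addresses above:

```python
import ipaddress

# Address-space sizes follow from the number of bits.
ipv4_total = 2 ** 32    # 4,294,967,296 addresses
ipv6_total = 2 ** 128   # about 3.4e38 addresses

v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.2e}")
print(v4.version, "private:", v4.is_private)  # 192.168.0.0/16 is a private range
print(v6.version, "short form:", v6.compressed)
```

The compressed form shows that the long IPv6 notation is just zero-padded hexadecimal groups.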
Packet Communication
Computer A                 Computer B
    │                          ▲
    ▼                          │
┌───────────┐            ┌───────────┐
│ Packet    │ ─────────► │ Packet    │
│ IP header │            │ IP header │
│ Data      │            │ Data      │
└───────────┘            └───────────┘
TCP vs UDP
TCP (Transmission Control Protocol)
Client                  Server
  │── SYN ──────────────►│
  │◄── SYN-ACK ──────────│
  │── ACK ──────────────►│
  │                      │
  │── Reliable Data ────►│
  │   (ordering, ack,    │
  │    retransmission)   │
Use cases: Web, API, Database, Email
──────────────────────────────────
UDP (User Datagram Protocol)
Client                  Server
  │── Packet ───────────►│
  │── Packet ───────────►│
  │── Packet ───────────►│
  │   (no guarantee,     │
  │    no ordering)      │
Use cases: Video call, Live streaming, Gaming
| Feature | TCP | UDP |
|---|---|---|
| Reliable | Yes | No |
| Ordering | Guaranteed | No |
| Speed | Slower | Faster |
| Connection | Connection-based | Connectionless |
| Use case | Web, API | Realtime |
9. DNS: The Internet's Phonebook
User types: example.com
    │
    ▼
DNS Resolver
    │
    ▼
Root DNS Server
    │
    ▼
TLD DNS Server (.com)
    │
    ▼
Authoritative DNS
    │
    ▼
IP: 93.184.216.34
    │
    ▼
Web Server responds
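The lookup chain above can be simulated with plain dictionaries. This is a toy model only: the server names (`tld-server-com`, `ns.example.com`) and the tables are hypothetical stand-ins for real DNS servers, and no network traffic is involved.

```python
from typing import Dict

# Hypothetical, hard-coded data standing in for real DNS servers.
ROOT_SERVER: Dict[str, str] = {"com": "tld-server-com"}
TLD_SERVERS: Dict[str, Dict[str, str]] = {
    "tld-server-com": {"example.com": "ns.example.com"},
}
AUTHORITATIVE_SERVERS: Dict[str, Dict[str, str]] = {
    "ns.example.com": {"example.com": "93.184.216.34"},
}


def resolve(domain: str) -> str:
    """Walk root -> TLD -> authoritative, as the resolver does in the diagram."""
    tld = domain.rsplit(".", 1)[-1]                     # "com"
    tld_server = ROOT_SERVER[tld]                       # root points to the TLD server
    authoritative = TLD_SERVERS[tld_server][domain]     # TLD points to the authoritative server
    return AUTHORITATIVE_SERVERS[authoritative][domain]  # authoritative server knows the IP


print(resolve("example.com"))
```

Each level only knows where to send the resolver next; only the authoritative server holds the final answer.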
10. Application Protocols
HTTP: Request/Response
Client ── GET /api/users ──► Server
Client ◄── 200 OK + data ── Server
HTTP Status Codes
2xx → Success
├── 200 OK
├── 201 Created
└── 204 No Content
3xx → Redirect
├── 301 Moved Permanently
└── 304 Not Modified
4xx → Client Error
├── 400 Bad Request
├── 401 Unauthorized
├── 403 Forbidden
└── 404 Not Found
5xx → Server Error
├── 500 Internal Server Error
├── 502 Bad Gateway
└── 503 Service Unavailable
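The grouping above depends only on the first digit of the code. A small sketch using the standard library's `http.HTTPStatus`; the helper `status_class` is an illustrative name, not a library function.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Map a status code to the classes listed above."""
    if 200 <= code < 300:
        return "Success"
    if 300 <= code < 400:
        return "Redirect"
    if 400 <= code < 500:
        return "Client Error"
    if 500 <= code < 600:
        return "Server Error"
    return "Other"  # 1xx informational codes are not covered in the list above


for code in (200, 201, 204, 301, 304, 400, 401, 403, 404, 500, 502, 503):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```

On the frontend this kind of check typically decides whether to render data, follow a redirect, prompt the user to fix input or log in, or show a retry message.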
WebSocket: Realtime
Client ◄───────────► Server
  persistent connection
  (bi-directional)
Use cases: Chat, Live trading, Sports updates
WebRTC: Peer to Peer
Client A ◄───────────► Client B
  direct connection
  (no server relaying the media; a signaling server is still needed to set up the connection)
Use cases: Video call, Voice chat
11. API Design
CRUD Operations (RESTful)
POST   /products       → Create
GET    /products       → Read all
GET    /products/{id}  → Read one
PUT    /products/{id}  → Update
DELETE /products/{id}  → Delete
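The five routes above map onto a very small amount of logic. The following is a framework-free sketch with an in-memory store; the `handle` function, its return shape `(status, body)`, and the error payloads are assumptions made for illustration, not a real web framework API.

```python
import itertools
from typing import Any, Dict, Optional, Tuple

products: Dict[int, Dict[str, Any]] = {}   # in-memory stand-in for a database
_next_id = itertools.count(1)


def handle(method: str, path: str, body: Optional[Dict[str, Any]] = None) -> Tuple[int, Any]:
    """Dispatch a request to the matching CRUD operation; returns (status, body)."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "products" or len(parts) > 2:
        return 404, {"error": "not found"}

    if len(parts) == 1:                      # collection: /products
        if method == "GET":
            return 200, list(products.values())
        if method == "POST":
            new_id = next(_next_id)
            products[new_id] = {**(body or {}), "id": new_id}
            return 201, products[new_id]
        return 405, {"error": "method not allowed"}

    if not parts[1].isdigit() or int(parts[1]) not in products:
        return 404, {"error": "not found"}   # item: /products/{id}
    product_id = int(parts[1])
    if method == "GET":
        return 200, products[product_id]
    if method == "PUT":
        products[product_id] = {**(body or {}), "id": product_id}
        return 200, products[product_id]
    if method == "DELETE":
        del products[product_id]
        return 204, None
    return 405, {"error": "method not allowed"}


status, created = handle("POST", "/products", {"name": "Keyboard"})
print(status, created)
print(handle("GET", f"/products/{created['id']}"))
```

Note how the status codes from the previous section fall out naturally: 201 for a create, 204 for a delete with no body, 404 for an unknown id.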
API Architecture
Client (Angular app)
    │
    ▼
API Gateway
    │
    ├──► Service A (Users)
    │        │
    │        ▼
    │    Database A
    │
    ├──► Service B (Products)
    │        │
    │        ▼
    │    Database B
    │
    └──► Service C (Orders)
             │
             ▼
         Database C
REST vs GraphQL vs gRPC
REST                    GraphQL                 gRPC
──────────────────────────────────────────────────────────────────
GET /users              query {                 Binary protocol
GET /orders               user {                Protobuf schema
                            name
                            orders
                          }
                        }
Pros:                   Pros:                   Pros:
├─ Simple               ├─ No over-fetching     ├─ Fast (binary encoding)
├─ Widely used          ├─ Flexible queries     ├─ Type-safe
└─ Cacheable            └─ Single endpoint      └─ Streaming support
Cons:                   Cons:                   Cons:
├─ Over-fetching        ├─ Complex              ├─ Not browser-native
└─ Multiple endpoints   ├─ Caching harder       └─ Learning curve
                        └─ N+1 problem
Frontend perspective: REST is the default. Reach for GraphQL when the UI needs flexible data fetching, and for gRPC in microservice-to-microservice calls.
12. Caching
Why Does Caching Matter?
A cache moves data up to a faster memory tier and reduces load on the database.
Cache Hit vs Cache Miss
Cache Hit (fast path)               Cache Miss (slow path)
User                                User
  │                                   │
  ▼                                   ▼
Cache ──► Data found!               Cache ──► Not found
  │                                   │
  ▼                                   ▼
Return response                     Database
(~1ms)                                │
                                      ▼
                                    Get data
                                      │
                                      ▼
                                    Store in Cache
                                      │
                                      ▼
                                    Return response
                                    (~100ms)
Cache Strategies
Cache-Aside (Lazy loading)
├─ App checks cache first
├─ If miss → read from DB → write to cache
└─ Most common strategy
Write-Through
├─ App writes to cache AND DB in the same operation
└─ Cache and DB stay consistent, at the cost of slower writes
Write-Behind
├─ App writes to cache
├─ Cache writes to DB asynchronously
└─ Fastest write, risk of data loss
TTL (Time to Live)
├─ Cache entry expires after X seconds
└─ Balance between freshness and performance
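Cache-aside combined with a TTL can be sketched as below. This is an in-process illustration under stated assumptions: the `CacheAside` class, its `load_from_db` callback, and the injectable `clock` are hypothetical names; in a real system the dictionary would typically be replaced by a store such as Redis.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Check the cache first; on a miss, load from the DB and cache the result."""

    def __init__(self, load_from_db: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:   # present and not expired: hit
            self.hits += 1
            return entry[1]
        self.misses += 1                           # miss or expired: go to the DB
        value = self._load(key)
        self._store[key] = (now + self._ttl, value)
        return value


fake_db = {"user:1": {"name": "John"}}             # stand-in for a real database
cache = CacheAside(load_from_db=fake_db.__getitem__, ttl_seconds=60)
cache.get("user:1")   # miss: loaded from fake_db, then cached
cache.get("user:1")   # hit: served from the cache
print(cache.hits, cache.misses)
```

A shorter TTL keeps data fresher but sends more requests to the database; a longer TTL does the opposite.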
Common Cache Tools
Browser Cache     → Static assets (JS, CSS, images)
CDN Cache         → Content close to users
Redis / Memcached → Application-level cache
Database Cache    → Query result cache
13. CDN: Content Delivery Network
CDN = servers distributed around the world that serve content from the location closest to the user.
Without a CDN:
User (Vietnam)
    │
    │ ~200ms latency
    ▼
Origin Server (US)
──────────────────────────────
With a CDN:
User (Vietnam)
    │
    │ ~20ms latency
    ▼
CDN Edge (Singapore)
    │
    │ (cache miss → fetch from origin)
    ▼
Origin Server (US)
CDN Benefits
├─ Lower latency (served from the nearest edge)
├─ Less load on the origin server
├─ DDoS mitigation (traffic is distributed)
└─ Higher availability (multiple edge locations)
Frontend impact: put bundled JS/CSS, images, and fonts on a CDN. In favorable cases users load them 5-10x faster, depending on their distance from the origin and on the cache hit rate.
14. Proxy
Forward Proxy
Client                          Internet
  │                                ▲
  ▼                                │
Forward Proxy ─────────────────────┘
(hide client identity)
(content filtering)
(access control)
Reverse Proxy (More Important for System Design)
Client
    │
    ▼
Reverse Proxy (Nginx / Cloudflare)
    │
    ├── SSL termination
    ├── Rate limiting
    ├── Caching
    │
  ┌─┴────────┐
  ▼          ▼
Server1   Server2
15. Load Balancer
Distributing Requests Evenly Across Servers
        Clients
       /   |   \
      ▼    ▼    ▼
┌─────────────────────┐
│    Load Balancer    │
└──────────┬──────────┘
      ┌────┼────┐
      ▼    ▼    ▼
    Srv1  Srv2  Srv3
Load Balancing Algorithms
Round Robin
├─ Request 1 → Server A
├─ Request 2 → Server B
├─ Request 3 → Server C
├─ Request 4 → Server A (cycle repeats)
└─ Simple, equal distribution
Least Connections
├─ Route to server with fewest active connections
└─ Better for long-running requests
IP Hashing
├─ Same client IP → same server
└─ Session affinity
Weighted Round Robin
├─ Server A (powerful) → weight 5
├─ Server B (normal) → weight 2
└─ A receives more traffic
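The four algorithms above are small enough to sketch side by side. These are simplified illustrations rather than how any particular load balancer implements them; the function names and the choice of SHA-256 for hashing are assumptions.

```python
import hashlib
import itertools
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Cycle through the servers in order: A, B, C, A, ..."""
    return itertools.cycle(servers)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighting: repeat each server `weight` times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active_connections: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(active_connections, key=active_connections.get)


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """The same client IP always maps to the same server (session affinity)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


rr = round_robin(["A", "B", "C"])
print([next(rr) for _ in range(4)])                    # ['A', 'B', 'C', 'A']
print(least_connections({"A": 12, "B": 3, "C": 7}))    # B
print(ip_hash("203.0.113.7", ["A", "B", "C"]) == ip_hash("203.0.113.7", ["A", "B", "C"]))  # True
```

One caveat with the simple modulo hash: adding or removing a server changes the mapping for most clients, which is why production systems often use consistent hashing instead.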
Layer 4 vs Layer 7
Layer 4 (Transport)                 Layer 7 (Application)
├─ Based on IP + Port               ├─ Based on HTTP content
├─ Faster                           ├─ URL routing
└─ Simple                           ├─ Header-based routing
                                    └─ More flexible
16. Databases
SQL (Relational)
Examples: PostgreSQL, MySQL, SQLite
┌────────────────────────┐
│         users          │
├──────┬───────┬─────────┤
│ id   │ name  │ email   │
├──────┼───────┼─────────┤
│ 1    │ John  │ j@x.co  │
│ 2    │ Jane  │ j@y.co  │
└──────┴───────┴─────────┘
Properties: ACID
├─ Atomicity → All or nothing
├─ Consistency → Data always valid
├─ Isolation → Transactions don't interfere
└─ Durability → Committed data persists
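Atomicity and consistency are easiest to see with a money transfer. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `accounts` table, the `transfer` helper, and the balances are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule: no overdraft
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src: str, dst: str, amount: int) -> bool:
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint failed; the credit to dst was rolled back too


def balances() -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


print(transfer("alice", "bob", 30), balances())    # succeeds
print(transfer("alice", "bob", 500), balances())   # fails, balances unchanged
```

In the failed transfer the credit to the destination account had already been executed, yet it does not survive: that is the "all or nothing" guarantee.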
NoSQL
Document DB (MongoDB)
├─ JSON-like documents
├─ Flexible schema
└─ Good for: content management, catalogs
Key-Value (Redis)
├─ Simple key → value pairs
├─ Ultra fast
└─ Good for: caching, sessions
Wide-Column (Cassandra)
├─ Column families
├─ Massive scale
└─ Good for: time series, IoT
Graph DB (Neo4j)
├─ Nodes + relationships
└─ Good for: social networks, recommendations
When to Use SQL vs NoSQL?
SQL                       NoSQL
─────────────────────────────────────────────────
Structured data           Flexible schema
Complex queries (JOIN)    Simple queries
ACID required             Scale required
Relationships important   High throughput
Banking, E-commerce       Social media, IoT, Cache
17. Database Scaling
Vertical Scaling
Database Server
       ↑
   More CPU
   More RAM
   More Disk
Simple, but there is a ceiling on how large a single machine can get.
Horizontal Scaling: Replication
Master-Slave Replication (also called primary-replica)
        Master (Write)
              │
        ┌─────┴─────┐
        ▼           ▼
     Slave 1     Slave 2
     (Read)      (Read)
Use case: Read-heavy apps (for example 90% reads, 10% writes)
──────────────────────────────────────────────
Master-Master Replication
Master A ◄────► Master B
(Read/Write)    (Read/Write)
Use case: High availability, multi-region
Horizontal Scaling: Sharding
Sharding = splitting data across multiple DB servers
User ID 1 to 1M    User ID 1M+1 to 2M    User ID 2M+1 to 3M
      │                    │                     │
      ▼                    ▼                     ▼
   Shard 1              Shard 2               Shard 3
Pros: Massive scale
Cons: Complex; cross-shard queries are hard
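Range-based sharding as drawn above boils down to a lookup from user ID to shard. A minimal sketch, assuming three shards of one million IDs each; the shard names and the `shard_for` function are hypothetical.

```python
import bisect

# Inclusive upper bound of each shard's user-ID range (hypothetical layout).
SHARD_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
SHARD_NAMES = ["shard-1", "shard-2", "shard-3"]


def shard_for(user_id: int) -> str:
    """Return the shard responsible for a user ID under range-based sharding."""
    if user_id < 1 or user_id > SHARD_UPPER_BOUNDS[-1]:
        raise ValueError(f"user_id {user_id} is outside every shard range")
    # bisect_left finds the first range whose upper bound is >= user_id.
    return SHARD_NAMES[bisect.bisect_left(SHARD_UPPER_BOUNDS, user_id)]


for uid in (1, 1_000_000, 1_000_001, 2_500_000):
    print(uid, "->", shard_for(uid))
```

A query for a single user touches one shard, but a query such as "all users who signed up today" has to fan out to every shard, which is the cross-shard cost mentioned above.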
Database Performance: 3 Key Techniques
1. Caching
├─ Cache query results in Redis
└─ Dramatically reduces DB load
2. Indexing
├─ Create indexes on frequently queried columns
├─ SELECT * FROM users WHERE email = ?
│  → With an index on email: ~1ms (illustrative)
│  → Without an index: full table scan, ~1000ms (illustrative)
└─ Tradeoff: slower writes, extra storage
3. Query Optimization
├─ EXPLAIN ANALYZE
├─ Avoid SELECT *
├─ Avoid N+1 queries
└─ Paginate instead of loading everything
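The effect of an index shows up in the query plan even before any timing is done. The sketch below uses SQLite through Python's `sqlite3` module, where the equivalent of `EXPLAIN ANALYZE` is `EXPLAIN QUERY PLAN`; the table contents and the index name `idx_users_email` are made up for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

# Select only the needed columns rather than SELECT *.
QUERY = "SELECT id, name FROM users WHERE email = ?"


def query_plan() -> str:
    """Return SQLite's description of how it would execute QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("user42@example.com",)).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the plan text


plan_without_index = query_plan()               # expected: a full table SCAN
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = query_plan()                  # expected: a SEARCH using the index

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The same habit applies to PostgreSQL or MySQL: read the plan first, then decide whether an index is worth its write and storage cost.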
18. Putting It Together: System Design Building Blocks
┌──────────────────────────────────────────────┐
│        SYSTEM DESIGN BUILDING BLOCKS         │
│                                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│  │  Client  │  │   CDN    │  │   DNS    │    │
│  │ (Browser)│  │          │  │          │    │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘    │
│       │             │             │          │
│       └─────────────┴─────────────┘          │
│                     │                        │
│             ┌───────┴────────┐               │
│             │ Load Balancer  │               │
│             └───────┬────────┘               │
│                ┌────┼────┐                   │
│                ▼    ▼    ▼                   │
│           App Servers (cluster)              │
│                │    │    │                   │
│                └────┼────┘                   │
│                     │                        │
│          ┌──────────┼──────────┐             │
│          ▼          ▼          ▼             │
│        Cache    Database  Message Queue      │
│       (Redis)  (PostgreSQL)  (RabbitMQ)      │
│                     │                        │
│                ┌────┴────┐                   │
│                ▼         ▼                   │
│             Primary   Replica                │
│                                              │
└──────────────────────────────────────────────┘
19. Self-Assessment Checklist
Computer Architecture
- Do you understand the memory hierarchy (Cache → RAM → SSD → HDD)?
- Do you know why caching matters?
Networking
- Do you understand TCP vs UDP and when to use each?
- Do you know how DNS works?
- Can you tell the HTTP status code classes apart?
API Design
- Do you know the REST vs GraphQL vs gRPC tradeoffs?
- Can you design a well-formed RESTful API?
Caching & CDN
- Do you understand the cache hit/miss flow?
- Do you know the cache strategies (Cache-Aside, Write-Through)?
- Do you understand which problems a CDN solves?
Load Balancing
- Do you know the load balancing algorithms?
- Can you distinguish Layer 4 from Layer 7?
Databases
- Do you know when to use SQL vs NoSQL?
- Do you understand the ACID properties?
- Do you know the difference between replication and sharding?
- Do you understand the CAP theorem and how to apply it?
Overall System Design
- Can you draw an architecture diagram for a production system?
- Do you know the availability levels (99.9%, 99.99%)?
- Can you distinguish throughput from latency?
References
- Book: Designing Data-Intensive Applications, by Martin Kleppmann
- Book: System Design Interview, by Alex Xu
- Free: system-design-primer (GitHub)
- Free: ByteByteGo Newsletter
- Video: System Design for Beginners, by NeetCode
- Practice: GreatFrontEnd System Design
Summary
For a system design interview you need to understand:
1. Computer Architecture → Memory hierarchy, bottlenecks
2. Networking → TCP/UDP, DNS, HTTP
3. APIs → REST, GraphQL, gRPC
4. Caching → Redis, CDN, strategies
5. Load Balancing → Algorithms, L4 vs L7
6. Databases → SQL vs NoSQL, ACID, replication
7. Distributed Systems → CAP theorem, consistency models
8. Scalability → Vertical vs Horizontal
9. Reliability → Availability, redundancy, failover
Frontend Engineer System Design Focus:
├─ CDN & Caching strategies (they directly affect UX)
├─ API Design (REST/GraphQL, which you consume every day)
├─ Load Balancing (understand why your request reaches the right server)
├─ Database basics (understand the data model to design better UIs)
└─ Monitoring & Observability (debug production issues)
"A system is only as strong as its weakest component."
Previous article: Part 1, Engineering Mindset: Beyond Beautiful Code, on professional engineering thinking, business impact, and the principle of simplicity.