video - System Design Explained: APIs, Databases, Caching, CDNs, Load Balancing & Production Infra

🎯 Why Do Seniors Need to Know System Design?

You may be very good at writing Angular or React code, but without an understanding of system design you are a feature developer, not a senior engineer.

Common senior interview questions:

  • “Design a real-time chat system”
  • “Design a URL shortener like bit.ly”
  • “Design a news feed like Facebook”

To answer well, you need to understand the whole system, not just the frontend.

This is a condensed summary, organized around the way a senior engineer thinks about system design.


🔥 1. How Does Senior Thinking Differ from Mid-Level?

Comparison

Mid-Level Developer                   Senior Developer
Codes to requirements                 Designs from scratch
Adds features to an existing system   Decides the architecture
Follows a clear spec                  Challenges requirements
Focuses on implementation             Focuses on scalability & trade-offs
Thinks about code                     Thinks about the system

Mindset Shift

MID:
"How do I implement this feature?"

SENIOR:
"How does this feature affect:
├─ Performance?
├─ Scalability?
├─ Maintenance?
├─ Cost?
└─ User experience?"

🏗️ 2. Start from a Single Server

Basic Model

User (Web / Mobile)
        ↓
     DNS
        ↓
   Single Server
  ├─ Web App
  ├─ API
  ├─ Database
  └─ Cache

Detailed Request Flow

1. User types: example.com
        ↓
2. DNS lookup: example.com → server IP (e.g. 203.0.113.10)
        ↓
3. TCP handshake (3-way)
        ↓
4. HTTP Request
        ↓
5. Server processes the request
        ↓
6. Query Database
        ↓
7. Generate Response (HTML/JSON)
        ↓
8. Send back to User

⚠️ Problems with a Single Server

LIMITATIONS:
├─ Cannot scale past one machine's maximum RAM/CPU
├─ Single Point of Failure (SPOF)
│  └─ Server down = entire app down
├─ No redundancy
└─ Slows down as traffic grows

🗄️ 3. Database Design

🔹 SQL (RDBMS)

Characteristics

Property    Description
Schema      Fixed, structured
ACID        Atomicity, Consistency, Isolation, Durability
Relations   Powerful JOINs
Use case    Banking, e-commerce, transactions

Example

-- Users table
CREATE TABLE users (
  id INT PRIMARY KEY,
  email VARCHAR(255) UNIQUE,
  name VARCHAR(100)
);

-- Orders table
CREATE TABLE orders (
  id INT PRIMARY KEY,
  user_id INT,
  total DECIMAL(10,2),
  FOREIGN KEY (user_id) REFERENCES users(id)
);

-- Query with JOIN
SELECT u.name, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.total > 1000;
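
To try the schema and the JOIN above end to end, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The sample rows and the helper name `big_spenders` are made up for illustration, and SQLite's type handling differs in places from other RDBMSs.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE users (
  id INTEGER PRIMARY KEY,
  email VARCHAR(255) UNIQUE,
  name VARCHAR(100)
);
CREATE TABLE orders (
  id INTEGER PRIMARY KEY,
  user_id INT,
  total DECIMAL(10,2),
  FOREIGN KEY (user_id) REFERENCES users(id)
);
""")

# Hypothetical sample data, not from the original notes.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "john@example.com", "John"), (2, "jane@example.com", "Jane")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 1500.5), (2, 1, 200.25), (3, 2, 2500.75)],
)


def big_spenders(min_total):
    """Return (name, total) for every order above min_total, smallest first."""
    rows = conn.execute(
        "SELECT u.name, o.total "
        "FROM users u JOIN orders o ON u.id = o.user_id "
        "WHERE o.total > ? ORDER BY o.total",
        (min_total,),
    )
    return rows.fetchall()
```

Calling `big_spenders(1000)` returns only the orders above the threshold, each paired with the name of the user who placed it.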

🔹 NoSQL

Categories

1. Document Database

MongoDB:
{
  "_id": "123",
  "name": "John",
  "orders": [
    {"id": 1, "total": 100},
    {"id": 2, "total": 200}
  ]
}

2. Key-Value Store

Redis:
user:123 → {"name": "John", "email": "john@example.com"}

3. Wide-Column Store

Cassandra:
Row key: user_123
Columns: {name, email, age, country, ...}

4. Graph Database

Neo4j:
(User)-[:FOLLOWS]->(User)
(User)-[:LIKES]->(Post)

NoSQL Advantages

✅ Easier horizontal scaling
✅ Flexible schema
✅ Low latency for simple key-based lookups (often milliseconds)
✅ High throughput
✅ Well suited to big data

🔥 SQL or NoSQL?

Choose SQL when:

├─ Data has a clear structure
├─ You need ACID transactions
├─ You need many JOINs
├─ Data integrity is critical
└─ Examples: Banking, Accounting, ERP

Choose NoSQL when:

├─ Data is large and unstructured
├─ You need horizontal scaling
├─ Low latency is a requirement
├─ You need a flexible schema
└─ Examples: Social media, IoT, real-time analytics

Hybrid Approach

Use both:
├─ SQL for: User, Order, Transaction
└─ NoSQL for: Product catalog, Logs, Cache

📈 4. Scaling Strategies

Vertical Scaling (Scale Up)

BEFORE:
Server: 4 CPU, 8GB RAM

AFTER:
Server: 16 CPU, 64GB RAM

✅ Advantages

  • Simple
  • No code changes required
  • No distributed system complexity

❌ Disadvantages

  • Bounded by physical hardware limits
  • Expensive
  • Single Point of Failure
  • Downtime during upgrades

Horizontal Scaling (Scale Out)

BEFORE:
1 Server (16 CPU, 64GB RAM)

AFTER:
4 Servers (4 CPU, 16GB RAM each)

Architecture

           Load Balancer
                ↓
      ┌─────────┼─────────┐
   Server1   Server2   Server3
      │         │         │
      └─────────┼─────────┘
                ↓
            Database

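✅ Advantages

  • Fault tolerant (1 server down ≠ system down)
  • Capacity grows by adding servers, with no single-machine ceiling (shared components such as the database can still become the bottleneck)
  • Suited to high traffic
  • Cost-effective (many cheap servers)

❌ Disadvantages

  • More complex
  • Requires a load balancer
  • Session management is harder
  • Data consistency challenges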

⚖️ 5. Load Balancing

Role

Load Balancer = Traffic Cop
├─ Distribute requests evenly
├─ Health check servers
├─ Remove unhealthy servers
└─ Add new servers dynamically

Common Algorithms

1. Round Robin

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A (cycle repeats)

Pros: Simple, fair
Cons: Ignores current load
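
The rotation above can be sketched in a few lines. This is only an illustration: the class name `RoundRobinBalancer` is hypothetical, and a real load balancer would also handle servers joining and leaving the pool.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring their current load."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def pick(self):
        # Each call returns the next server and wraps around at the end.
        return next(self._rotation)


balancer = RoundRobinBalancer(["Server A", "Server B", "Server C"])
picks = [balancer.pick() for _ in range(4)]
```

After four picks the rotation has wrapped around, so the fourth request lands on Server A again.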

2. Least Connections

Server A: 10 connections
Server B: 5 connections
Server C: 8 connections

→ New request → Server B

Pros: Balances load better
Cons: Must track connections
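
A minimal sketch of the selection rule, assuming the balancer keeps a counter of open connections per server. The class and method names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Routes each new request to the server with the fewest open connections."""

    def __init__(self, active_connections):
        # Maps server name -> number of currently open connections.
        self.active = dict(active_connections)

    def acquire(self):
        # Pick the least-loaded server and count the new connection against it.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection closes so the counter stays accurate.
        if self.active[server] > 0:
            self.active[server] -= 1


lb = LeastConnectionsBalancer({"Server A": 10, "Server B": 5, "Server C": 8})
chosen = lb.acquire()
```

With the counts from the example, the new request goes to Server B, whose counter then rises to 6.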

3. Least Response Time

Server A: 100ms avg
Server B: 50ms avg
Server C: 80ms avg

→ New request → Server B

Pros: Favors the fastest servers
Cons: Complex monitoring

4. IP Hash

User IP: 192.168.1.100
→ hash(192.168.1.100) % 3 = 1
→ Always route to Server 1

Pros: Sticky sessions
Cons: Unbalanced if the IP distribution is skewed
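
One way to sketch the hash-and-modulo step. SHA-256 is used here as an assumption so that the mapping is stable across processes (Python's built-in `hash()` is randomized per process for strings). The function name `pick_server` is hypothetical.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to one server; the same IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["Server 0", "Server 1", "Server 2"]
first = pick_server("192.168.1.100", servers)
second = pick_server("192.168.1.100", servers)
```

Note that changing the number of servers changes the modulus, so most clients get remapped. Consistent hashing is the usual remedy when that matters.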

5. Weighted

Server A: 50% capacity → weight 5
Server B: 30% capacity → weight 3
Server C: 20% capacity → weight 2

→ Distribute in a 5:3:2 ratio
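
A simple way to get the 5:3:2 split is to repeat each server in the rotation as many times as its weight. This is a naive sketch with a hypothetical helper name: it sends requests in bursts (five to A, then three to B, then two to C), whereas production balancers typically interleave the picks more smoothly.

```python
from collections import Counter
from itertools import cycle


def weighted_cycle(weights):
    """Build an endless rotation in which each server appears `weight` times per cycle."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


rotation = weighted_cycle({"Server A": 5, "Server B": 3, "Server C": 2})

# Route 100 requests and count where they land.
counts = Counter(next(rotation) for _ in range(100))
```

Over 100 requests (ten full cycles) the counts come out in exactly the configured proportions.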

6. Geographic

Users from Asia → Singapore server
Users from the US → Oregon server
Users from the EU → Frankfurt server

Health Check

Load Balancer
    ↓
Every 10s: Ping /health endpoint
    ↓
If response OK → Keep routing
If fail 3 times → Stop routing
    ↓
Auto retry every 30s
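
The "fail 3 times, then stop routing" rule can be sketched as a small state tracker. The probing itself (an HTTP call to /health on a timer) is left out; the class name and the `record` method are hypothetical, and the code assumes that a single successful probe puts the server back in rotation.

```python
class HealthTracker:
    """Tracks consecutive probe failures for one backend server."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.in_rotation = True

    def record(self, probe_ok):
        """Record one probe result and return whether the server should receive traffic."""
        if probe_ok:
            # Assumption: one success resets the counter and restores the server.
            self.consecutive_failures = 0
            self.in_rotation = True
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.in_rotation = False
        return self.in_rotation


tracker = HealthTracker(failure_threshold=3)
history = [tracker.record(ok) for ok in (False, False, False, True)]
```

Two failures leave the server in rotation, the third removes it, and the later successful probe restores it.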

💣 6. Single Point of Failure (SPOF)

Definition

SPOF = a component whose failure takes down the entire system

SPOF Examples

ARCHITECTURE:
Users → LB → API → Single Database

SPOF:
├─ Load Balancer (only 1)
├─ Database (only 1)
└─ If the DB fails → the entire API fails

Solutions

1. Database Replication

         API Servers
              ↓
        ┌─────┴─────┐
     Primary      Replicas
     (Write)      (Read)
        │            │
        └────────────┘
     Replicate continuously (replicas may lag under asynchronous replication)

2. Multiple Load Balancers

         DNS (Round Robin)
              ↓
      ┌───────┴───────┐
   LB-1             LB-2
      │               │
      └───────┬───────┘
           Servers

3. Self-Healing Systems

Health Check detects failure
        ↓
Auto restart container/VM
        ↓
If still fail → Alert humans

🌐 7. API Design

What Is an API?

A contract between client and server

3 Main Styles

1️⃣ REST (Representational State Transfer)

Characteristics:

├─ Resource-based (/users, /products)
├─ HTTP methods (GET, POST, PUT, DELETE, PATCH)
├─ Stateless (each request is independent)
└─ Standard HTTP status codes

Example:

GET    /api/v1/products          → List all
POST   /api/v1/products          → Create
GET    /api/v1/products/123      → Get one
PUT    /api/v1/products/123      → Update
DELETE /api/v1/products/123      → Delete

RESTful Best Practices:

✅ Use nouns, not verbs
   /products (not /getProducts)

✅ Versioning
   /api/v1/products

✅ Filtering & Pagination
   /products?category=tech&page=2&limit=10

✅ HATEOAS (Hypermedia)
   Responses include links to next actions

2️⃣ GraphQL

Characteristics:

├─ A single endpoint (/graphql)
├─ Client defines the shape of the response
├─ Avoids overfetching/underfetching
└─ Strongly typed

Example:

# Query
query {
  user(id: "123") {
    name
    email
    posts(limit: 5) {
      title
      createdAt
    }
  }
}

# Mutation
mutation {
  createPost(title: "Hello", content: "World") {
    id
    title
  }
}

# Subscription (Real-time)
subscription {
  newMessage {
    id
    text
    user {
      name
    }
  }
}

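Advantages:

✅ Avoids overfetching (fetch only the fields you need)
✅ Avoids multiple requests (one query fetches everything)
✅ Strong typing (schema validation)
✅ Introspection (self-documenting)

Disadvantages:

❌ More complex than REST
❌ Harder to cache (queries usually POST to one endpoint, so plain HTTP caching does not apply by default)
❌ Complex queries can cause performance issues
❌ Steep learning curve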

3️⃣ gRPC

Characteristics:

├─ Protocol Buffers (binary, compact)
├─ HTTP/2 (multiplexing, streaming)
├─ Strongly typed
└─ Suited to internal microservice communication

Example:

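// user.proto
syntax = "proto3";

service UserService {
  rpc GetUser (UserRequest) returns (UserResponse);
  // Server-streaming RPC: the server sends back a stream of users
  rpc ListUsers (Empty) returns (stream UserResponse);
}

// Empty request message used by ListUsers
message Empty {}

message UserRequest {
  string user_id = 1;
}

message UserResponse {
  string id = 1;
  string name = 2;
  string email = 3;
}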

When to use:

✅ Internal microservices communication
✅ High performance requirement
✅ Binary protocol (small & fast)
✅ Bidirectional streaming

❌ Browsers do not support it natively (requires gRPC-Web)

REST vs GraphQL vs gRPC Comparison

Criterion   REST                 GraphQL              gRPC
Protocol    HTTP (any version)   HTTP (any version)   HTTP/2
Format      JSON (typically)     JSON                 Protocol Buffers
Endpoints   Multiple             Single               Service methods
Caching     Easy                 Hard                 N/A
Browser     Full support         Full support         Needs gRPC-Web
Use case    Public API           Complex queries      Internal services

📡 8. API Protocols

HTTP/HTTPS (Request-Response)

Client → Request → Server
Client ← Response ← Server

Characteristics:
├─ Stateless
├─ Text-based (HTTP/1.1; HTTP/2 uses binary framing)
└─ High overhead for real-time use

When to use:

  • RESTful API
  • Traditional web apps
  • CRUD operations

WebSocket (Full-Duplex)

Client ↔ Persistent Connection ↔ Server

Characteristics:
├─ Bidirectional
├─ Real-time
└─ Low latency

When to use:

  • Chat applications
  • Live notifications
  • Collaborative editing (Google Docs)
  • Live sports scores

Server-Sent Events (SSE)

Client ← Stream ← Server
(One direction only)

Characteristics:
├─ Server push to client
├─ HTTP-based
└─ Auto reconnect

When to use:

  • Live feed updates
  • Stock prices
  • News ticker

AMQP (Message Queue)

Producer → Queue → Consumer

Characteristics:
├─ Async messaging
├─ Reliable delivery
└─ Decoupled systems

When to use:

  • Background jobs
  • Email sending
  • Order processing
  • Microservices communication

Comparison Table

Protocol    Direction          Use Case          Real-time
HTTP        Request-Response   CRUD API          No
WebSocket   Bidirectional      Chat, Gaming      Yes
SSE         Server→Client      Live feed         Yes
gRPC        Bidirectional      Microservices     Yes (streaming)
AMQP        Async Queue        Background jobs   No (asynchronous)

🚀 9. Transport Layer

TCP (Transmission Control Protocol)

Characteristics:

✅ Reliable (guaranteed delivery)
✅ Ordered (packets arrive in order)
✅ Error checking
✅ 3-way handshake
✅ Congestion control

3-Way Handshake:

Client → SYN → Server
Client ← SYN-ACK ← Server
Client → ACK → Server
→ Connection established

When to use:

  • HTTP/HTTPS
  • Email (SMTP)
  • File transfer (FTP)
  • Banking transactions

UDP (User Datagram Protocol)

Characteristics:

✅ Fast (no handshake)
✅ Low latency
❌ Unreliable (packets may be lost)
❌ No ordering guarantee

When to use:

  • Real-time video streaming (on-demand video over HLS/DASH usually runs on TCP)
  • Online gaming
  • VoIP (Skype, Zoom)
  • DNS queries
  • Live broadcasting

TCP vs UDP Comparison

Criterion     TCP                      UDP
Reliability   ✅ Guaranteed            ❌ Best effort
Ordering      ✅ Yes                   ❌ No
Speed         Slower                   Faster
Overhead      High                     Low
Use case      Banking, File transfer   Gaming, Streaming

🧱 10. RESTful Best Practices

Resource Modeling

✅ Good Examples

GET    /users
POST   /users
GET    /users/123
PUT    /users/123
DELETE /users/123

GET    /users/123/orders
POST   /users/123/orders

❌ Bad Examples

/getUsers
/createUser
/deleteUser
/getUserOrders

Filtering, Sorting, Pagination

# Filtering
GET /products?category=electronics&brand=samsung

# Sorting
GET /products?sort=price_asc
GET /products?sort=created_at_desc

# Pagination
GET /products?page=2&limit=20
GET /products?offset=40&limit=20

# Combined
GET /products?category=tech&sort=price_asc&page=1&limit=10
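
The two pagination styles above are related by a simple formula: offset = (page - 1) × limit. So `page=2&limit=20` corresponds to `offset=20`, while `offset=40&limit=20` is page 3. A minimal sketch of that conversion follows; the function names and the `max_limit` cap are assumptions, not part of any particular framework.

```python
def page_to_offset(page, limit, max_limit=100):
    """Convert 1-based page/limit parameters into an (offset, limit) pair."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive")
    limit = min(limit, max_limit)  # cap the page size so clients cannot request everything
    return (page - 1) * limit, limit


def paginate(items, page, limit):
    """Return the slice of `items` that belongs to the requested page."""
    offset, limit = page_to_offset(page, limit)
    return items[offset:offset + limit]
```

In SQL the same pair would typically feed a `LIMIT ... OFFSET ...` clause.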

HTTP Status Codes

Code   Meaning                 When to use
200    OK                      Success
201    Created                 Resource created successfully
204    No Content              Delete success
400    Bad Request             Invalid input
401    Unauthorized            Not authenticated
403    Forbidden               Authenticated but no permission
404    Not Found               Resource doesn’t exist
409    Conflict                Duplicate resource
422    Unprocessable Entity    Validation failed
500    Internal Server Error   Server bug
503    Service Unavailable     Maintenance/Overload

API Versioning

Option 1: URL versioning
/api/v1/products
/api/v2/products

Option 2: Header versioning
Accept: application/vnd.myapi.v1+json

Option 3: Query parameter
/api/products?version=1

Recommended: URL versioning (the most explicit)


🔐 11. Authentication

Basic Auth

Authorization: Basic base64(username:password)

Example:
Authorization: Basic am9objpzZWNyZXQ=

Pros: Simple
Cons: Insecure without HTTPS (Base64 is an encoding, not encryption, so anyone can decode it)
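
To see why Base64 offers no protection, the sketch below builds the header from the example and then reverses it with the standard library. The function names are hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of the Authorization header for HTTP Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic auth header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```

Encoding `john` / `secret` yields the header shown in the example, and decoding it returns the credentials in clear text, which is why Basic auth should only travel over HTTPS.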


Bearer Token

Authorization: Bearer <token>

Example:
Authorization: Bearer eyJhbGciOiJIUzI1NiIs...

OAuth2 + JWT Flow

1. User clicks "Login with Google"
        ↓
2. Redirect to Google OAuth
        ↓
3. User logs in & approves
        ↓
4. Google returns an authorization code
        ↓
5. Exchange the code for an access token
        ↓
6. Use the access token to call the API

JWT Structure (decoded; on the wire a JWT is three base64url segments joined by dots: header.payload.signature):

{
  "header": {
    "alg": "HS256",
    "typ": "JWT"
  },
  "payload": {
    "userId": "123",
    "role": "admin",
    "exp": 1234567890
  },
  "signature": "..."
}
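
As an illustration of how the three parts fit together, here is a minimal HS256 sign/verify sketch using only the standard library. It is a teaching aid under stated assumptions, not a replacement for a maintained JWT library: the function names are hypothetical, only HS256 is handled, and `exp` is the only claim checked.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    # base64url without padding, as used by JWT
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    # Restore the padding that JWT strips before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(payload, secret):
    """Return header.payload.signature for the given payload (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        _b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_jwt(token, secret, now=None):
    """Check signature, algorithm and expiry; return the payload or raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison to avoid leaking information through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if payload.get("exp", float("inf")) <= current:
        raise ValueError("token expired")
    return payload


token = sign_jwt({"userId": "123", "role": "admin", "exp": 2000000000}, b"demo-secret")
```

The header and payload are only encoded, not encrypted, so anyone holding the token can read them. The signature only proves they were not modified.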

Access + Refresh Token Pattern

Login success
    ↓
Return: Access Token (15 min) + Refresh Token (7 days)
    ↓
Use Access Token for API calls
    ↓
Access Token expired
    ↓
Use Refresh Token to get new Access Token
    ↓
Refresh Token expired → Re-login

Why:

✅ Security: Short-lived access token
✅ UX: No need to log in again frequently
✅ Revocable: The refresh token can be revoked

🔒 12. Authorization

RBAC (Role-Based Access Control)

ROLES:
├─ Admin (full access)
├─ Editor (read + write)
└─ Viewer (read only)

USER → ROLE → PERMISSIONS

Example:
John → Admin → [CREATE, READ, UPDATE, DELETE]
Jane → Editor → [READ, UPDATE]
Bob → Viewer → [READ]
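
The USER → ROLE → PERMISSIONS lookup can be sketched with two dictionaries. The table and function names are hypothetical; a real system would load these mappings from a database or a policy service.

```python
# Role -> set of permitted actions (taken from the example above).
ROLE_PERMISSIONS = {
    "admin": {"CREATE", "READ", "UPDATE", "DELETE"},
    "editor": {"READ", "UPDATE"},
    "viewer": {"READ"},
}

# User -> role assignment.
USER_ROLES = {"John": "admin", "Jane": "editor", "Bob": "viewer"}


def is_allowed(user, action):
    """Return True if the user's role grants the action; unknown users get nothing."""
    role = USER_ROLES.get(user)
    return action in ROLE_PERMISSIONS.get(role, set())
```

Permissions are attached to roles, never to individual users, so changing what editors may do is a one-line change.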

ABAC (Attribute-Based Access Control)

ATTRIBUTES:
├─ User attributes (department, seniority)
├─ Resource attributes (owner, confidential)
├─ Environment (time, location, device)
└─ Action (read, write, delete)

RULE:
IF user.department == "Finance"
AND resource.type == "Invoice"
AND time.hour >= 9 AND time.hour <= 17
THEN allow READ
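
The rule above can be written as a plain predicate over the request's attributes. The `AccessRequest` type and its field names are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    department: str      # user attribute
    resource_type: str   # resource attribute
    hour: int            # environment attribute (0-23, local time)
    action: str          # requested action


def finance_invoice_rule(request):
    """Allow READ on invoices for Finance staff between 09:00 and 17:59."""
    return (
        request.department == "Finance"
        and request.resource_type == "Invoice"
        and 9 <= request.hour <= 17
        and request.action == "READ"
    )
```

Unlike RBAC, the decision depends on the context of each request, so the same user can be allowed at 10:00 and denied at 20:00.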

ACL (Access Control List)

RESOURCE-CENTRIC:

Document A:
├─ Alice → Owner (full control)
├─ Bob → Editor (read + write)
└─ Charlie → Viewer (read only)

Document B:
├─ Bob → Owner
└─ Alice → Viewer

Example: Google Docs permissions


🛡️ 13. API Security

7 Key Techniques

1. Rate Limiting

Limit: 100 requests / minute / IP

Request 101 → 429 Too Many Requests

Implementation:
├─ Token bucket algorithm
├─ Redis counter
└─ API Gateway (AWS, Kong)
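
A minimal in-memory sketch of the token bucket algorithm, assuming one bucket per client IP. The class name and the injectable `clock` parameter are choices made for this example; a multi-server deployment would keep the counters in a shared store such as Redis.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and refills at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock            # injectable so the bucket can be tested without sleeping
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        """Consume one token if available; False means respond with 429."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Add the tokens earned since the last call, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# 100 requests per minute: capacity 100, refilled at 100/60 tokens per second.
fake_now = [0.0]
bucket = TokenBucket(100, 100 / 60, clock=lambda: fake_now[0])
results = [bucket.allow() for _ in range(101)]
```

With no time passing, the first 100 calls succeed and the 101st is rejected, matching the "Request 101 → 429" example.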

2. CORS (Cross-Origin Resource Sharing)

Frontend: https://app.example.com
API: https://api.example.com

Without CORS headers → browser blocks the cross-origin request (or hides its response from the page)

With CORS:
Access-Control-Allow-Origin: https://app.example.com
Access-Control-Allow-Methods: GET, POST
Access-Control-Allow-Headers: Authorization

3. SQL Injection Prevention

❌ Vulnerable:
query = "SELECT * FROM users WHERE id = " + userId

✅ Safe:
query = "SELECT * FROM users WHERE id = ?"
execute(query, [userId])
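
The difference can be demonstrated with Python's built-in sqlite3 module. The table contents and function names are made up for this sketch, and the unsafe variant is shown only to make the attack visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "John"), (2, "Jane")])


def find_user_unsafe(user_id):
    # Vulnerable: the input becomes part of the SQL text itself.
    return conn.execute("SELECT * FROM users WHERE id = " + user_id).fetchall()


def find_user_safe(user_id):
    # Safe: the input is bound as a value and is never parsed as SQL.
    return conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchall()


malicious = "1 OR 1=1"
leaked = find_user_unsafe(malicious)   # the OR clause matches every row
blocked = find_user_safe(malicious)    # compared as a literal value, matches nothing
```

With string concatenation the attacker's `OR 1=1` rewrites the query and returns every user. With a bound parameter the same input is just a value that matches no id.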

4. XSS (Cross-Site Scripting) Prevention

❌ Vulnerable:
element.innerHTML = userInput;

✅ Safe:
element.textContent = userInput;
// Or sanitize with DOMPurify before assigning to innerHTML

5. CSRF (Cross-Site Request Forgery) Protection

Solution:
├─ CSRF Token in form/header
├─ SameSite cookie attribute
└─ Verify Origin/Referer header

6. Input Validation

✅ Validate input:
├─ Type checking
├─ Length limits
├─ Format (email, phone)
├─ Whitelist allowed values
└─ Sanitize before use

7. HTTPS Only

✅ Enforce HTTPS:
├─ Encrypt data in transit
├─ HSTS header
└─ Redirect HTTP → HTTPS

🧠 14. The Complete System Design Flow

                ┌──────────────┐
                │    Users     │
                └──────┬───────┘
                       ↓
                    DNS
                 (Domain → IP)
                       ↓
                Load Balancer
            (Health check + distribute)
                       ↓
        ┌──────────────┼──────────────┐
     API Server 1   API Server 2   API Server 3
   (Stateless)     (Stateless)    (Stateless)
        │               │               │
        └──────────────┬──────────────┘
                       ↓
                  Database Layer
           ┌────────────┼────────────┐
        Primary      Replica       Cache
       (Write)      (Read)        (Redis)
                       │
                       ↓
                Message Queue
              (RabbitMQ / Kafka)
                       ↓
                 Background Jobs
            (Email, Processing, Cleanup)

🎯 15. Key Takeaways

To reach senior-level system design, you need to:

1. UNDERSTAND THE REQUEST FLOW
   DNS → TCP → HTTP → API → DB → Response

2. UNDERSTAND TRADE-OFFS
   ├─ SQL vs NoSQL
   ├─ REST vs GraphQL vs gRPC
   ├─ TCP vs UDP
   ├─ Vertical vs Horizontal scaling
   └─ Consistency vs Availability (CAP theorem)

3. ELIMINATE SPOFs
   ├─ Replication
   ├─ Redundancy
   └─ Health checks

4. DESIGN CLEAN + SECURE APIs
   ├─ RESTful principles
   ├─ Proper status codes
   ├─ Versioning
   └─ Security best practices

5. UNDERSTAND AUTHENTICATION VS AUTHORIZATION
   ├─ Authentication: Who are you?
   └─ Authorization: What are you allowed to do?

💪 Self-Assessment Checklist

Technical Foundation

  • Can you explain the request flow from browser to server?
  • Can you tell TCP and UDP apart?
  • Do you understand the DNS lookup process?
  • Do you know the common HTTP status codes?

Database

  • Do you know when to use SQL vs NoSQL?
  • Do you understand what ACID means?
  • Do you know the types of NoSQL (Document, Key-Value, Wide-Column, Graph)?
  • Can you design a database schema?

Scaling

  • Can you distinguish vertical from horizontal scaling?
  • Do you understand load balancing algorithms?
  • Do you know how to eliminate SPOFs?
  • Do you understand database replication?

API Design

  • Can you design a RESTful API?
  • Do you know when to use GraphQL?
  • Do you understand gRPC use cases?
  • Can you apply API best practices?

Security

  • Can you implement an authentication flow?
  • Can you distinguish RBAC vs ABAC vs ACL?
  • Do you know the 7 API security techniques?
  • Can you prevent common vulnerabilities (SQL injection, XSS, CSRF)?

📚 References

  • Book: “Designing Data-Intensive Applications” - Martin Kleppmann
  • Book: “System Design Interview” - Alex Xu
  • Resource: systemdesignprimer.com
  • Practice: leetcode.com/discuss/interview-question/system-design

💡 The Bottom Line

A senior does not just know how to code.
A senior understands THE ENTIRE SYSTEM.

From DNS lookup to database query,
from load balancer to message queue,
from authentication to authorization.

System Design = Senior Mindset.

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand. Great engineers design systems that scale.” - Martin Fowler (adapted)