Design Netflix
Optimizing global video streams using localized ISP-embedded Open Connect hardware caches.
What you'll learn
- Open Connect CDN (ISP Colocation)
- Content Pre-Staging (Proactive Push)
- Microservices on AWS (200+ Services)
- Recommendation Engine (Collaborative Filtering)
- AV1 Codec + Per-Title Encoding
- A/B Testing at Scale
Visual System Topology
Netflix — Streaming Platform Architecture
Data plane: Client → nearest Open Connect ISP box → HLS/DASH video stream (no AWS involved)
Concept Overview
Netflix serves 250M+ subscribers and streams 200M+ hours of video on peak days. Its defining architectural decision is to separate the control plane (microservices on AWS) from the data plane (custom Open Connect CDN hardware colocated at ISPs). AWS handles everything up to returning the playback URL; from that point on, the video bytes are served by Open Connect and never touch AWS.
Functional Requirements:
- Browse movie/TV catalog with personalized recommendations
- Stream video with adaptive quality and minimal rebuffering
- Download for offline viewing
- Continue watching across devices
- Multiple user profiles per account
- Creator tools (studio/production partner uploads)
Non-Functional Requirements:
- < 0.5% rebuffering rate — smooth streaming is the #1 metric
- 25M+ concurrent streams at peak (evenings)
- Multi-device support (TV, phone, browser, game console)
- 99.99% availability for streaming
Capacity Estimation (250M subscribers):
- Active streams at peak: 25M concurrent × avg 4 Mbps = 100 Tbps of bandwidth
- Storage: 36,000 titles × avg 5 GB/quality level × 5 quality levels ≈ 900 TB encoded
- Control plane API: login, catalog, search — millions of requests/hour (AWS-based)
- Open Connect traffic: 95% of 100 Tbps = 95 Tbps served from ISP-colocated hardware
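The estimates above are simple multiplications, and writing them out makes the unit conversions explicit. The sketch below only reuses the figures quoted in this section; the variable names are illustrative.

```python
# Back-of-envelope capacity check using the figures quoted above.
CONCURRENT_STREAMS = 25_000_000   # peak concurrent streams
AVG_BITRATE_MBPS = 4              # average bitrate per stream
TITLES = 36_000
GB_PER_QUALITY_LEVEL = 5          # average encoded size per quality level
QUALITY_LEVELS = 5
OPEN_CONNECT_SHARE = 0.95         # fraction of traffic served from ISP-colocated hardware

# 1 Tbps = 1,000,000 Mbps
peak_tbps = CONCURRENT_STREAMS * AVG_BITRATE_MBPS / 1_000_000
# 1 TB = 1,000 GB
storage_tb = TITLES * GB_PER_QUALITY_LEVEL * QUALITY_LEVELS / 1_000
open_connect_tbps = peak_tbps * OPEN_CONNECT_SHARE

print(f"Peak bandwidth:      {peak_tbps:.0f} Tbps")
print(f"Encoded storage:     {storage_tb:.0f} TB")
print(f"Open Connect share:  {open_connect_tbps:.0f} Tbps")
```

Note that the storage figure counts a single copy of the catalog; every appliance that holds a title adds another copy on top of that.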
Key Architectural Pillars
Open Connect CDN (ISP Colocation)
Netflix builds and operates its own CDN hardware, called Open Connect Appliances (OCAs). These are custom-built servers (8–250 TB of storage, 100 Gbps network interfaces) installed inside ISP data centers and at internet exchange points worldwide. When a subscriber in New York watches Stranger Things, the video bytes travel only from the ISP's machine room to the subscriber's home, without crossing internet backbone links or touching AWS. Serving from inside the ISP's own network is what keeps rebuffering low even during peak hours.
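To make the control-plane/data-plane split concrete, here is a minimal sketch of how a playback service might rank appliances before handing URLs to the client. It is not Netflix's actual steering logic: the `Appliance` fields, the `max_load` cutoff, and the `example.net` URL pattern are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Appliance:
    oca_id: str
    location: str                      # ISP network name, or "ixp" for an exchange-point site
    titles: set = field(default_factory=set)
    healthy: bool = True
    load: float = 0.0                  # 0.0 idle .. 1.0 saturated (hypothetical metric)


def rank_appliances(client_isp, title, appliances, max_load=0.9):
    """Return usable appliances, best first.

    Usable = healthy, holds the title, and below the load cutoff.
    Appliances inside the client's ISP sort ahead of all others; ties
    are broken by lowest load.
    """
    usable = [
        a for a in appliances
        if a.healthy and title in a.titles and a.load < max_load
    ]
    return sorted(usable, key=lambda a: (a.location != client_isp, a.load))


def playback_urls(client_isp, title, appliances):
    """Build the ordered URL list the control plane would return (hypothetical URL scheme)."""
    ranked = rank_appliances(client_isp, title, appliances)
    return [f"https://{a.oca_id}.example.net/{title}/manifest.mpd" for a in ranked]


fleet = [
    Appliance("oca-nyc-1", "isp-a", {"stranger-things"}, load=0.7),
    Appliance("oca-nyc-2", "isp-a", {"stranger-things"}, load=0.3),
    Appliance("oca-ixp-1", "ixp", {"stranger-things"}, load=0.1),
    Appliance("oca-nyc-3", "isp-a", {"stranger-things"}, healthy=False),
    Appliance("oca-bos-1", "isp-b", {"other-title"}),
]

for url in playback_urls("isp-a", "stranger-things", fleet):
    print(url)
```

Returning an ordered list rather than a single URL lets the client fall back to the next appliance on its own if the first one fails mid-stream, without another round trip to the control plane.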
Content Pre-Staging (Proactive Push)
Netflix knows its release schedule in advance, and for its own originals it controls the launch date. Every night during off-peak hours, it pushes the video files for upcoming releases to all relevant Open Connect appliances worldwide. By the time subscribers click "play" on a new season premiere, the files are already on their ISP's hardware, which eliminates origin-server load spikes on launch night. For older content, ML models predict which titles will be watched this week, and those titles are pre-staged accordingly.
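A nightly push plan can be pictured as a small packing problem: each appliance has limited free space, new releases must land first, and the rest is filled with whatever is most likely to be watched. The sketch below uses a simple greedy rule (predicted plays per GB); the field names and the greedy heuristic are assumptions, not Netflix's actual fill algorithm.

```python
def plan_prestage(catalog, already_cached, free_gb):
    """Choose which titles to push to one appliance tonight.

    catalog: list of dicts with keys
        title, size_gb, predicted_plays, releases_tomorrow
    already_cached: set of titles the appliance already holds
    free_gb: free space available on the appliance

    Upcoming releases are considered first; remaining space is filled
    greedily by predicted plays per GB.
    """
    candidates = [t for t in catalog if t["title"] not in already_cached]
    candidates.sort(
        key=lambda t: (not t["releases_tomorrow"], -t["predicted_plays"] / t["size_gb"])
    )

    to_push = []
    for t in candidates:
        if t["size_gb"] <= free_gb:
            to_push.append(t["title"])
            free_gb -= t["size_gb"]
    return to_push


# Illustrative numbers only.
catalog = [
    {"title": "new-season", "size_gb": 40, "predicted_plays": 90_000, "releases_tomorrow": True},
    {"title": "old-hit", "size_gb": 20, "predicted_plays": 30_000, "releases_tomorrow": False},
    {"title": "niche-doc", "size_gb": 30, "predicted_plays": 600, "releases_tomorrow": False},
    {"title": "cached-show", "size_gb": 10, "predicted_plays": 50_000, "releases_tomorrow": False},
]

print(plan_prestage(catalog, already_cached={"cached-show"}, free_gb=70))
```

With 70 GB free, the new season and the older hit fit, while the niche documentary is skipped because only 10 GB remain.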
Microservices on AWS (200+ Services)
The control plane runs on AWS as 200+ microservices: Authentication, Profile, Catalog, Search, Recommendations, Playback URL generation, Analytics, Billing, A/B Testing, and so on. Netflix pioneered Chaos Engineering: deliberately injecting failures into production (Chaos Monkey terminates random service instances) to verify that the system stays resilient. Services communicate synchronously over REST and asynchronously through Kafka. Services are deployed across 3 AWS regions (us-east-1, us-west-2, eu-west-1) so that traffic can fail over if a region goes down.
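The idea behind Chaos Monkey can be shown with a toy, in-memory simulation: terminate a random running instance, then check that every service still has at least one instance up. This is not the real Chaos Monkey tool or its API; the fleet layout and function names below are hypothetical.

```python
import random


def terminate_random_instance(fleet, rng):
    """Mark one randomly chosen running instance as down.

    fleet maps service name -> {instance id: is_running}.
    Returns (service, instance), or None if nothing is running.
    """
    running = [
        (service, instance)
        for service, instances in fleet.items()
        for instance, up in instances.items()
        if up
    ]
    if not running:
        return None
    service, instance = rng.choice(running)
    fleet[service][instance] = False
    return service, instance


def services_without_capacity(fleet):
    """List services that have no running instance left."""
    return [s for s, instances in fleet.items() if not any(instances.values())]


fleet = {
    "catalog": {"catalog-1": True, "catalog-2": True, "catalog-3": True},
    "playback": {"playback-1": True, "playback-2": True, "playback-3": True},
}

rng = random.Random(42)  # seeded so a run can be replayed
for _ in range(2):
    print("terminated:", terminate_random_instance(fleet, rng))

print("services down:", services_without_capacity(fleet))
```

With three instances per service, two terminations can never take a whole service offline. A service that ran only a single instance could lose all capacity to one termination, which is exactly the weakness this kind of experiment is meant to surface before a real outage does.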
Recommendation Engine (Collaborative Filtering)
Netflix's recommendation system is its core differentiator: 80% of watched content comes from recommendations rather than search. The system combines:
- Collaborative filtering: "users similar to you also watched X" (matrix factorization on watch history)
- Content-based filtering: "you liked action thrillers, here are more"
- Contextual signals: time of day, device type, day of week
- A/B testing: different recommendation algorithms run simultaneously for different user cohorts
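The "users similar to you also watched X" idea can be sketched in a few lines. Note the swap: this example uses user-based neighbourhood filtering with cosine similarity over watch sets, which is simpler than the matrix factorization mentioned above but shows the same principle. The user names, title IDs, and the `k` and `top_n` parameters are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sets of watched titles."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, history, k=2, top_n=3):
    """Recommend unseen titles based on the k most similar users.

    history maps user -> set of watched titles. Each candidate title is
    scored by the summed similarity of the neighbours who watched it.
    """
    watched = history[user]
    neighbours = sorted(
        ((cosine(watched, other_watched), other)
         for other, other_watched in history.items() if other != user),
        reverse=True,
    )[:k]

    scores = defaultdict(float)
    for similarity, other in neighbours:
        for title in history[other] - watched:
            scores[title] += similarity

    # Highest score first; title name breaks ties so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


history = {
    "ana": {"thriller-1", "thriller-2", "scifi-1"},
    "ben": {"thriller-1", "thriller-2", "scifi-2"},
    "cho": {"thriller-1", "scifi-1", "scifi-2", "scifi-3"},
    "dev": {"romance-1", "romance-2"},
}

print(recommend("ana", history))
```

Here "ana" overlaps most with "ben" and "cho", so the titles they watched that she has not seen are ranked, with the title both neighbours watched coming first.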
AV1 Codec + Per-Title Encoding
Netflix doesn't use the same bitrate ladder for every video. Each title's visual complexity is analyzed, and the title is encoded at the minimum bitrate that maintains the target quality for that specific content. An animated cartoon needs far less bandwidth than a dark, complex nature documentary at the same visual quality. This "per-title encoding" reduces storage and bandwidth by 20–40%. On supported devices, Netflix uses AV1, which needs at least 30% fewer bits than H.264 at comparable quality.
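The core of per-title encoding is a search for the cheapest encode that still meets a quality target. The sketch below assumes trial encodes have already been scored on a 0–100 perceptual quality scale; the scores, bitrates, and target are invented for illustration and are not measured data.

```python
def min_bitrate_for_quality(measurements, target):
    """Pick the lowest bitrate whose quality score meets the target.

    measurements maps bitrate in kbps -> quality score of a trial encode.
    If no trial encode reaches the target, fall back to the highest bitrate.
    """
    for bitrate in sorted(measurements):
        if measurements[bitrate] >= target:
            return bitrate
    return max(measurements)


# Hypothetical trial-encode scores for two titles at the same resolution.
cartoon = {1000: 88, 2000: 94, 3000: 96, 4500: 97, 6000: 98}
documentary = {1000: 70, 2000: 82, 3000: 89, 4500: 93, 6000: 96}

TARGET_QUALITY = 93

print("cartoon:    ", min_bitrate_for_quality(cartoon, TARGET_QUALITY), "kbps")
print("documentary:", min_bitrate_for_quality(documentary, TARGET_QUALITY), "kbps")
```

With these made-up curves, the cartoon reaches the target at 2000 kbps while the documentary needs 4500 kbps. A single fixed bitrate for both would either waste bandwidth on the cartoon or under-serve the documentary.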
A/B Testing at Scale
Netflix runs hundreds of simultaneous A/B experiments. Different user cohorts may see a different thumbnail image for a show, row ordering on the home screen, recommendation algorithm, UI layout, or even video encoding quality level. The experimentation platform randomly assigns users to treatment groups, measures engagement metrics (click-through rate, completion rate, retention), and uses statistical tests to determine the winning variants. A feature ships only if it wins its A/B test with statistical significance.
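Two pieces of such a platform are easy to sketch: stable assignment of users to variants, and a significance check on the outcome. The example below hashes the user ID together with the experiment name for assignment and uses a two-proportion z-test for the comparison. Both are common choices assumed here for illustration, not a description of Netflix's internal platform, and the conversion counts are made up.

```python
import hashlib
from math import erf, sqrt


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically map a user to a variant.

    Hashing the experiment name together with the user ID keeps the
    assignment stable across sessions and independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.

    Assumes the pooled rate is strictly between 0 and 1.
    """
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return 2 * (1 - normal_cdf)


print(assign_variant("user-123", "thumbnail-test"))

# Hypothetical results: 10.0% vs 11.0% click-through on 10,000 users each.
p = two_proportion_p_value(1000, 10_000, 1100, 10_000)
print(f"p-value: {p:.4f}", "-> significant" if p < 0.05 else "-> not significant")
```

Hash-based assignment needs no lookup table: any service can recompute a user's variant from the user ID and the experiment name alone.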
