Building a social feed sounds deceptively simple. User A follows User B. User B posts. User A sees it. Done.
Except it’s not. The moment you put real traffic numbers behind that sentence — 100 million users, 500 million tweets per day, timelines that must load in under 200ms — the naive implementation collapses immediately. What you’re left with is one of the most interesting distributed systems problems in backend engineering: the fan-out problem.
This article walks through both canonical solutions, explains where each breaks, and shows you the hybrid architecture that Twitter/X actually settled on after years of painful iteration.
The Core Problem
Every timeline is essentially a ranked merge of recent activity from N sources. For an average Twitter user following 500 people, that’s a continuous stream of content from 500 independent writers. When that user opens the app, they need to see the latest 50-ish posts from those 500 people, sorted by time (or by whatever algorithmic score applies).
The question is: when do you do the work?
- At write time: when someone posts, you push the post into each follower’s timeline immediately.
- At read time: when someone opens their feed, you pull and merge posts from everyone they follow on demand.
That’s it. That’s the whole debate. Everything else is an implementation detail.
Fan-Out on Read (Pull Model)
How It Works
The simplest implementation you can think of. Each user’s post goes into a single table. When User A requests their timeline, you run something like:
-- Get the timeline for user_id = 42
SELECT p.*
FROM posts p
JOIN follows f ON f.following_id = p.author_id
WHERE f.follower_id = 42
AND p.created_at > NOW() - INTERVAL '7 days'
ORDER BY p.created_at DESC
LIMIT 50;
No precomputation. No background jobs. Just a query at read time.
Why It Seems Fine at First
Writing a post is dead cheap — one INSERT. Storage is minimal since each post lives exactly once. You get fresh data instantly; there’s no cache invalidation headache. If you edit or delete a post, the change reflects immediately on the next read.
For a small app — say, an internal company feed with a few hundred users — this works perfectly. Ship it.
Where It Dies
The query above has a JOIN across potentially millions of rows. For a user following 2,000 accounts, you’re scanning at minimum the last 7 days of posts from those 2,000 people. Under load, the database gets hammered by every single timeline refresh, every scroll, every notification poll.
Indexes help, but they don’t fix the fundamental scatter-gather nature of the query. You’re still reading from N sources and merging them at query time. As follower counts go up, read latency climbs with them.
Throw 10,000 concurrent users at this and your Postgres instance is on fire.
The read path is expensive. The write path is cheap. That’s the fan-out-on-read trade-off.
Fan-Out on Write (Push Model)
How It Works
When a user posts, a background job fans out that post to every follower’s timeline cache — typically Redis sorted sets, where the score is the timestamp.
def publish_post(author_id: int, post_id: int, created_at: float):
followers = db.get_followers(author_id) # returns list of user IDs
pipeline = redis.pipeline()
for follower_id in followers:
timeline_key = f"timeline:{follower_id}"
pipeline.zadd(timeline_key, {post_id: created_at})
# Trim to last 800 posts to cap memory usage
pipeline.zremrangebyrank(timeline_key, 0, -801)
pipeline.execute()
Reading the timeline is then a single Redis call:
def get_timeline(user_id: int, limit: int = 50) -> list[int]:
timeline_key = f"timeline:{user_id}"
# Returns post IDs sorted by score (timestamp) descending
return redis.zrevrange(timeline_key, 0, limit - 1)
Sub-millisecond reads. The database doesn’t care. Your API latency is now bounded by Redis, not by a scatter-gather SQL query.
Why It’s Powerful
Timeline reads are O(1). There’s no aggregation happening at read time. You precompute everything the moment a post is created, so reading is just fetching a sorted list. This is why high-traffic systems aggressively favor write-time precomputation — reads happen far more frequently than writes in social networks (often 100:1 or more).
The Celebrity Problem
Here’s where fan-out on write breaks spectacularly.
Imagine Beyoncé posts a tweet. She has 30 million followers. Your fan-out job now needs to write post_id → 30M Redis keys. Even at 10,000 writes per second, that’s 50 minutes of lag before every follower sees the post. During that window, your Kafka consumers are backed up, your Redis cluster is absorbing a write avalanche, and early followers see stale timelines.
This is called the celebrity problem or sometimes the hotkey problem. High-follower users are outliers that completely violate the assumptions baked into naive fan-out-on-write.
The write path is expensive for celebrities. The read path is cheap everywhere. That’s the fan-out-on-write trade-off.
The Hybrid Architecture (What Actually Works)
Twitter published their approach publicly, and the industry has broadly converged on the same pattern. The insight is: use fan-out-on-write for normal users and fan-out-on-read for celebrities.
Define "celebrity" pragmatically — any account with more than X followers (Twitter historically used ~10,000 as a rough threshold). These accounts are excluded from the write fan-out. Their posts are not pre-distributed.
At read time, when building a user’s timeline:
- Fetch their precomputed timeline from Redis (posts from normal users they follow).
- For each celebrity they follow, fetch the last 50 posts directly from a separate celebrity posts cache or database read.
- Merge and rank the results.
- Return the top 50 entries.
def get_timeline(user_id: int, limit: int = 50) -> list[Post]:
# Step 1: Pull the precomputed timeline (fan-out-on-write data)
precomputed_post_ids = redis.zrevrange(f"timeline:{user_id}", 0, limit * 3)
# Step 2: Get celebrity accounts this user follows
celebrity_ids = db.get_celebrity_followings(user_id)
# Step 3: Fetch recent posts from each celebrity (fan-out-on-read)
celebrity_posts = []
for celeb_id in celebrity_ids:
posts = redis.zrevrange(f"user_posts:{celeb_id}", 0, limit - 1)
celebrity_posts.extend(posts)
# Step 4: Merge and sort by timestamp
all_post_ids = list(set(precomputed_post_ids + celebrity_posts))
posts = db.batch_get_posts(all_post_ids)
posts.sort(key=lambda p: p.created_at, reverse=True)
return posts[:limit]
The merge is cheap because celebrity followings are small sets per user, and celebrity posts are already cached separately.
The Full Production Architecture
Let me give you a realistic component diagram of how this plays out in production.
Write Path
User POSTs tweet
│
▼
API Service
│
├── INSERT into posts table (Postgres)
│
└── Publish event to Kafka topic: "post.created"
│
┌──────────┴──────────┐
▼ ▼
Fan-Out Worker Celebrity Cache Worker
(for normal users) (for celebrity accounts)
│ │
▼ ▼
ZADD to follower ZADD to user_posts:{author_id}
timeline keys in Redis in Redis
The Kafka event payload
{
"event": "post.created",
"post_id": "1234567890",
"author_id": "9876",
"author_follower_count": 342,
"created_at": 1716643200,
"is_celebrity": false
}
The fan-out worker checks is_celebrity (or rechecks follower count against the threshold). If false, it fans out to all followers. If true, it routes to the celebrity cache worker instead.
Redis Data Structures
# Regular user's precomputed timeline
# Sorted set: score = unix timestamp, member = post_id
ZADD timeline:user:42 1716643200 post:1234567890
# Celebrity's personal post index (used during hybrid merge at read time)
ZADD user_posts:celeb:99 1716643210 post:9999999
# Trim to cap memory — keep last 800 entries
ZREMRANGEBYRANK timeline:user:42 0 -801
Docker Compose for a Local Dev Stack
version: "3.9"
services:
postgres:
image: postgres:16-alpine
environment:
POSTGRES_DB: timeline_dev
POSTGRES_USER: app
POSTGRES_PASSWORD: secret
volumes:
- pg_data:/var/lib/postgresql/data
- ./schema.sql:/docker-entrypoint-initdb.d/schema.sql
ports:
- "5432:5432"
redis:
image: redis:7-alpine
command: redis-server --maxmemory 2gb --maxmemory-policy allkeys-lru
ports:
- "6379:6379"
volumes:
- redis_data:/data
zookeeper:
image: confluentinc/cp-zookeeper:7.6.0
environment:
ZOOKEEPER_CLIENT_PORT: 2181
kafka:
image: confluentinc/cp-kafka:7.6.0
depends_on:
- zookeeper
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
ports:
- "9092:9092"
fanout-worker:
build: ./fanout-worker
depends_on:
- kafka
- redis
- postgres
environment:
KAFKA_BOOTSTRAP: kafka:9092
REDIS_URL: redis://redis:6379
DB_URL: postgresql://app:secret@postgres/timeline_dev
CELEBRITY_THRESHOLD: 10000
volumes:
pg_data:
redis_data:
Schema (Postgres)
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
author_id BIGINT NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_posts_author_created ON posts (author_id, created_at DESC);
CREATE TABLE follows (
follower_id BIGINT NOT NULL,
following_id BIGINT NOT NULL,
PRIMARY KEY (follower_id, following_id)
);
CREATE INDEX idx_follows_following ON follows (following_id);
-- Track which accounts are in celebrity tier
CREATE TABLE celebrity_accounts (
user_id BIGINT PRIMARY KEY,
follower_count BIGINT NOT NULL,
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Gotchas
The follower count threshold is not static. An account crosses from normal to celebrity (or back) as their follower count changes. You need a background job that recategorizes accounts and backfills or invalidates their fan-out data when they cross the threshold. Get this wrong and you either over-fan-out (write amplification) or miss timelines (read gaps).
Redis memory is finite. Each user’s timeline sorted set has to be capped. Keep the last 800–1000 post IDs per user. If a user hasn’t logged in for 90 days, delete their timeline key entirely and regenerate it on next login. Cold start generation (rehydrating a timeline from scratch on first access) is a real latency spike you need to handle — either with a background warmup job or a loading state in the UI.
Write amplification can still kill you. Even at 10,000-follower threshold, a user with 9,999 followers posting 20 times a day generates ~200,000 Redis writes daily just for one account. Multiply by many such users and monitor your Redis write throughput carefully.
Deletes and edits are your enemy. Fan-out-on-write distributes post IDs, not post content. If a user deletes a post, you need to either: a) chase down and remove the post ID from every follower’s timeline key (expensive), or b) store a tombstone and filter deleted posts at read time. Option b is far more practical. Never fan out the actual content, always fan out the reference.
Thundering herd on celebrity posts. When a celebrity with 30M followers posts, your timeline reads for those followers will spike immediately after the post appears in their hybrid merge. If you’re not careful, this turns into 30M near-simultaneous cache fetches of the celebrity’s post data. Pre-warm the celebrity’s post cache as soon as they publish, and use short TTLs (30–60s) with stale-while-revalidate semantics.
Kafka consumer lag is your canary. The fan-out worker’s consumer lag tells you how far behind your timelines are. Alert on it. If lag spikes, users with large follower counts see delayed timelines. This is the most operationally visible symptom of fan-out at capacity.
When to Use What
Use pure fan-out on read if: your user base is small (< 50K), your follow graphs are shallow (average < 200 follows), and you can’t justify the operational complexity of Redis + Kafka. A well-indexed Postgres query and connection pooling will carry you further than you expect.
Use pure fan-out on write if: you have no celebrity-tier accounts (a B2B tool, an internal platform, a community with roughly equal follower distributions), your average follower count is under 5,000, and you can afford the memory.
Use the hybrid if: you’re building something with public accounts, influencers, or any power-law follower distribution. The hybrid is the correct answer for any system that looks like a public social network.
Further Reading
The canonical papers here are worth your time. Twitter’s engineering blog published their timeline architecture in detail around 2013 and has updated it since. The original Twitter timelines blog post covers the write fan-out decision. Martin Kleppmann’s Designing Data-Intensive Applications (DDIA) covers the fan-out problem in Chapter 11 with sharp clarity — if you haven’t read it, fix that.
For hands-on experience, build the local stack above, load-test the fan-out worker with a Locust or k6 script, and watch your Redis memory and Kafka lag under simulated celebrity traffic. The numbers will make the trade-offs viscerally obvious in a way no amount of theory will.
The timeline problem is a great lens for understanding the broader read-write trade-off that appears everywhere in distributed systems: search indexing, recommendation engines, notification delivery, and more. Once you’ve solved it here, you’ll recognize the same shape in half the scaling problems you encounter going forward.