When a Million People Click “Buy Now”: The Thundering Herd Problem Explained
What really happens when a million users hit ‘Buy Now’ at the same time

If you’ve ever tried to buy an iPhone during a flash sale and watched the app freeze, crash, or throw random errors, you’ve already experienced the Thundering Herd Problem.
From a user’s point of view, it feels like bad luck.
From an engineer’s point of view, it’s a predictable failure mode in distributed systems.
Let’s break it down using a real-world iPhone flash sale and see how large-scale systems actually solve it.
The Flash Sale Moment We All Know:
It’s 11:59 PM.
App is open
You’re logged in
Card details are saved
Finger hovering over Buy Now
At 12:00 AM sharp, millions of people do the exact same thing.
That single moment is where everything starts to go wrong.
What Is the Thundering Herd Problem?
In simple terms:
The thundering herd problem happens when a large number of users wait for one event, and when it occurs, all of them act at the same time, overwhelming the system.
Think of:
One door
A million people
Everyone pushing at once
Even if only a few are allowed in, the chaos still happens.
The iPhone Flash Sale Scenario:
Let’s put numbers on it.
iPhones in stock: 10
Users waiting: 1,000,000
Sale start time: 12:00 AM
At midnight:
1 million requests hit the backend within milliseconds
Every request wants to:
Check stock
Reserve inventory
Proceed to payment
This is the perfect setup for a thundering herd.
What Goes Wrong Behind the Scenes
Inventory Turns Into a Hotspot
A naïve inventory check looks like this:
if stock > 0:
    stock = stock - 1
Under heavy concurrency:
Many requests read stock = 10
All attempt to decrement
Database rows get locked, or a single Redis key becomes a hot key
Requests pile up
Only 10 users succeed, but every request stresses the system.
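To see why the naïve check oversells, here is a minimal sketch that forces the bad interleaving. The Inventory class, run_sale helper, and the 50-buyer count are all hypothetical, and the in-memory object is only a stand-in for a real database row. A barrier makes every request finish its read before any request writes, which is the worst case the list above describes.

```python
import threading


class Inventory:
    """In-memory stand-in for a stock row (assumption: not a real database)."""

    def __init__(self, stock):
        self.stock = stock
        self.lock = threading.Lock()

    def buy_naive(self, pause):
        current = self.stock          # 1. read
        if current > 0:               # 2. check
            pause()                   # other requests run between check and write
            self.stock = current - 1  # 3. write back a stale value
            return True
        return False

    def buy_locked(self):
        with self.lock:               # check and decrement as one unit
            if self.stock > 0:
                self.stock -= 1
                return True
            return False


def run_sale(use_lock, stock=10, buyers=50):
    inventory = Inventory(stock)
    barrier = threading.Barrier(buyers)  # releases only once all buyers have read
    successes = []

    def request():
        if use_lock:
            successes.append(inventory.buy_locked())
        else:
            successes.append(inventory.buy_naive(pause=barrier.wait))

    threads = [threading.Thread(target=request) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(successes), inventory.stock


print(run_sale(use_lock=False))  # (buyers who "succeeded", stock left)
print(run_sale(use_lock=True))
```

In the naïve run, all 50 buyers see stock = 10 and all 50 "succeed", even though only 10 units exist. The locked run is correct, but every request now queues on one lock, which is exactly the hotspot.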
Retry Storm Makes It Worse
Users who fail don’t stop.
They:
Refresh the app
Retry instantly
Spam the same APIs
Traffic doesn’t drop; it multiplies.
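A rough back-of-the-envelope sketch shows how fast this adds up. The function name and the figure of 3 retries per failed user are assumptions chosen for illustration, not measurements.

```python
def total_requests(users, stock, retries_per_failed_user):
    """Estimate request volume when every failed user retries a fixed number of times."""
    failed_users = max(users - stock, 0)
    return users + failed_users * retries_per_failed_user


# Assumed scenario: 1,000,000 users, 10 iPhones, 3 retries per failed user.
print(total_requests(1_000_000, 10, 3))  # 3999970
```

Under those assumptions, one million users produce almost four million requests, and nearly all of them are for stock that is already gone.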
Cache Stampede
Often, price and stock are cached.
If that cache expires exactly at midnight:
All requests miss cache
All hit the database
The database becomes the bottleneck
Now even simple reads are slow.
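The stampede can be sketched with a plain dictionary as the cache and a counter as the "database". Both are hypothetical stand-ins, and the "iphone" key and 200-request count are made up for the demo. The barrier models requests arriving so close together that each one checks the cache before any of them has refilled it.

```python
import threading


def db_hits_at_midnight(requests, cache):
    """Return how many requests fall through to the (fake) database."""
    db_hits = 0
    db_lock = threading.Lock()
    all_checked = threading.Barrier(requests)

    def load_from_db():
        nonlocal db_hits
        with db_lock:
            db_hits += 1          # count each query that reaches the database
        return {"stock": 10}

    def handle_request():
        value = cache.get("iphone")
        all_checked.wait()        # every request looks at the cache before any refill
        if value is None:         # cache miss: go to the database
            cache["iphone"] = load_from_db()

    threads = [threading.Thread(target=handle_request) for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return db_hits


print(db_hits_at_midnight(200, cache={}))                         # expired cache
print(db_hits_at_midnight(200, cache={"iphone": {"stock": 10}}))  # warm cache
```

With an expired cache, every one of the 200 requests queries the database for the same value. With a warm cache, none do.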
Payment Failures
Some users reach payment, but:
Payment gateways are rate-limited
Downstream services are already overloaded
Result:
Failed payments
Inconsistent orders
Very angry users
Why This Is a Thundering Herd Problem
This situation checks every box:
Single trigger → sale start
Massive waiters → users waiting at midnight
Simultaneous wake-up → everyone clicks Buy
Few winners → limited stock
System overload → retries and contention
The problem isn’t inventory.
The problem is uncontrolled access.
The Real Solution: Token-Based Purchase
Big companies don’t let everyone fight for inventory.
They add a gate.
The Core Idea
Don’t let users compete for stock.
Let them compete for permission to buy.
That permission is a purchase token.
Step 1: Sale Starts (Inventory Is Still Protected)
At 12:00 AM:
Inventory is not opened directly
A lightweight Token Service becomes active
Step 2: Users Request a Token
When users click Buy Now, they actually call:
POST /get-purchase-token
This endpoint:
Is fast and cheap
Does not touch inventory
Does not trigger payment
Step 3: Only 10 Tokens Exist
In Redis:
available_tokens = 10
Each request performs one atomic operation:
remaining = DECR(available_tokens)
If remaining >= 0 → token granted
If remaining < 0 → rejected immediately
Because Redis operations are atomic:
Exactly 10 requests get tokens
No race conditions
No overselling
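The gate itself can be sketched in a few lines. So that it runs without a Redis server, this version uses a hypothetical InMemoryCounter whose decr() imitates the atomic DECR command. The try_claim_token and simulate names are also made up for illustration. With the redis-py client, the same try_claim_token function should work unchanged, since that client also exposes a decr(name) method.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InMemoryCounter:
    """Stand-in for Redis (assumption): decr() is atomic, like the DECR command."""

    def __init__(self, **initial):
        self._values = dict(initial)
        self._lock = threading.Lock()

    def decr(self, key):
        with self._lock:
            self._values[key] = self._values.get(key, 0) - 1
            return self._values[key]


def try_claim_token(counter, key="available_tokens"):
    """One atomic decrement decides the outcome; no separate read-then-write."""
    remaining = counter.decr(key)
    return remaining >= 0   # negative means the tokens were already gone


def simulate(requests=1000, tokens=10):
    counter = InMemoryCounter(available_tokens=tokens)
    with ThreadPoolExecutor(max_workers=32) as pool:
        granted = list(pool.map(lambda _: try_claim_token(counter), range(requests)))
    return sum(granted)


print(simulate())  # number of requests that received a token
```

The counter keeps going negative as rejected requests arrive, which is harmless here: any result below zero simply means "sold out".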
Step 4: Secure Token Generation
Each successful user gets a token that:
Is cryptographically signed
Is bound to user and product
Has a short expiry (2–5 minutes)
Can only be used once
Everyone else is blocked early, before doing any damage.
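One possible way to build such a token is an HMAC-signed payload, sketched below with Python's standard library. This is an assumption about the mechanism, not a description of what any particular retailer does. SECRET_KEY, the field names, and the 180-second expiry are illustrative choices.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

SECRET_KEY = b"demo-only-secret"  # assumption: a real service loads this from a secret store


def issue_token(user_id, product_id, ttl_seconds=180, now=None):
    """Create a signed token bound to one user and one product."""
    now = time.time() if now is None else now
    payload = {
        "user": user_id,
        "product": product_id,
        "exp": int(now) + ttl_seconds,   # short expiry
        "jti": secrets.token_hex(8),     # unique id, so the server can enforce single use
    }
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + signature


def verify_signature(token):
    """Return the payload if the signature matches, otherwise None."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(body))


token = issue_token("alice", "iphone")
print(verify_signature(token)["user"])  # alice
```

Because the signature covers the user, product, and expiry, a client cannot edit any of them without the check failing.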
Step 5: Purchase Requires a Token
The real purchase API:
POST /purchase
Only works if:
Token is valid
Token belongs to the user
Token is unused and unexpired
Only 10 requests ever reach inventory and payment.
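The checks above might look like the following sketch. It assumes the token's signature has already been verified and decoded into a dictionary. The field names (user, product, exp, jti) and the in-memory used_token_ids set are hypothetical; a real deployment would need a shared store with an atomic "set if absent" operation instead of a process-local set.

```python
import threading
import time

used_token_ids = set()          # assumption: stand-in for a shared store
_used_lock = threading.Lock()   # makes check-and-mark atomic within this process


def redeem(payload, user_id, product_id, now=None):
    """Decide whether a decoded token payload may proceed to purchase."""
    now = time.time() if now is None else now
    if payload is None:
        return False, "invalid token"
    if payload["user"] != user_id or payload["product"] != product_id:
        return False, "token does not belong to this user or product"
    if now >= payload["exp"]:
        return False, "token expired"
    with _used_lock:
        if payload["jti"] in used_token_ids:
            return False, "token already used"
        used_token_ids.add(payload["jti"])   # mark as spent before touching inventory
    return True, "ok"


payload = {"user": "alice", "product": "iphone", "exp": time.time() + 180, "jti": "demo-1"}
print(redeem(payload, "alice", "iphone"))  # first use
print(redeem(payload, "alice", "iphone"))  # replay of the same token
```

Only a request that passes every check goes on to inventory and payment; the replay is rejected before it touches either.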
Why This Works So Well
Without tokens:
1 million users hit inventory and payment
Systems overload instantly
With tokens:
1 million users hit a cheap gate
Only 10 touch critical services
You move the competition away from fragile systems and into a controlled layer.
The Key Takeaway:
Scaling systems isn’t about adding more servers.
It’s about controlling who is allowed to proceed.
Flash sales fail when everyone rushes together.
They succeed when access is carefully orchestrated.
Final Thought:
The next time you miss an iPhone flash sale, remember:
It’s not bad luck.
It’s a distributed systems problem, one engineers design for every day.
And when done right, only 10 requests should ever get past “Buy Now.”