When a Million People Click “Buy Now”: The Thundering Herd Problem Explained

What really happens when a million users hit ‘Buy Now’ at the same time

Published
4 min read

Full Stack Developer.

If you’ve ever tried to buy an iPhone during a flash sale and watched the app freeze, crash, or throw random errors, you’ve already experienced the Thundering Herd Problem.

From a user’s point of view, it feels like bad luck.
From an engineer’s point of view, it’s a predictable failure mode in distributed systems.

Let’s break it down using a real-world iPhone flash sale and see how large-scale systems actually solve it.

The Flash Sale Moment We All Know:

It’s 11:59 PM.

  • App is open

  • You’re logged in

  • Card details are saved

  • Finger hovering over Buy Now

At 12:00 AM sharp, millions of people do the exact same thing.

That single moment is where everything starts to go wrong.


What Is the Thundering Herd Problem?

In simple terms:

The thundering herd problem happens when a large number of users wait for one event, and when it occurs, all of them act at the same time, overwhelming the system.

Think of:

  • One door

  • A million people

  • Everyone pushing at once

Even if only a few are allowed in, the chaos still happens.

The iPhone Flash Sale Scenario:

Let’s put numbers on it.

  • iPhones in stock: 10

  • Users waiting: 1,000,000

  • Sale start time: 12:00 AM

At midnight:

  • 1 million requests hit the backend within milliseconds

  • Every request wants to:

    • Check stock

    • Reserve inventory

    • Proceed to payment

This is the perfect setup for a thundering herd.

What Goes Wrong Behind the Scenes

  1. Inventory Turns Into a Hotspot

    A naïve inventory check looks like this:

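     # Check-then-act: the read and the write are two separate steps
     if stock > 0:
         stock = stock - 1  # another request may have changed stock in between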
    

    Under heavy concurrency:

    • Many requests read stock = 10

    • All attempt to decrement

    • Database rows get locked, or a single Redis key becomes a hot spot

    • Requests pile up

Only 10 users succeed, but every request stresses the system.
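
To make the danger of check-then-act concrete, here is a minimal Python sketch that replays one bad interleaving by hand instead of relying on real threads. The `Inventory` class and the `replay_race` helper are hypothetical names, and the in-memory object only stands in for a database row or cache key.

```python
class Inventory:
    """In-memory stand-in for a stock row (hypothetical, for illustration)."""

    def __init__(self, stock):
        self.stock = stock

    def read_stock(self):
        return self.stock

    def write_stock(self, value):
        self.stock = value


def replay_race(inventory, num_requests):
    """Replay the worst-case interleaving: every request reads before any writes."""
    # Step 1: all requests evaluate `stock > 0` against the same value.
    snapshots = [inventory.read_stock() for _ in range(num_requests)]
    orders = 0
    # Step 2: each request that saw stock > 0 writes back its own decrement.
    for seen in snapshots:
        if seen > 0:
            inventory.write_stock(seen - 1)
            orders += 1
    return orders


inventory = Inventory(stock=1)
orders = replay_race(inventory, num_requests=3)
print(orders, inventory.stock)  # 3 0
```

Three orders are accepted for a single unit, and the stock counter still reads 0. Without an atomic step or a lock, the naive check can oversell; with locks, the requests queue up on the same row instead.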

  2. Retry Storm Makes It Worse

    Users who fail don’t stop.

    They:

    • Refresh the app

    • Retry instantly

    • Spam the same APIs

Traffic doesn’t drop; it multiplies.
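
A quick back-of-the-envelope calculation shows how fast retries add up. The sketch below uses the scenario's numbers; the figure of 3 retries per failed user is purely an assumption for illustration, and `total_requests` is a hypothetical helper.

```python
def total_requests(users, winners, retries_per_failed_user):
    """Initial attempts plus retries from everyone who did not win."""
    failed_users = users - winners
    return users + failed_users * retries_per_failed_user


# 1,000,000 users, 10 units in stock, and an assumed 3 retries per failed user.
print(total_requests(users=1_000_000, winners=10, retries_per_failed_user=3))
# 3999970
```

Under that assumption, one million buyers generate roughly four million requests, almost all of which were never going to succeed.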

  3. Cache Stampede

    Often, price and stock are cached.

    If that cache expires exactly at midnight:

    • All requests miss cache

    • All hit the database

    • The database becomes the bottleneck

Now even simple reads are slow.
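
The stampede can be illustrated with a small deterministic simulation. This is only a sketch: the cache is a plain dictionary, the key `iphone:price` and the value 999 are made up, and "simultaneous" is modelled by letting every request check the cache before any of them refills it.

```python
def count_db_reads(num_requests, cache_is_warm):
    """Count database reads when all requests check the cache at the same instant."""
    cache = {"iphone:price": 999} if cache_is_warm else {}
    # Every request looks up the cache before any of them has refilled it.
    missed = ["iphone:price" not in cache for _ in range(num_requests)]
    db_reads = 0
    for miss in missed:
        if miss:
            db_reads += 1                # each miss falls through to the database
            cache["iphone:price"] = 999  # the refill arrives too late to help the others
    return db_reads


print(count_db_reads(1000, cache_is_warm=True))   # 0
print(count_db_reads(1000, cache_is_warm=False))  # 1000
```

With a warm cache the database sees nothing. With an entry that expired at the wrong moment, it sees every single request.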

  4. Payment Failures

    Some users reach payment, but:

    • Payment gateways are rate-limited

    • Downstream services are already overloaded

Result:

  • Failed payments

  • Inconsistent orders

  • Very angry users

Why This Is a Thundering Herd Problem

This situation checks every box:

  • Single trigger → sale start

  • Massive waiters → users waiting at midnight

  • Simultaneous wake-up → everyone clicks Buy

  • Few winners → limited stock

  • System overload → retries and contention

The problem isn’t inventory.
The problem is uncontrolled access.


The Real Solution: Token-Based Purchase

Big companies don’t let everyone fight for inventory.

They add a gate.

The Core Idea

Don’t let users compete for stock.
Let them compete for permission to buy.

That permission is a purchase token.


Step 1: Sale Starts (Inventory Is Still Protected)

At 12:00 AM:

  • Inventory is not opened directly

  • A lightweight Token Service becomes active

Step 2: Users Request a Token

When users click Buy Now, they actually call:

POST /get-purchase-token

This endpoint:

  • Is fast and cheap

  • Does not touch inventory

  • Does not trigger payment

Step 3: Only 10 Tokens Exist

In Redis:

available_tokens = 10

Each request performs one atomic operation:

remaining = DECR(available_tokens)

  • If remaining >= 0 → token granted

  • If remaining < 0 → rejected immediately

Because Redis executes each DECR command atomically:

  • Exactly 10 users get tokens

  • No race conditions

  • No overselling
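
The gate can be sketched in Python without a running Redis server. `FakeRedis` below is a hypothetical in-memory stand-in whose `decr` imitates the atomic decrement with a lock, and `try_get_token` is an assumed name for the core of the token endpoint.

```python
import threading


class FakeRedis:
    """In-memory stand-in for Redis (hypothetical); decr is made atomic with a lock."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = int(value)

    def decr(self, key):
        # Decrement and return the new value as one indivisible step.
        with self._lock:
            self._data[key] = self._data.get(key, 0) - 1
            return self._data[key]


def try_get_token(store):
    """Return True if this request won one of the tokens."""
    remaining = store.decr("available_tokens")
    return remaining >= 0


store = FakeRedis()
store.set("available_tokens", 10)

results = []
results_lock = threading.Lock()


def buyer():
    granted = try_get_token(store)
    with results_lock:
        results.append(granted)


# 1,000 concurrent buyers stand in for the million in the scenario.
threads = [threading.Thread(target=buyer) for _ in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sum(results))  # 10
```

However the threads interleave, exactly 10 calls see a non-negative result. Against a real server, the redis-py client offers a `decr` method of the same name. The counter simply keeps going negative for the losers, which is harmless here.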

Step 4: Secure Token Generation

Each successful user gets a token that:

  • Is cryptographically signed

  • Is bound to user and product

  • Has a short expiry (2–5 minutes)

  • Can only be used once

Everyone else is blocked early, before doing any damage.
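
One possible way to build such a token is an HMAC signature over a small set of claims. The sketch below is an illustration under stated assumptions, not a description of any particular company's scheme: `SECRET_KEY`, the claim names, the 180-second lifetime, and both function names are made up for the example.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"demo-secret-change-me"  # assumption: a secret known only to the server
TOKEN_TTL_SECONDS = 180                # assumption: inside the 2 to 5 minute window


def issue_token(user_id, product_id, now=None):
    """Build a signed token bound to one user and one product."""
    now = time.time() if now is None else now
    claims = {"user": user_id, "product": product_id, "exp": now + TOKEN_TTL_SECONDS}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token, user_id, product_id, now=None):
    """Check signature, binding, and expiry. Returns True only if all pass."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return False
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison, done before the payload is trusted or decoded.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return (
        claims["user"] == user_id
        and claims["product"] == product_id
        and now < claims["exp"]
    )


token = issue_token("user-42", "iphone", now=1000.0)
print(verify_token(token, "user-42", "iphone", now=1060.0))  # True
print(verify_token(token, "user-99", "iphone", now=1060.0))  # False, wrong user
print(verify_token(token, "user-42", "iphone", now=2000.0))  # False, expired
```

A signature covers authenticity, binding, and expiry. It cannot enforce single use by itself, because that requires the server to remember which tokens have already been redeemed.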

Step 5: Purchase Requires a Token

The real purchase API:

POST /purchase

Only works if:

  • Token is valid

  • Token belongs to the user

  • Token is unused and unexpired

Only 10 requests ever reach inventory and payment.
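
The three conditions, plus the single-use rule, can be sketched as one atomic check. `TokenRegistry` is a hypothetical in-memory record of issued tokens, and the token IDs, user IDs, and status strings are invented for the example. A production system would more likely keep this state in a shared store.

```python
import threading
import time


class TokenRegistry:
    """Server-side record of issued tokens (hypothetical in-memory stand-in)."""

    def __init__(self):
        self._tokens = {}  # token_id -> {"user", "expires_at", "used"}
        self._lock = threading.Lock()

    def register(self, token_id, user_id, expires_at):
        with self._lock:
            self._tokens[token_id] = {
                "user": user_id,
                "expires_at": expires_at,
                "used": False,
            }

    def redeem(self, token_id, user_id, now):
        """Atomically check every condition and mark the token as used."""
        with self._lock:
            record = self._tokens.get(token_id)
            if record is None:
                return "invalid token"
            if record["user"] != user_id:
                return "token belongs to another user"
            if record["used"]:
                return "token already used"
            if now >= record["expires_at"]:
                return "token expired"
            record["used"] = True
            return "ok"


def purchase(registry, token_id, user_id, now=None):
    """Sketch of POST /purchase: nothing expensive runs unless redeem succeeds."""
    now = time.time() if now is None else now
    status = registry.redeem(token_id, user_id, now)
    if status != "ok":
        return status
    # Inventory reservation and payment would happen here.
    return "order placed"


registry = TokenRegistry()
registry.register("tok-1", "user-42", expires_at=1180.0)

print(purchase(registry, "tok-1", "user-99", now=1060.0))  # token belongs to another user
print(purchase(registry, "tok-1", "user-42", now=1060.0))  # order placed
print(purchase(registry, "tok-1", "user-42", now=1061.0))  # token already used
print(purchase(registry, "tok-2", "user-42", now=1060.0))  # invalid token
```

Checking and marking under the same lock is the important design choice: two simultaneous requests carrying the same token cannot both pass.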


Why This Works So Well

Without tokens:

  • 1 million users hit inventory and payment

  • Systems overload instantly

With tokens:

  • 1 million users hit a cheap gate

  • Only 10 touch critical services

You move the competition away from fragile systems and into a controlled layer.


The Key Takeaway:

Scaling systems isn’t about adding more servers.
It’s about controlling who is allowed to proceed.

Flash sales fail when everyone rushes together.
They succeed when access is carefully orchestrated.

Final Thought:

The next time you miss an iPhone flash sale, remember:

It’s not bad luck.
It’s a distributed systems problem, one engineers design for every day.

And when done right, only 10 people should ever get past “Buy Now.”