The physics of waiting.
Why lines form, why your customers give up, and why the same number of agents can deliver brilliant service one day and a meltdown the next. No maths degree required.
Waiting is a property of the system, not a measure of effort.
Here's the uncomfortable truth that queueing theory hands every operations leader: your customers' wait times are mostly decided before the day even starts by how you've arranged things. Not by how hard people work on the phones.
Two facts drive almost everything that follows, and both cost real money:
Every customer who gives up while waiting is a lost sale, a complaint, or a callback that costs you twice. Abandonment is the most expensive number most centres barely look at.
Take a fixed group of agents. Slice them up one way and customers wait seconds. Slice them another way, with the exact same headcount, and the queue collapses. The structure is the lever. That's the whole game.
Get the structure right and you buy shorter waits, fewer walkaways, and calmer staff without hiring a single extra person. Get it wrong and no amount of effort rescues it. The good news is that the rules are simple and they're below.
Three ingredients, and the numbers they cook up.
Strip away the jargon and a queue is just three things meeting each other: work that arrives, how long each job takes, and how many people can do it. Everything customers feel, the waiting, the giving up, falls out of how those three interact.
Queueing theory is simply the study of that interaction. The metrics below are what it produces. The business name is the one your board cares about. The technical name is the one your WFM and analytics teams use. They're the same thing wearing different clothes.
Busy is good. Too busy is a cliff.
You'd think pushing agents to be busier just makes things gradually a bit slower. It doesn't. Waiting stays low and flat while agents are moderately busy, then near the top it goes vertical. The last sliver of "efficiency" buys you an avalanche of waiting.
Drag the volume below. Watch the occupancy creep up calmly while the wait sits quiet, then explode. This is why planners aim to keep agents around 80–85% busy, never 100%. That headroom isn't waste. It's what keeps the queue from falling off the cliff.
One team of 20 agents
Five-minute average handle time. Slide the incoming call volume up and down.
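If you'd rather see the cliff in numbers than on a slider, here is a minimal sketch in Python. It assumes the textbook Erlang C model (random arrivals, random handle times, nobody hangs up), and the function names `erlang_c` and `mean_wait_seconds` are invented for this example. Treat the output as the shape of the curve, not a staffing quote.

```python
def erlang_c(agents: int, offered_load: float) -> float:
    """Chance an arriving call has to wait (Erlang C; assumes nobody abandons)."""
    if offered_load >= agents:
        return 1.0  # more work than capacity: every caller waits
    blocking = 1.0  # Erlang B, built up with the usual stable recurrence
    for k in range(1, agents + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    occupancy = offered_load / agents
    return blocking / (1.0 - occupancy * (1.0 - blocking))


def mean_wait_seconds(agents: int, calls_per_hour: float, aht_seconds: float) -> float:
    """Average wait in queue across all callers, in seconds."""
    offered_load = calls_per_hour * aht_seconds / 3600.0  # in Erlangs
    if offered_load >= agents:
        return float("inf")  # the queue never settles
    return erlang_c(agents, offered_load) * aht_seconds / (agents - offered_load)


AGENTS, AHT = 20, 300.0  # the team above: 20 agents, five-minute handle time
for calls_per_hour in (120, 168, 192, 204, 216, 228, 236):
    occupancy = calls_per_hour * AHT / 3600.0 / AGENTS
    wait = mean_wait_seconds(AGENTS, calls_per_hour, AHT)
    print(f"{calls_per_hour:>4} calls/hr  {occupancy:4.0%} busy  avg wait {wait:6.1f} s")
```

Run it and compare the step from 80% to 95% busy with the step from 50% to 70%. The first barely registers; the second is the cliff.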
Four players, one shared fate.
Customers
They have limited patience. Cross it and they're gone, often for good. Their wait is the number that defines your reputation.
Agents
The servers. What matters isn't only how many you have, but how many are allowed to take any given contact.
Planners & WFM
They forecast demand and staff against it. Bigger, simpler pools are far easier to forecast accurately than many tiny ones.
The business
Pays for the agents, lives with the walkaways. Queue structure is one of the few levers that moves both cost and experience at once.
Three places you decide how the queue behaves.
You don't manage waiting on the phones. You design it upstream, in three decisions that most centres make almost by accident. Each one quietly sets how big your pools are, and pool size is everything (you'll see exactly why in the next section).
The IVR menu (call steering)
Every menu branch you add chops your demand into a thinner stream that then needs its own little pool of agents. "Press 1 for this, 2 for that, 3 for the other" feels helpful, but each press is a tax on pool size. Over-segmenting the front door is the most common self-inflicted wound in the building.
→ Identify intent fast, route once, and steer to the fewest, broadest sensible destinations.
Queue design
Fewer, larger queues answer faster than many small ones at the exact same staffing. Splitting a queue so your reports are clearer is fine and often smart. Splitting the agent pool behind it is what quietly costs you waits. Keep those two ideas separate in your head.
→ Report granularly if you like, but pool the people generously.
Agent competency (skills)
This is the lever with the most leverage. An agent's "skills" decide which contacts they're allowed to take. Narrow, specialised skills create lots of tiny pools, and in a tiny pool a customer can be waiting while capable agents who simply aren't tagged for that skill sit idle ten feet away. Broaden the skills and those idle agents suddenly become available. Same people, far less waiting.
→ Use the fewest skill groups that still keep each pool comfortably large.
Pool generously. Fragmenting is a tax you pay in waiting.
Here's the principle that surprises everyone: a few big pools serve people dramatically faster than many small pools, even with the same total headcount. Think of a bank with one snaking queue feeding ten tellers versus ten separate lines. The snake wins every time, because one slow customer can't trap you behind the wrong wall.
And it bites hardest exactly when things get busy or bursty. Big pools absorb a sudden rush; small pools choke on it. So the moment you can least afford waiting is the moment fragmentation punishes you most.
Below: take one team of 48 agents and the same total call volume. Now chop them into more and more separate skill groups. Watch what happens to the wait and the walkaways. Nothing else changes. Only the number of walls between idle agents and waiting customers.
48 agents, one workload, sliced into pieces
Held at a sensible ~80% busy overall. The only thing you change is how many separate groups the team is split into.
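The same experiment can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine behind the interactive: it uses Erlang C (no abandonment), splits the 48 agents and the workload into equal, fully walled-off groups, and assumes a five-minute handle time. The name `wait_when_split` is made up for the example.

```python
def erlang_c(agents: int, offered_load: float) -> float:
    """Chance an arriving call has to wait (Erlang C; assumes nobody abandons)."""
    if offered_load >= agents:
        return 1.0
    blocking = 1.0  # Erlang B via the stable recurrence
    for k in range(1, agents + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    occupancy = offered_load / agents
    return blocking / (1.0 - occupancy * (1.0 - blocking))


def wait_when_split(total_agents: int, total_load: float, groups: int,
                    aht_seconds: float = 300.0) -> float:
    """Average wait (seconds) when agents and workload are cut into equal, isolated groups."""
    agents = total_agents // groups   # agents behind each wall
    load = total_load / groups        # Erlangs of work sent to each group
    return erlang_c(agents, load) * aht_seconds / (agents - load)


TOTAL_AGENTS = 48
TOTAL_LOAD = 0.8 * TOTAL_AGENTS  # held at ~80% busy overall

for groups in (1, 2, 3, 4, 6, 8, 12):
    wait = wait_when_split(TOTAL_AGENTS, TOTAL_LOAD, groups)
    print(f"{groups:>2} groups of {TOTAL_AGENTS // groups:>2} agents: avg wait {wait:6.1f} s")
```

Every row has the same headcount, the same workload and the same 80% occupancy. Only the number of walls changes.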
Overlapping queues, and who gets to go first.
Real centres rarely have one queue per team. More often, several queues all point at the same group of agents. One team quietly serves sales, billing, and complaints all at once. That's overlapping queues, and it's pooling done on purpose: the capacity is shared, so the waits drop. The catch is that those queues now compete for the same people.
Priority is how you settle the competition. Hand one queue priority and its callers jump ahead of the others every time an agent frees up. That's exactly what you want for a VIP line or a vulnerable-customer queue. But it has a sharp edge, and most people don't see it coming:
When the centre gets busy, a priority stream keeps cutting to the front, so the lower-priority queue waits longer and longer. Push it far enough and the low queue effectively stops moving. Priority doesn't create capacity. It just decides who suffers the shortage.
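To put rough numbers on that squeeze, here is a hedged sketch using the standard textbook result for a two-class, non-preemptive priority queue. It assumes both call types share the same average handle time, arrivals are random, and nobody abandons, so it overstates waits compared with a real centre. The function name `priority_waits` and the volumes are hypothetical.

```python
def erlang_c(agents: int, offered_load: float) -> float:
    """Chance an arriving call has to wait (Erlang C; assumes nobody abandons)."""
    if offered_load >= agents:
        return 1.0
    blocking = 1.0  # Erlang B via the stable recurrence
    for k in range(1, agents + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    occupancy = offered_load / agents
    return blocking / (1.0 - occupancy * (1.0 - blocking))


def priority_waits(agents: int, aht_seconds: float,
                   vip_per_hour: float, std_per_hour: float) -> tuple[float, float]:
    """Average waits in seconds (VIP, Standard) when VIP always goes first.

    Assumes one shared pool, equal handle times for both classes,
    and that a call in progress is never interrupted.
    """
    vip_share = vip_per_hour * aht_seconds / 3600.0 / agents                   # occupancy from VIP alone
    all_share = (vip_per_hour + std_per_hour) * aht_seconds / 3600.0 / agents  # total occupancy
    if all_share >= 1.0:
        raise ValueError("overloaded: the formula only holds below 100% occupancy")
    # Expected time until an agent frees up, as seen by an arriving call
    base = erlang_c(agents, all_share * agents) * aht_seconds / agents
    vip_wait = base / (1.0 - vip_share)
    std_wait = base / ((1.0 - vip_share) * (1.0 - all_share))
    return vip_wait, std_wait


# Hypothetical example: 8 agents, five-minute calls, 20 VIP calls/hr, rising Standard volume
for std_volume in (40, 55, 65, 72, 75):
    vip, std = priority_waits(8, 300.0, 20.0, std_volume)
    print(f"Standard {std_volume}/hr: VIP waits {vip:5.1f} s, Standard waits {std:6.1f} s")
```

The VIP wait barely moves as volume rises. The Standard wait absorbs almost the entire shortage, which is the sharp edge described above.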
The elegant middle path is ring (or "bullseye") routing: start a contact aimed at the smallest, best-matched group of agents, and widen the eligible pool the longer they wait. Specialists when there's slack, the whole pool when there isn't. You get expertise and pooling, and you decide the trade with a timer instead of a wall.
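The widening logic is simple enough to sketch. The snippet below is an illustration only: the `Ring` class, the `eligible_agents` function, the agent names and the 30 and 60 second thresholds are all hypothetical, and real routing platforms each express this in their own configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ring:
    after_seconds: float      # this ring opens once the contact has waited this long
    agents: frozenset         # agents added to the eligible pool when it opens


def eligible_agents(rings: list, waited_seconds: float) -> set:
    """Return every agent allowed to take a contact that has waited this long."""
    pool = set()
    for ring in rings:
        if waited_seconds >= ring.after_seconds:
            pool |= ring.agents  # rings only ever widen the pool, never shrink it
    return pool


# Hypothetical bullseye for a billing contact
RINGS = [
    Ring(0, frozenset({"ana", "ben"})),    # billing specialists, eligible immediately
    Ring(30, frozenset({"cat", "dev"})),   # generalists trained on billing
    Ring(60, frozenset({"eli", "fay"})),   # everyone else in the pool
]

for waited in (0, 30, 90):
    print(f"after {waited:>2} s: {sorted(eligible_agents(RINGS, waited))}")
```

The timer values are the trade you are making: shorter thresholds favour speed, longer ones favour the specialist match.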
All of this is easier to feel than to read about. So let's actually run it.
A live queue, one event at a time.
This is a proper discrete-event simulation, the same approach the open-source queueing tools use: calls arrive at random, grab a free agent if one's eligible, get served for a random handle time, and give up if they wait past their patience. Nothing here is averaged or pre-baked. It's individual calls and individual agents, playing out in front of you.
Eight agents. Two kinds of call: everyday Standard and impatient VIP. Try the experiments below the stage.
One contact centre, running in real time
Press play. Then push the volume up and watch the queue build. Flip priority on and off. Switch between one shared pool and dedicated splits.
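For the curious, the bones of a discrete-event simulation fit in well under a hundred lines. The sketch below is not the code behind the stage above. It is a simplified Python version with a single call type and one shared pool, and it assumes exponentially distributed gaps between calls, handle times and patience. The function name `simulate` and all default values are hypothetical.

```python
import heapq
import random
from collections import deque


def simulate(agents=8, calls_per_hour=80.0, aht_seconds=300.0,
             mean_patience_seconds=120.0, hours=200.0, seed=1):
    """Run one shared pool; return (mean wait of answered calls in s, fraction abandoned)."""
    rng = random.Random(seed)
    horizon = hours * 3600.0
    arrival_rate = calls_per_hour / 3600.0
    events = []   # heap of (time, sequence, kind, call_id)
    counter = 0   # tie-breaker so the heap never has to compare event kinds

    def schedule(time, kind, call_id):
        nonlocal counter
        heapq.heappush(events, (time, counter, kind, call_id))
        counter += 1

    schedule(rng.expovariate(arrival_rate), "arrive", 0)
    free = agents
    queue = deque()    # waiting call ids, oldest first
    arrived_at = {}    # call id -> arrival time, only for calls still waiting
    total_wait, answered, abandoned, next_id = 0.0, 0, 0, 1

    while events:
        now, _, kind, call = heapq.heappop(events)
        if now > horizon:
            break
        if kind == "arrive":
            schedule(now + rng.expovariate(arrival_rate), "arrive", next_id)
            next_id += 1
            if free > 0:   # an agent is idle: answer immediately, zero wait
                free -= 1
                answered += 1
                schedule(now + rng.expovariate(1.0 / aht_seconds), "finish", call)
            else:          # join the queue and start the patience clock
                arrived_at[call] = now
                queue.append(call)
                schedule(now + rng.expovariate(1.0 / mean_patience_seconds), "abandon", call)
        elif kind == "abandon":
            if call in arrived_at:   # ignore the timer if the call was already answered
                del arrived_at[call]
                queue.remove(call)
                abandoned += 1
        else:  # "finish": the agent takes the longest-waiting call, or goes idle
            if queue:
                nxt = queue.popleft()
                total_wait += now - arrived_at.pop(nxt)
                answered += 1
                schedule(now + rng.expovariate(1.0 / aht_seconds), "finish", nxt)
            else:
                free += 1

    handled = answered + abandoned
    return total_wait / max(answered, 1), abandoned / max(handled, 1)


for volume in (60.0, 80.0, 95.0, 110.0):
    wait, walkaways = simulate(calls_per_hour=volume)
    print(f"{volume:>5.0f} calls/hr: answered after {wait:5.1f} s on average, {walkaways:5.1%} walk away")
```

Because the random seed is fixed, each run repeats exactly. Change `seed` to see how much an individual day can differ from the average.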
The same run, replayed as a moving chart.
The agent stage above shows you the now. This shows you the story over time, the way a trading screen does. While the simulation up there is playing, these charts record every metric live. Save a run, change the settings, save another, and see them all from t=0 on the same chart.
Waits and walkaways, second by second
Press play on the simulation above first. Save a run when you're happy, then change mode or priority and run another. Saved runs overlay from t=0 so you can compare directly.
| Metric | Shared pool | Dedicated split |
|---|---|---|
| VIP wait | — | — |
| Standard wait | — | — |
| Walkaways | — | — |
How busy the team is, against the calls coming in
When the green line pins near the top, you've run out of headroom, and that's exactly when the waits above take off. Same timeline, same scrubber.
Six rules of thumb for the next planning meeting.
- Aim for ~80–85% busy, never 100%. The last slice of utilisation buys you wildly disproportionate pain: waits climb faster and faster as occupancy nears 100%. Headroom is the price of a calm queue.
- Merge before you split. When in doubt, pool. Bigger pools are faster, steadier, and easier to forecast.
- Treat every IVR branch as a cost. Each one shrinks a pool. Make every menu option earn its place.
- Separate "report" from "route". You can slice the numbers finely without slicing the people. Granular reporting, generous pooling.
- Skill breadth is pool size. The width of your agents' skills is the most powerful wait-time lever you own. Use it deliberately.
- When you consolidate skills, watch quality. If waits drop but handle time and resolution hold, you got a genuine win. If quality slips, you made a trade, and you should know its size.
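The first rule is plain arithmetic, and a quick sketch makes it concrete. The Python below turns a forecast into the smallest headcount that keeps occupancy at or under a target. The function name `agents_for_occupancy` is made up for this example, and it counts agents actually on the phones, so shrinkage (breaks, training, absence) would need to be added on top. An occupancy target alone also says nothing about service level in a small pool, so treat the answer as a floor rather than a plan.

```python
import math


def agents_for_occupancy(calls_per_hour: float, aht_seconds: float,
                         target_occupancy: float = 0.85) -> int:
    """Smallest number of agents that keeps average occupancy at or below the target."""
    offered_load = calls_per_hour * aht_seconds / 3600.0  # workload in Erlangs
    # The tiny tolerance stops floating-point noise from adding a spurious extra agent
    return math.ceil(offered_load / target_occupancy - 1e-9)


# Hypothetical forecast: 192 calls an hour at a five-minute handle time
needed = agents_for_occupancy(192, 300.0)
print(f"{needed} agents, running at {192 * 300.0 / 3600.0 / needed:.0%} busy")
```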
The technical translation.
Everything above is the same theory your analysts, WFM team and platform engineers use. Here's the bridge so the business and the engine room are pointing at the same things. Hand this section to whoever builds the routing.
| Business term | Technical term | Symbol |
|---|---|---|
| Demand / calls arriving | Arrival rate (Poisson) | λ |
| Handle time | Mean service time | 1 / μ |
| Agents / servers | Channels | c |
| Offered demand | Offered load (Erlangs) | A = λ / μ |
| How busy agents are | Utilisation / traffic intensity | ρ = A / c |
| Chance a customer waits | Probability of delay (Erlang C) | Pw |
| Average wait / speed of answer | Mean waiting time in queue | Wq |
| % answered in target | Service level at time T | SL(T) |
| Work in the system | Mean number in system | L |
| Walkaways | Abandonment (Erlang A / patience) | θ |
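To show how the symbols in the table connect, here is a small worked example in Python. It is a sketch under the classic Erlang C assumptions (no abandonment). The 20 agents and five-minute handle time echo the earlier example, while the volume of 192 calls an hour and the 20-second target are hypothetical inputs chosen for illustration.

```python
import math


def erlang_c(agents: int, offered_load: float) -> float:
    """Probability of delay, Pw (Erlang C; assumes nobody abandons)."""
    if offered_load >= agents:
        return 1.0
    blocking = 1.0  # Erlang B via the stable recurrence
    for k in range(1, agents + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    occupancy = offered_load / agents
    return blocking / (1.0 - occupancy * (1.0 - blocking))


calls_per_hour = 192.0    # hypothetical demand
aht_seconds = 300.0       # five-minute handle time
c = 20                    # agents
target_seconds = 20.0     # hypothetical service-level target T

lam = calls_per_hour / 3600.0   # arrival rate, calls per second
mu = 1.0 / aht_seconds          # service rate per agent, calls per second
A = lam / mu                    # offered load in Erlangs
rho = A / c                     # utilisation
Pw = erlang_c(c, A)             # chance a customer waits
Wq = Pw / (c * mu - lam)        # mean wait in queue, seconds
SL = 1.0 - Pw * math.exp(-(c * mu - lam) * target_seconds)  # share answered within T
L = lam * (Wq + 1.0 / mu)       # mean number in system, by Little's law

print(f"A = {A:.1f} Erlangs, rho = {rho:.0%}, Pw = {Pw:.1%}")
print(f"Wq = {Wq:.1f} s, SL({target_seconds:.0f}s) = {SL:.1%}, L = {L:.1f} calls")
```

The abandonment row is deliberately missing here, because Erlang C has no place for it.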
A fair-warning footnote. The classic Erlang C model assumes nobody hangs up. Real customers do. Once you add patience, the proper model is Erlang A (the abandonment row in the table above), which predicts shorter waits than Erlang C because impatient callers leave and relieve the queue. The walkaway figures in the interactives use a simple patience approximation for illustration. For production staffing, model abandonment explicitly. Treat the numbers here as the shape of the truth, not a quote to staff against.
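For anyone who wants to see the gap between the two models, here is a hedged Python sketch. It solves the Erlang A queue numerically as a birth-death chain rather than with a closed formula, assumes exponentially distributed patience, and truncates the queue at a large size. The function names and the two-minute average patience are hypothetical choices for illustration, not the approximation used in the interactives.

```python
def erlang_c_wait(agents: int, calls_per_hour: float, aht_seconds: float) -> float:
    """Mean wait in seconds under Erlang C (nobody abandons)."""
    load = calls_per_hour * aht_seconds / 3600.0
    if load >= agents:
        return float("inf")
    blocking = 1.0  # Erlang B via the stable recurrence
    for k in range(1, agents + 1):
        blocking = load * blocking / (k + load * blocking)
    p_wait = blocking / (1.0 - (load / agents) * (1.0 - blocking))
    return p_wait * aht_seconds / (agents - load)


def erlang_a(agents: int, calls_per_hour: float, aht_seconds: float,
             mean_patience_seconds: float, max_in_system: int = 2000):
    """Return (chance of waiting, mean wait in s over all callers, fraction abandoning)."""
    lam = calls_per_hour / 3600.0         # arrivals per second
    mu = 1.0 / aht_seconds                # service rate per agent
    theta = 1.0 / mean_patience_seconds   # abandonment rate per waiting caller
    weights = [1.0]                       # unnormalised chance of n calls in the system
    for n in range(1, max_in_system + 1):
        leaving = min(n, agents) * mu + max(n - agents, 0) * theta
        weights.append(weights[-1] * lam / leaving)
    total = sum(weights)
    probs = [w / total for w in weights]
    p_wait = sum(probs[agents:])
    queue_length = sum((n - agents) * p for n, p in enumerate(probs) if n > agents)
    mean_wait = queue_length / lam               # Little's law applied to the queue
    abandon_fraction = theta * queue_length / lam
    return p_wait, mean_wait, abandon_fraction


# 20 agents, five-minute calls, 192 calls/hr (hypothetical), two-minute average patience
no_patience_model = erlang_c_wait(20, 192.0, 300.0)
p_wait, wait, walkaways = erlang_a(20, 192.0, 300.0, 120.0)
print(f"Erlang C wait: {no_patience_model:.1f} s")
print(f"Erlang A wait: {wait:.1f} s, with {walkaways:.1%} abandoning")
```

Push the patience up towards hours and the Erlang A wait converges on the Erlang C figure, which is a handy sanity check on any abandonment model.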