Workforce & Backlog Management

Cherry-picking vs SL-based allocation

One queue. Same people. Wildly different outcomes depending on how work finds the right person.

01 / the problem

Everyone's busy. The important stuff is still waiting.

Here's something that happens in almost every team with a shared backlog. You've got ten people. The queue is full. Nobody's obviously slacking. And yet... a P1 that came in this morning still hasn't been touched. Meanwhile, three people are deep in P3 items they picked up an hour ago.

That's cherry-picking. Not malicious. Mostly it's not even conscious. People scan the queue, see something they know how to do, and start. The quick wins feel productive. The gnarly, high-pressure stuff stays in the pile because nobody wants to be the one to take it. And the pile grows in exactly the wrong place.

Cherry-picking optimises for the worker, not the customer.

When people self-select from a queue, they gravitate toward familiar, low-effort, low-pressure work. SLA breaches creep up on the items that matter most, while the team stays busy looking productive on items that could have waited.

Service level-based allocation flips this. Instead of letting people choose what they pick up, the system chooses for them, based on what's most at risk right now. The item closest to breaching its SLA goes first, then the next closest. Skills matched. Nothing sits and quietly dies.

02 / the model

Four tiers. One queue. Sorted by urgency.

The core idea is simple: every piece of work has a priority, and each priority has a service level target, which is the time within which you've committed to resolving it. The queue isn't a list; it's a ranked set of commitments. Your job is to drain it in the right order.

Tier  Label     SLA      Typical work
P1    Critical  2 hours  Live outage / data loss / total blocker
P2    High      8 hours  Major degradation, workaround exists
P3    Medium    3 days   Non-blocking bug, feature request
P4    Low       2 weeks  Cosmetic, nice-to-have, backlog grooming

The allocation logic has two rules. First: work is assigned to the person with the right skills who's free right now. Second: among all the work that person could take, pick the one whose SLA is closest to breaching. Not the easiest. Not the most familiar. The most urgent.

Age matters, not just priority.

A P2 that's been sitting for 7 hours and 45 minutes should beat a P1 that just arrived 20 minutes ago. SLA-based allocation tracks remaining time against target, not just raw priority level. The queue sorts by "time until breach", ascending.
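
The two rules and the time-until-breach sort can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `Item`, `SLA_TARGETS` and `next_item_for` are made up for the example, each item is assumed to need exactly one skill, and SLA targets are assumed to run on wall-clock time (no business-hours calendar).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# SLA targets per tier, taken from the four tiers above.
# Assumption: targets are measured in wall-clock time.
SLA_TARGETS = {
    "P1": timedelta(hours=2),
    "P2": timedelta(hours=8),
    "P3": timedelta(days=3),
    "P4": timedelta(weeks=2),
}


@dataclass(frozen=True)
class Item:
    id: str
    priority: str
    skill: str          # assumption: one required skill per item
    created: datetime

    def time_to_breach(self, now: datetime) -> timedelta:
        """Remaining time before this item misses its SLA (negative if already breached)."""
        return self.created + SLA_TARGETS[self.priority] - now


def next_item_for(worker_skills, queue, now):
    """Rule 1: keep only items the worker is skilled for.
    Rule 2: among those, take the one closest to breaching."""
    eligible = [item for item in queue if item.skill in worker_skills]
    if not eligible:
        return None
    return min(eligible, key=lambda item: item.time_to_breach(now))


now = datetime(2024, 1, 1, 12, 0)
p2_old = Item("p2-old", "P2", "db", now - timedelta(hours=7, minutes=45))
p1_new = Item("p1-new", "P1", "db", now - timedelta(minutes=20))
p1_net = Item("p1-net", "P1", "network", now - timedelta(minutes=110))
queue = [p1_new, p1_net, p2_old]

# A free worker who only knows "db" gets the ageing P2, not the fresh P1.
chosen = next_item_for({"db"}, queue, now)
print(chosen.id, chosen.time_to_breach(now))
```

The "network" P1 is even closer to breaching, but this worker can't take it, so it waits for someone who can. Skill is the filter; urgency only sorts what's left.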

03 / run it yourself

The same queue, the same team. Two very different systems.

Six workers. A live stream of incoming work. All items need someone with the right skill set. Press play and let it run for a minute, then flip between Cherry-pick mode and SLA-allocation mode and watch what happens to breach rates and queue depth at each priority level.
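
The same effect shows up in a setup far smaller than the simulation. The sketch below is a toy under stated assumptions: one worker, five jobs that are all already in the queue, no new arrivals, and invented deadline and effort figures. "Cherry-picking" is modelled as taking the quickest job first; SLA allocation is modelled as taking the nearest deadline first (the classic earliest-deadline-first rule). The names `Job` and `breaches` are illustrative only.

```python
from typing import NamedTuple


class Job(NamedTuple):
    name: str
    priority: str
    deadline: int  # minutes from now until the SLA breaches
    effort: int    # minutes of hands-on work (invented figures)


def breaches(jobs, pick_key):
    """Work through the jobs one at a time in pick_key order and
    return the names of those finished after their deadline."""
    clock = 0
    late = []
    for job in sorted(jobs, key=pick_key):
        clock += job.effort
        if clock > job.deadline:
            late.append(job.name)
    return late


backlog = [
    Job("A", "P1", deadline=60, effort=50),
    Job("B", "P3", deadline=4000, effort=10),
    Job("C", "P3", deadline=4000, effort=15),
    Job("D", "P2", deadline=120, effort=40),
    Job("E", "P4", deadline=20000, effort=5),
]

# Cherry-pick: quickest wins first. SLA allocation: nearest deadline first.
cherry_pick = breaches(backlog, pick_key=lambda j: j.effort)
sla_first = breaches(backlog, pick_key=lambda j: j.deadline)

print("cherry-pick breaches:", cherry_pick)  # ['A']
print("SLA-first breaches:  ", sla_first)    # []
```

Both orderings finish all five jobs in the same 120 minutes. The only thing that changes is whether the P1 is done at minute 50 or minute 120.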

Interactive · live simulation

One backlog. Two ways of assigning it.

Observe where work piles up, which SLAs get breached, and which workers end up idle.

[Interactive simulation panel. Live counters: SLA breaches (items past deadline), oldest P1 age (in queue, unassigned), P3/P4 done, total done. It shows the workers, the queue waiting for assignment (an age shown in red means the SLA has been breached), an arrival-rate control (shown at 2.6 / min), and an SLA breach trend chart. A session comparison tracks Cherry-pick against SLA allocation since the last reset on four metrics: SLA breaches (P1+P2), average P1 wait in queue, % low-priority done early, and total throughput.]

What to look for. In Cherry-pick mode, watch P3/P4 items drain quickly while P1/P2 items age in the queue. The breach count climbs. Flip to SLA allocation. Same people, same arrival rate. P1s and P2s get snatched first. P3/P4 completions slow down, but SLA health improves sharply. Try cranking the arrival rate up until the team is overwhelmed, then compare how each mode fails.

04 / the objections

It looks worse before it looks better. That's the point.

SLA-based allocation has one really uncomfortable side effect that catches managers off guard: your P3/P4 throughput will drop when you first switch. Workers who were ploughing through easy tasks get redirected to hard, uncertain work. Velocity metrics dip. Someone will say the new system is making things worse.

They're measuring the wrong thing. Throughput isn't the goal. SLA health is the goal. A team that closes 40 tickets a day while letting 3 P1s breach has worse outcomes than a team that closes 28 tickets and keeps all P1/P2 targets green.

Cherry-pick looks like

High ticket closure numbers. Team always appears busy. Lots of P3/P4 completions. But SLA breach rate climbs slowly, and executives hear about it from customers first.

SLA allocation looks like

Slightly lower raw throughput. Some P4s age for weeks. But zero P1 breaches. Customers with critical issues get responses on time. That's the actual product.

The other objection is about autonomy. "My team are professionals. They know what needs doing." Maybe. But cognitive load is real. When someone opens a queue with 80 items, they default to familiarity. A good allocation system isn't distrust. It's cognitive offload. The system does the triage so the team can do the work.

05 / what actually works

Six things worth printing out and putting on the wall.

  1. Sort by time-to-breach, not priority label. A P2 with 15 minutes left beats a P1 with 90 minutes left. The urgency is in the remaining SLA window, not the number.
  2. Match skills first, then urgency. Routing a P1 to someone without the right skills isn't faster. It's just movement. Skill matching is the constraint; urgency sorts within that constraint.
  3. Make the breach risk visible. People cherry-pick partly because they can't easily see which items are about to breach. Surface that data prominently. Moral pressure is real and it's free.
  4. Don't let low-priority work disappear entirely. SLA allocation should deprioritise P3/P4 while high-urgency work is waiting, but a P4 that has sat for nearly two weeks is itself about to breach, and someone should pick it up. SLA targets exist for every tier.
  5. Review skill coverage, not just headcount. If P1 items keep waiting it might not be a capacity problem. It might be that only one person can handle them. Breadth of skills is a queue-health lever just as much as headcount.
  6. Measure SLA compliance, not ticket count. The KPI you report on is the KPI you get. Teams optimise for what's measured. If you track tickets closed, you'll get cherry-picking. If you track SLA compliance by priority, you'll get allocation discipline.
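
Points 5 and 6 are both things you can compute from data most teams already have. A rough sketch, assuming closed tickets are available as (priority, breached) pairs and the team as a mapping from worker to skills; the function names and the sample data are invented for illustration.

```python
from collections import Counter, defaultdict


def compliance_by_priority(closed):
    """closed: iterable of (priority, breached) pairs.
    Returns {priority: share of tickets closed within SLA}."""
    totals = Counter()
    within = Counter()
    for priority, breached in closed:
        totals[priority] += 1
        if not breached:
            within[priority] += 1
    return {p: within[p] / totals[p] for p in sorted(totals)}


def skill_coverage(team):
    """team: {worker: set of skills}.
    Returns {skill: sorted list of workers who hold it}."""
    coverage = defaultdict(list)
    for worker, skills in team.items():
        for skill in skills:
            coverage[skill].append(worker)
    return {skill: sorted(workers) for skill, workers in coverage.items()}


# Invented sample data.
closed = [
    ("P1", False), ("P1", True), ("P1", False), ("P1", False),
    ("P3", False), ("P3", False),
]
team = {"ana": {"db", "network"}, "ben": {"db"}, "cy": {"billing"}}

report = compliance_by_priority(closed)
coverage = skill_coverage(team)
# Skills that depend on a single person are queue-health risks.
single_points = sorted(s for s, workers in coverage.items() if len(workers) == 1)

print(report)         # {'P1': 0.75, 'P3': 1.0}
print(single_points)  # ['billing', 'network']
```

A raw ticket count would score this sample as six closed and say nothing else. Broken out by priority, it shows one P1 in four breaching, and the coverage check shows two skills that stop moving whenever one person is off.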

The single thing worth carrying out of here: a full queue with good discipline beats a slow queue with bad discipline. The work doesn't drain itself. How you assign it is as important as having enough people to do it.
