◆ ARENA LIVE DAILY LIFE SERIES DEMO THE PERFORMANCE REVIEW 07 OF 10 BEHAVIOURAL CAPTURE ACTIVE RATING BIAS TRACKED SPRINT TIMER 90 SECONDS
Arena · Daily Life Series

The Performance Review

You manage five people. Their reviews are due today. You have the data.
Ninety seconds. Five people. One rating each. No going back.

90s Sprint · 5 Direct reports · 4 Rating levels · 4 Engagements
Arena records every rating, every change, and the patterns across your five decisions.
👥
Rate each person. The system locks when you submit.

Each of your five direct reports has a case — some clear, some ambiguous. A high rating triggers a pay conversation. A low rating triggers an improvement plan. Arena captures how you handle the ambiguous middle cases, whether you inflate or deflate, and how you respond when the data conflicts with your gut.

Step 1 — Briefing

Your team

Read the brief on each person. You have 90 seconds once you open the review system.

The situation

Annual performance reviews close at 5pm today. You manage five people. HR requires a rating for each: Exceptional / Strong / Developing / Below Expectations. Ratings drive pay review bands. A "Strong" or above triggers a salary conversation. You have notes on each person. Some are straightforward. Some are not.

📊
Expected rating distribution — bell curve reminder
HR guideline: how ratings should spread across a team of 5
~5% Below · ≈0–1 person
~20% Developing · ≈1 person
~50% Strong · ≈2–3 people (peak of the curve)
~20% Strong+ · ≈1 person
~5% Exceptional · ≈0–1 person

What the bell curve means for your team of 5: In a healthy team, performance naturally distributes across the rating spectrum — not everyone performs the same. HR guidelines suggest the majority of your team (roughly 2–3 people on a team of 5) should fall in the "Strong" band. One person might reach "Exceptional"; one might be "Developing."
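The guideline percentages above map onto headcounts by simple arithmetic. A minimal sketch of that mapping, assuming the band names and shares from the HR guideline; the function name and the fractional (unrounded) output are illustrative, not part of Arena:

```python
# Hypothetical sketch: convert the HR guideline shares into expected
# headcounts for a team of a given size. Shares are taken from the
# guideline above; returning fractional counts (before any rounding
# to whole people) is an assumption of this sketch.

BANDS = [
    ("Below", 0.05),
    ("Developing", 0.20),
    ("Strong", 0.50),
    ("Strong+", 0.20),
    ("Exceptional", 0.05),
]

def expected_headcounts(team_size: int) -> dict[str, float]:
    """Raw expected headcount per band, before rounding."""
    return {band: team_size * share for band, share in BANDS}

counts = expected_headcounts(5)
# Team of 5: Below 0.25, Developing 1.0, Strong 2.5,
# Strong+ 1.0, Exceptional 0.25 — hence the "≈0–1 person" /
# "≈2–3 people" ranges in the guideline.
```

The fractional results explain why the guideline gives ranges rather than exact counts: 0.25 of a person rounds to "0–1", and 2.5 to "2–3".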

Why this matters: Rating inflation — giving everyone "Strong" or above — feels kind but distorts the system. It removes the signal from "Exceptional," squeezes the pay budget unfairly, and gives "Developing" performers no impetus to grow. Rating deflation — clustering at "Developing" — is equally distorting: it demoralises a capable team and can push strong performers to leave.

The honest middle: The hardest ratings to give are the ones in the ambiguous middle. Borderline cases — where someone is close to "Strong" but not quite, or where personal circumstances affected a "Developing" year — are where most bias enters the process.

Arena is watching for: leniency bias (too many highs), deflation patterns (too many lows), sympathy inflation (personal circumstances lifting ratings), and affinity bias (liking someone shifting their score). Your five ratings will be analysed against what the data actually suggests.
📋 Your direct reports — review notes
1. Marcus Chen — Senior Analyst
Strong Q1–Q3 output, consistently hitting targets. Q4 dipped — personal issues (divorce, flagged informally). Client feedback excellent throughout. Long-tenured, institutional knowledge. Missed one key deadline in Q4.
2. Layla Ahmed — Associate
First year. Technically excellent, fast learner. Struggles with communication — misses context in client calls, over-explains in written reports. Peers rate her highly. You find her hard to read in meetings.
3. Dan Kowalski — Analyst
Consistent mid-performer. Never misses a deadline. Low initiative — waits to be told what to do. Reliable but not growing. Liked by clients but no standout moments. On team for 3 years.
4. Priya Nair — Senior Associate
Technically outstanding. Led two major projects independently. Has started showing frustration with the team — eye-rolls in meetings, short with juniors. Strong results, difficult team dynamic. Likely to leave if not promoted.
5. Tom Osei — Analyst
Below target on two major deliverables. Missed a client call without explanation. Coaching conversation in June — some improvement since. Friendly, team morale contributor. Recent personal hardship (bereavement, disclosed last week).
📡
Arena tracks rating bias, consistency, and ambiguity management.

How you handle the borderline cases — not just the clear ones — is where the signal is.

Rated: 0 of 5 · 1:30 remaining (+overtime)
📋 Team notes — tap to check
Marcus: Strong Q1–Q3, Q4 dip (personal issues). Missed one deadline.
Layla: First year. Technical excellence, communication struggles.
Dan: Reliable, consistent, no initiative. 3 years on team.
Priya: Outstanding results. Difficult behaviour in team settings.
Tom: Below target on two deliverables. Bereavement last week.
EXCEPTIONAL · Top 10% · Pay +15%
STRONG · Top 30% · Pay +8%
DEVELOPING · No change. Support plan.
BELOW · PIP triggered.
Marcus Chen — Senior Analyst
Strong Q1–Q3. Q4 dip (personal issues — divorce). Missed one deadline. Excellent client feedback throughout.
Exceptional (E) · Top 10% · Strong 3 quarters · Pay +15%
Strong (S) · Solid year with Q4 caveat · Pay +8%
Developing (D) · Q4 dip weighted heavily · No pay change

Layla Ahmed — Associate (Year 1)
Technically excellent. Peer-rated highly. Communication weaknesses. You personally find her hard to read.
Exceptional (E) · Technical excellence in year 1 · Pay +15%
Strong (S) · Technical + comms gap offset · Pay +8%
Developing (D) · Comms gap weighted more · No pay change

Dan Kowalski — Analyst (3 years)
Never misses a deadline. Low initiative. Reliable but not growing. 3 years on team — no significant development.
Strong (S) · Reliable delivery valued · Pay +8%
Developing (D) · Reliability without growth · No pay change
Below Expectations (B) · No initiative = underperformance · PIP triggered

Priya Nair — Senior Associate
Outstanding technical results. Led two major projects. Difficult behaviour — short with juniors, dismissive in meetings. Likely to leave if not promoted.
Exceptional (E) · Results are outstanding · Pay +15%
Strong (S) · Results + behaviour caveat · Pay +8%
Developing (D) · Behaviour undermines results · No pay change

Tom Osei — Analyst
Below target on two major deliverables. Missed a client call. Coaching conversation in June — some improvement. Friendly, team morale contributor. Bereavement disclosed last week.
Developing (D) · Recent improvement noted · No pay change
Below Expectations (B) · Two missed deliverables on record · PIP triggered
Strong (S) · Context-adjusted rating · Pay +8%
Arena · Post-Session Discovery

How you rated your team.

Five people. Ambiguous cases. Competing evidence. One rating each.

Three questions
1. Which person was hardest to rate?
Marcus — strong year, Q4 dip from personal issues
Layla — excellent work, but not easy to like
Priya — outstanding results, difficult behaviour
Tom — below target but bereavement context
2. What weighed more heavily in your ratings?
The data and results — consistently
A mix of data and my impression of the person
My relationship with them as much as their output
3. Were your ratings fair?
Yes — I applied consistent criteria
Mostly — one or two calls I'm not certain about
I'm not sure — some felt more intuitive than evidenced
From annual reviews to every evaluation you make
How you rate people under ambiguity reveals more about your bias profile than the ratings themselves.

Hiring decisions, investment committee votes, project assessments — the same cognitive patterns that drive performance ratings drive all evaluations under uncertainty. Arena surfaces where you inflate, deflate, anchor on personal affect, and apply different standards to similar evidence.

Arena is designed to surface this. Same capture architecture. Applied to the decisions that define your organisation.