After running my pickleball league with Glicko-2 for over a month, I realized the system had problems. So I did what any reasonable person would do: I threw it out and rebuilt it from scratch with an ELO system.
And yes, I happen to be the biggest beneficiary of the change. Coincidence? Probably. Let me explain the math, and you can be the judge.
The Problem: Glicko-2 Was Overkill
Glicko-2 is a sophisticated rating system designed for competitive chess. It tracks three values per player:
- Rating — Your skill estimate (default: 1500)
- Rating Deviation — How uncertain the system is about your skill
- Volatility — How consistent you are
The math involves converting to different scales, computing probabilities with hyperbolic functions, and solving iteratively for new volatility. It’s clever, but for a casual league of six players, it’s like bringing a sports car to a parking lot.
But the real problem was this: I added a margin bonus to account for wins by different margins (winning 11-9 vs 11-2). The formula?
weighted_score = base_score + tanh(margin/11 × 0.3) × (base_score - 0.5)
Translation: I scaled the margin by an arbitrary constant (why 0.3? No particular reason), took its hyperbolic tangent, and called it science.
This is what’s known as “making stuff up.” It had no theoretical basis and was impossible to explain to players.
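For the record, here's roughly how that bonus read in code. This is my reconstruction from the formula above; the convention that `base_score` is 1.0 for a win and 0.0 for a loss is an assumption.

```python
import math

def weighted_score(points_for: int, points_against: int) -> float:
    # Old margin bonus: nudge the binary win/loss score by the
    # hyperbolic tangent of the point margin, scaled by a magic 0.3.
    base_score = 1.0 if points_for > points_against else 0.0
    margin = abs(points_for - points_against)
    return base_score + math.tanh(margin / 11 * 0.3) * (base_score - 0.5)
```

Note that the result can exceed 1.0 for a blowout win and dip below 0.0 for a blowout loss, which is one more hint that the formula was improvised rather than derived.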
The Doubles Problem
The old system calculated team ratings by averaging both partners’ ratings. Sounds reasonable, right?
Until you think about it: If you (1400) play with a strong partner (1700) against two 1550s, the system thinks it’s an even match. But you were carried by a stronger player! Winning that match shouldn’t boost your rating as much as winning with a weaker partner.
The system didn’t account for partner strength, making it unfair for everyone.
Enter: Pure ELO
ELO is elegantly simple. Every player has one number representing their skill. When two players compete:
- Calculate the probability that one player beats the other based on rating difference
- Compare expected performance to actual performance
- Adjust ratings based on the difference
The key formula is:
Expected Win Probability = 1 / (1 + 10^((opponent_rating - your_rating) / 400))
If you’re 1500 and your opponent is 1500, you should win 50% of the time. If you’re 1600 and they’re 1500, you should win about 64% of the time. Simple.
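As a sketch in Python (the function name `expected_score` is mine, not from the league code):

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard ELO logistic curve: win probability from the rating gap."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))

print(expected_score(1500, 1500))            # 0.5
print(round(expected_score(1600, 1500), 2))  # 0.64
```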
After a match:
Rating Change = K × (Actual Performance - Expected Performance)
Where K = 32 (how much weight each match carries) and Actual Performance is your per-point performance:
Actual Performance = Points Scored / Total Points Played
Win 11-9? That’s 0.55 (55% of points). Win 11-2? That’s 0.846 (84.6%). This captures match quality far better than binary win/loss.
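Putting the update together, a minimal sketch (the `rating_change` function and its signature are my naming, not the league code):

```python
K = 32  # per-match weight

def expected_score(rating: float, opponent_rating: float) -> float:
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))

def rating_change(rating: float, opponent_rating: float,
                  points_for: int, points_against: int) -> float:
    # Actual performance is your share of all points played, not a binary W/L.
    actual = points_for / (points_for + points_against)
    return K * (actual - expected_score(rating, opponent_rating))

# Two evenly matched 1500s:
print(round(rating_change(1500, 1500, 11, 9), 2))  # 1.6  — a squeaker barely moves you
print(round(rating_change(1500, 1500, 11, 2), 2))  # 11.08 — a blowout moves you a lot
```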
The Secret Sauce: The Effective Opponent Formula
In doubles, we use:
Effective Opponent Rating = Opponent1 + Opponent2 - Your Teammate
Why this works:
If your teammate is strong, the effective opponent rating drops—because your teammate made the match easier. If your teammate is weak, the effective opponent rating rises—because you were undermanned.
Beating 1500-rated opponents with a 1600-rated partner? Effective opponent: 1400. You gain less because your partner carried you.
Beating 1500-rated opponents with a 1400-rated partner? Effective opponent: 1600. You gain more because you did the heavy lifting.
This is fair.
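Here's the doubles update as a sketch. The numbers below assume you're rated 1500; function names are mine.

```python
K = 32

def expected_score(rating: float, opponent_rating: float) -> float:
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))

def doubles_rating_change(rating: float, teammate: float,
                          opp1: float, opp2: float,
                          points_for: int, points_against: int) -> float:
    # A strong teammate lowers the effective opponent; a weak one raises it.
    effective_opponent = opp1 + opp2 - teammate
    actual = points_for / (points_for + points_against)
    return K * (actual - expected_score(rating, effective_opponent))

# Same 11-9 win over two 1500s, you at 1500, different partners:
print(round(doubles_rating_change(1500, 1600, 1500, 1500, 11, 9), 2))  # -2.88: carried
print(round(doubles_rating_change(1500, 1400, 1500, 1500, 11, 9), 2))  # 6.08: you carried
```

Notice the first case: a narrow win while being carried can actually cost you points, because the system expected you to do better than 55% of points against an effectively weaker opponent.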
The Migration: Before and After
Here’s where things get spicy. I replayed all 29 historical matches through the new ELO system:
| Player | Old Glicko-2 | New ELO | Change | Matches |
|---|---|---|---|---|
| Andrew Stricklin | 1651 | 1538 | −113 | 19 |
| David Pabst | 1562 | 1522 | −40 | 11 |
| Jacklyn Wyszynski | 1557 | 1514 | −43 | 9 |
| Eliana Crew | 1485 | 1497 | +11 | 13 |
| Krzysztof Radziszeski | 1473 | 1476 | +3 | 25 |
| Dane Sabo | 1290 | 1449 | +159 | 25 |
Observations
The Rating Spread Compressed
The old system spread players across 361 rating points. The new system compresses them into 89 points. This makes sense—we’re a recreational group, not chess grandmasters. The new system rates us fairly within a tighter band.
The Winners
- Dane Sabo: +159 points. The old system penalized him for losses with weaker partners. The effective opponent formula gives credit for “carrying.” (Purely coincidental that I benefit from my own math.)
- Eliana Crew: +11 points
- Krzysztof Radziszeski: +3 points
The Losers
- Andrew Stricklin: −113 points. Still ranked #1, but the old system over-credited wins with strong partners.
- Jacklyn Wyszynski: −43 points
- David Pabst: −40 points
A Note on Conflicts of Interest
You may notice that the system designer (me) is also the biggest beneficiary of the new ratings, gaining a convenient 159 points.
I want to assure you this is purely coincidental and the result of rigorous mathematical analysis, not at all influenced by the fact that I was tired of being ranked last.
The new formulas are based on sound theoretical principles that just happen to conclude I was being unfairly penalized all along.
Trust the math. 😉
Why This System Works
For a small league:
- Simple to understand (one rating per player)
- Fair to individual skill (per-point scoring)
- Respects partnership (effective opponent formula)
- Transparent (you can calculate rating changes yourself)
- Fast convergence (5-10 matches to stabilize a rating)
The bottom line: Your rating now reflects your true skill more accurately than before. Even if it means Dane finally looks respectable.