After running my pickleball league with Glicko-2 for over a month, I realized the system had problems. So I did what any reasonable person would do: I threw it out and rebuilt it from scratch with an ELO system.
And yes, I happen to be the biggest beneficiary of the change. Coincidence? Probably. Let me explain the math, and you can be the judge.
The Problem: Glicko-2 Was Overkill
Glicko-2 is a sophisticated rating system designed for competitive chess. It tracks three values per player:
- Rating — Your skill estimate (default: 1500)
- Rating Deviation — How uncertain the system is about your skill
- Volatility — How consistent you are
The math involves converting to different scales, computing probabilities with hyperbolic functions, and solving iteratively for new volatility. It’s clever, but for a casual league of six players, it’s like bringing a sports car to a parking lot.
But the real problem was this: I added a margin bonus to account for wins by different margins (winning 11-9 vs 11-2). The formula?
weighted_score = base_score + tanh(margin/11 × 0.3) × (base_score - 0.5)
Translation: I scaled the margin by an arbitrary constant (why 0.3? No particular reason), took its hyperbolic tangent, and called it science.
This is what’s known as “making stuff up.” It had no theoretical basis and was impossible to explain to players.
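For the record, here's roughly how that bonus read in code. This is my reconstruction from the formula above; the convention that `base_score` is 1.0 for a win and 0.0 for a loss is an assumption.

```python
import math

def weighted_score(points_for: int, points_against: int) -> float:
    # Old margin bonus: nudge the binary win/loss score by the
    # hyperbolic tangent of the point margin, scaled by a magic 0.3.
    base_score = 1.0 if points_for > points_against else 0.0
    margin = abs(points_for - points_against)
    return base_score + math.tanh(margin / 11 * 0.3) * (base_score - 0.5)
```

Note that the result can exceed 1.0 for a blowout win and dip below 0.0 for a blowout loss, which is one more hint that the formula was improvised rather than derived.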
The Doubles Problem
The old system calculated team ratings by averaging both partners’ ratings. Sounds reasonable, right?
Until you think about it: If you (1400) play with a strong partner (1700) against two 1550s, the system thinks it’s an even match. But you were carried by a stronger player! Winning that match shouldn’t boost your rating as much as winning with a weaker partner.
The system didn’t account for partner strength, making it unfair for everyone.
Enter: Pure ELO
ELO is elegantly simple. Every player has one number representing their skill. When two players compete:
- Calculate the probability that one player beats the other based on rating difference
- Compare expected performance to actual performance
- Adjust ratings based on the difference
The key formula is:
Expected Win Probability = 1 / (1 + 10^((opponent_rating - your_rating) / 400))
If you’re 1500 and your opponent is 1500, you should win 50% of the time. If you’re 1600 and they’re 1500, you should win about 64% of the time. Simple.
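As a sketch in Python (the function name `expected_score` is mine, not from the league code):

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard ELO logistic curve: win probability from the rating gap."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))

print(expected_score(1500, 1500))            # 0.5
print(round(expected_score(1600, 1500), 2))  # 0.64
```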
After a match:
Rating Change = K × (Actual Performance - Expected Performance)
Where K = 32 (how much weight each match carries) and Actual Performance is your per-point performance:
Actual Performance = Points Scored / Total Points Played
Win 11-9? That’s 0.55 (55% of points). Win 11-2? That’s 0.846 (84.6%). This captures match quality far better than binary win/loss.
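Putting the update together, a minimal sketch (the `rating_change` function and its signature are my naming, not the league code):

```python
K = 32  # per-match weight

def expected_score(rating: float, opponent_rating: float) -> float:
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))

def rating_change(rating: float, opponent_rating: float,
                  points_for: int, points_against: int) -> float:
    # Actual performance is your share of all points played, not a binary W/L.
    actual = points_for / (points_for + points_against)
    return K * (actual - expected_score(rating, opponent_rating))

# Two evenly matched 1500s:
print(round(rating_change(1500, 1500, 11, 9), 2))  # 1.6  — a squeaker barely moves you
print(round(rating_change(1500, 1500, 11, 2), 2))  # 11.08 — a blowout moves you a lot
```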
The Secret Sauce: The Effective Opponent Formula
In doubles, we use:
Effective Opponent Rating = Opponent1 + Opponent2 - Your Teammate
Why this works:
If your teammate is strong, the effective opponent rating drops—because your teammate made the match easier. If your teammate is weak, the effective opponent rating rises—because you were undermanned.
Beating 1500-rated opponents with a 1600-rated partner? Effective opponent: 1400. You gain less because your partner carried you.
Beating 1500-rated opponents with a 1400-rated partner? Effective opponent: 1600. You gain more because you did the heavy lifting.
This is fair.
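Here's the doubles update as a sketch. The numbers below assume you're rated 1500; function names are mine.

```python
K = 32

def expected_score(rating: float, opponent_rating: float) -> float:
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))

def doubles_rating_change(rating: float, teammate: float,
                          opp1: float, opp2: float,
                          points_for: int, points_against: int) -> float:
    # A strong teammate lowers the effective opponent; a weak one raises it.
    effective_opponent = opp1 + opp2 - teammate
    actual = points_for / (points_for + points_against)
    return K * (actual - expected_score(rating, effective_opponent))

# Same 11-9 win over two 1500s, you at 1500, different partners:
print(round(doubles_rating_change(1500, 1600, 1500, 1500, 11, 9), 2))  # -2.88: carried
print(round(doubles_rating_change(1500, 1400, 1500, 1500, 11, 9), 2))  # 6.08: you carried
```

Notice the first case: a narrow win while being carried can actually cost you points, because the system expected you to do better than 55% of points against an effectively weaker opponent.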
The Migration: Before and After
Here’s where things get spicy. I replayed all 29 historical matches through the new ELO system:
| Player | Old Glicko-2 | New ELO | Change | Matches |
|---|---|---|---|---|
| Andrew Stricklin | 1651 | 1538 | −113 | 19 |
| David Pabst | 1562 | 1522 | −40 | 11 |
| Jacklyn Wyszynski | 1557 | 1514 | −43 | 9 |
| Eliana Crew | 1485 | 1497 | +11 | 13 |
| Krzysztof Radziszeski | 1473 | 1476 | +3 | 25 |
| Dane Sabo | 1290 | 1449 | +159 | 25 |
Observations
The Rating Spread Compressed
The old system spread players across 361 rating points. The new system compresses them into 89 points. This makes sense—we’re a recreational group, not chess grandmasters. The new system rates us fairly within a tighter band.
The Winners
- Dane Sabo: +159 points. The old system penalized him for losses with weaker partners. The effective opponent formula gives credit for “carrying.” (Purely coincidental that I benefit from my own math.)
- Eliana Crew: +11 points
- Krzysztof Radziszeski: +3 points
The Losers
- Andrew Stricklin: −113 points. Still ranked #1, but the old system over-credited wins with strong partners.
- Jacklyn Wyszynski: −43 points
- David Pabst: −40 points
A Note on Conflicts of Interest
You may notice that the system designer (me) is also the biggest beneficiary of the new ratings, gaining a convenient 159 points.
I want to assure you this is purely coincidental and the result of rigorous mathematical analysis, not at all influenced by the fact that I was tired of being ranked last.
The new formulas are based on sound theoretical principles that just happen to conclude I was being unfairly penalized all along.
Trust the math. 😉
Why This System Works
For a small league:
- Simple to understand (one rating per player)
- Fair to individual skill (per-point scoring)
- Respects partnership (effective opponent formula)
- Transparent (you can calculate rating changes yourself)
- Fast convergence (5-10 matches to stabilize a rating)
The bottom line: Your rating now reflects your true skill more accurately than before. Even if it means Dane finally looks respectable.