Do Stronger Players Play More Accurately? I Analyzed 35,000 Games to Find Out

Nice post, thank you @lucb3. One little note: Lichess does not use Elo, it uses Glicko-2. You can read more about Lichess and Glicko-2 here: lichess.org/page/rating-systems

An interesting part from that page:

"Why don't they all use the same rating system?

Because the first rating system historically used, Elo, is pretty bad. Glicko-1, then Glicko-2, made considerable improvements, and offer greater accuracy. Even if everyone would use the same rating system, ratings would still not be comparable across different pools of players, so there's no need for chess servers to pay the cost of legacy forever, and they moved on to superior systems."

lucb3

Hey @achja,

Thanks so much for your message and for the helpful clarification! You’re absolutely right—there’s a big difference between Elo and Glicko‐2. In everyday chat everyone still says “Elo” when they mean their Lichess rating, and if I’d written “correlation between Glicko‐2 and precision,” I’d have lost half my readers. I’ll add a little note at the end of the article to point this out. Thanks again!

achja

That explanation makes sense. And thank you for adding a little note, @lucb3.

mkubecek

1. You don't provide an exact definition of "precision"; is it the same as "accuracy" as shown by lichess or is it calculated in a different (even if likely similar) way?

2. The weakest point I see is that you present only the resulting relation between rating and precision but no measure of the reliability of those results. In my experience, accuracy tends to vary a lot, not only between different games of the same player but also between players of the same rating level, depending on their playing style. E.g. the lichess accuracy (not sure if it's the same as what you call "precision", see above) tends to harshly penalize risky and sharp tactical play and favors a slow, risk-free style; it also penalizes big mistakes harshly and is much more tolerant of getting outplayed slowly in small steps. Therefore I would expect the precision to vary a lot among players of the same rating level - but unfortunately you only present the mean (?) value, without any information about the standard deviation or the actual distribution of precisions.

lucb3 edited

Thank you very much @mkubecek for taking the time to read the article — I really appreciate it and hope you enjoyed it.

1. Regarding your first point: yes, the term "precision" refers to the same percentage shown in Lichess’s post-game analysis. It measures how closely a player’s moves align with the best choices suggested by Stockfish. So, to clarify, it's exactly the same accuracy metric used by Lichess, derived directly from the engine’s evaluation.

2. You raise a very valid concern. Individual precision can indeed vary a lot, not only between games but also among players with the same rating, especially depending on playing style. Tactical, risky, or sharp play is often penalized more by the engine than solid, risk-averse styles, which tend to result in higher precision scores.

That said, the goal of the article isn’t to evaluate individuals. Of course, it would be wrong to say, “I played like a 2200 because I had 92% precision in this one game.” But when we aggregate data across tens of thousands of games, a clear trend emerges: the higher the Elo, the more "engine-like" the average play becomes.

In other words, the study suggests that, on average, stronger players make fewer sharp or speculative moves, and instead play in a way that more closely resembles the computer’s recommendations. It’s not that sharp, creative play can’t succeed — of course it can — but it becomes less common as rating increases. The overall strategy that seems to correlate most with improvement is, simply put, to play more accurately.

As for variance or distribution of precision within each rating group — you're absolutely right, that would be a valuable dimension to explore in a follow-up analysis. But for this study, I wanted to keep the focus on the overall trend, which I think already gives us an interesting insight into how playing style evolves with rating.
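For anyone who wants to look at that spread in the meantime, here is a minimal sketch of how it could be computed. It is not the code behind the article: the function name `spread_by_rating`, the 200-point bin width, and the sample numbers are all made up for illustration, and it assumes you already have one (rating, accuracy) pair per player per game.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def spread_by_rating(games, bin_width=200):
    """Summarize accuracy per rating bin.

    games: iterable of (rating, accuracy) pairs (hypothetical input format).
    Returns {bin_lower_bound: {"n", "mean", "median", "stdev"}}.
    """
    bins = defaultdict(list)
    for rating, accuracy in games:
        lower = (rating // bin_width) * bin_width  # e.g. 1530 -> 1400
        bins[lower].append(accuracy)

    summary = {}
    for lower in sorted(bins):
        values = bins[lower]
        summary[lower] = {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "stdev": pstdev(values),  # population standard deviation
        }
    return summary


# Made-up numbers, for illustration only.
sample = [(1450, 71.0), (1480, 88.0), (1530, 64.0), (1820, 90.0), (1890, 82.0)]
result = spread_by_rating(sample)
for lower, stats in result.items():
    print(f"{lower}-{lower + 199}: n={stats['n']} "
          f"mean={stats['mean']:.1f} sd={stats['stdev']:.1f}")
```

Reporting the standard deviation (or a box plot per bin) next to the mean would show how much the rating groups overlap.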

Thanks again for your thoughtful comment — I really enjoyed reading it.

TotalNoob69

Just a small question: in both your data analyses, did you use the rating the player had when playing the game or the current rating of the player?

lucb3

Good question @TotalNoob69 — I used the rating the player had at the time the game started.

aVague

#10

I don't think it would be the same with a 3+2 time control, or even 1 minute per game. In 10+ games you have plenty of time to think, so of course such a correlation would appear.
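One way to check this would be to compute the rating/accuracy correlation separately for each time control. A rough sketch follows, assuming the data is available as (speed, rating, accuracy) records. The speed labels, the function names, and the sample values are hypothetical and not taken from the article.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; None if either side has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def correlation_by_speed(games, min_games=3):
    """games: iterable of (speed, rating, accuracy); speed is any label."""
    ratings = defaultdict(list)
    accuracies = defaultdict(list)
    for speed, rating, accuracy in games:
        ratings[speed].append(rating)
        accuracies[speed].append(accuracy)
    # Skip groups that are too small to say anything about.
    return {
        speed: pearson(ratings[speed], accuracies[speed])
        for speed in ratings
        if len(ratings[speed]) >= min_games
    }


# Made-up records, for illustration only.
sample_games = [
    ("blitz", 1000, 60.0), ("blitz", 1500, 70.0), ("blitz", 2000, 80.0),
    ("bullet", 1000, 65.0), ("bullet", 1500, 60.0), ("bullet", 2000, 72.0),
    ("rapid", 1500, 80.0), ("rapid", 1800, 85.0),
]
by_speed = correlation_by_speed(sample_games)
for speed, r in by_speed.items():
    print(speed, r)
```

If the coefficient drops sharply for the faster controls, that would support the point above; if it stays similar, the trend is not just a side effect of having time to think.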