lichess.org

Rating comparison to other sites

THOSE NUMBERS DON'T MEAN ANYTHING! :D
The main difference, though, is that the average Elo rating can sometimes deviate significantly from 1200 (mostly because Elo does not calculate RD or volatility), while Glicko-2 is designed to average close to 1500 over time no matter what.
@5: Concerning the average ratings they display over there at chess.com, they're incorrect and out-of-date since the site raised the default level of a new account from 1200 to 1500. Those numbers are therefore deflated; if you add 300 points to each of the segments you'll reach a more truthful result.
@everyone: You can't just assume that the average player on one site is the same as the average player on another site. For example, it's likely that the average ICC player is stronger than the average Yahoo chess player because sites attract different players.

I'm saying it's an unknown whether lichess.org and chess.com attract the same players, so average ratings are meaningless.
#11, exactly. They're just arbitrary numbers. It's not like you can count your rating in a real-world currency.

Worst of all is when I see people comparing two different rating systems (e.g. FIDE's Elo and Lichess's Glicko-2) which were never meant to be compared without a proper mathematical recalculation of your rating. There is thus no sense in comparing them.
Technically, if you had sufficiently many ratings of people who play in both pools, you could do some complicated statistical analysis (which involves somehow renormalizing the data to be comparable, somehow correcting either one or both distribution curves to become similar in shape).
#16, that may not be enough. You would probably have to recalculate each person's rating for both systems to make it a good comparison. Which means recalculating the result of each game -- a mammoth task. Only then can there be a somewhat fair comparison for each site's skill level.
#17 Good point, this would be the best way to do a fair comparison (to recalculate one of the sites' ratings using the other site's formula) and it is a mammoth task.
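For what it's worth, the "recalculate with the other site's formula" idea can be sketched in a few lines if you pretend the target formula is plain Elo. This is only a rough illustration: the fixed K-factor, the shared starting rating, and the (white, black, score) game format are all my own assumptions, not how either site actually stores or rates games, and Glicko-2 itself is considerably more involved.

```python
from typing import Dict, Iterable, Tuple

K_FACTOR = 32.0        # assumption: one fixed K for everyone; real systems vary it
START_RATING = 1500.0  # assumption: every player starts from the same default

def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B under the Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def replay_with_elo(games: Iterable[Tuple[str, str, float]]) -> Dict[str, float]:
    """Replay games in chronological order and return plain-Elo ratings.

    Each game is (white, black, white_score), where white_score is
    1.0 for a white win, 0.5 for a draw, 0.0 for a black win.
    """
    ratings: Dict[str, float] = {}
    for white, black, white_score in games:
        r_w = ratings.get(white, START_RATING)
        r_b = ratings.get(black, START_RATING)
        e_w = expected_score(r_w, r_b)
        # Each player moves by K times (actual score minus expected score).
        ratings[white] = r_w + K_FACTOR * (white_score - e_w)
        ratings[black] = r_b + K_FACTOR * ((1.0 - white_score) - (1.0 - e_w))
    return ratings
```

The code is the easy part. The mammoth part is getting every game from a site, in order, and feeding it through.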

Now perhaps there's a way to mathematically manipulate the rating distribution curves to look similar in shape: for example, for both sites generate some "approximate Elo" rating which satisfies this criterion:

"The difference in the ratings between two players serves as a predictor of the outcome of a match. Two players with equal ratings who play against each other are expected to score an equal number of wins. A player whose rating is 100 points greater than their opponent's is expected to score 64%; if the difference is 200 points, then the expected score for the stronger player is 76%."
http://en.wikipedia.org/wiki/Elo_rating_system
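
The percentages in that quote come from the standard Elo logistic curve, E = 1 / (1 + 10^(-d/400)), where d is the rating difference. A minimal sketch to check them (the function name is just my own):

```python
def expected_score(rating_diff: float) -> float:
    """Expected score of a player rated `rating_diff` points above the opponent,
    using the standard Elo logistic curve with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

# Print the expected score for equal ratings, +100, and +200.
for diff in (0, 100, 200):
    print(diff, round(expected_score(diff), 2))
```

Equal ratings give 0.5, and differences of 100 and 200 give roughly 0.64 and 0.76, matching the quote.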

You wouldn't necessarily need to analyze all the games to derive such an *approximation* for both sites. Then, if you had players in both pools and you knew their approximate Elo ratings, you could draw a correlation, then finally re-fit the data to the sites' original rating systems.
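
As a rough idea of what "draw a correlation" could look like, here is a sketch that fits a straight line through the ratings of players who appear in both pools. Assuming the relationship is linear is my own simplification and the real one may well be curved. The numbers below are made up purely to show the mechanics; they are not real data from any site.

```python
from typing import Sequence, Tuple

def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired ratings")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x ratings are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical, invented ratings of the same four players on two sites.
site_a = [1200.0, 1400.0, 1600.0, 1800.0]
site_b = [1350.0, 1500.0, 1700.0, 1850.0]

slope, intercept = fit_line(site_a, site_b)

def a_to_b(rating_a: float) -> float:
    """Estimate a site-B rating from a site-A rating using the fitted line."""
    return slope * rating_a + intercept
```

With only a handful of shared players the fit would be very noisy, and it says nothing about players outside the overlap.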

And at the end of the day, you'll have functions for approximating a player's rating on one site based on a player's rating on the other site, but it'll be so complex that nobody can understand it! Which leads back to the original point: that comparing ratings across sites is virtually meaningless.
OP here. I agree that comparing ratings across sites is virtually meaningless. Maybe the more interesting question is how the strength of players compares from one site to another. Obviously lichess lacks the IM/GM representation, but I do wonder how our 25th, 50th, and 75th percentiles would compare to other sites.

I can't think of a way to measure it empirically, but I'm curious what opinions you all have.
To do it accurately you'd need data from players who play on both sites.

To estimate it, you could analyze games from both sites and apply Prof. Regan's research on Intrinsic Performance Ratings to estimate how skilled players are based on their moves alone. I'm curious if anyone has tried such a thing...
