How to estimate your FIDE rating (conversion formula inside)

To reiterate: I don't think it's reasonable to expect that the formula will be "spot-on" for every individual. On average, though, it should be pretty accurate.

The whole exercise was prompted by the fact that many people don't have access to FIDE-rated tournaments and can only play online. There are good reasons for those people to get a ballpark estimate of their FIDE rating; for instance, several books are explicitly pitched at a certain level.

This is just a tool for these people to estimate their FIDE rating (give or take a few points), based on the available data. No more, no less. In that respect, I think it provides a valuable service, even if it's obviously not perfect.

The formula is definitely not "meaningless". Whether it's useful to you as an individual is a different thing altogether. Feel free to ignore it if you don't care about the result.

@dudeski_robinson writes:

"The whole exercise was prompted by the fact that many people don't have access to FIDE-rated tournaments and can only play online. There are good reasons for those people to get a ballpark estimate of their FIDE rating; for instance, several books are explicitly pitched at a certain level.

This is just a tool for these people to estimate their FIDE rating (give or take a few points), based on the available data. No more, no less. In that respect, I think it provides a valuable service, even if it's obviously not perfect."

mdinnerspace asks:
So the whole point of downloading every game ever played on Lichess, using a week's computer time 24/7 to search and verify every member who "volunteered a FIDE rating", and comparing it to their online blitz rating, was to create a formula that predicts a rating approximately 100 points below their online rating?

I coulda told you that. Saved yourself all the effort. +/- 100 is within a reasonable approximation.

Bold prediction though, that your formula is "within a few points".
Where is the empirical evidence of cases in point? Actual examples of players playing OTB for the first time, and their results compared to their online ratings? Would this not be more scientific than volunteered profile ratings?

You present no such data. Only a speculation that what is assumed to be true can be proven by graphs and statistics, which are easily fabricated to fit expected results. Very common in your field, as you're well aware.

You have yet to discuss the "flaw" I have pointed out in the formula.
Which is that the calculated values diverge progressively.
The lower the online rating, a progressively higher value is estimated for the FIDE.
The higher the online rating, a progressively lower value is estimated for the FIDE.
The center point on a graph is very close to a 1790 online rating.
The formula is not valid. It progressively skews results outward, resulting in no consistent results.

@mdinnerspace

What have I "fabricated", exactly?

The whole point of the exercise was to find the most precise formula possible, based on the available data. I have done that, and I have demonstrated that, based on 2000+ Lichess profiles, my formula is a fair bit more precise than simply subtracting 100 points from someone's online rating (see my post on mean squared error).

Look, I'm under no illusion that this is world-changing. I did this for fun, the computer ran for a week, but I automated it and the amount of work necessary was actually very small. I spent more time arguing with you than preparing and writing the original post (now THAT was a waste of time).

Now you seem to agree that the formula is right (on average), but you claim that it is imprecise. Fair enough. The imprecision is determined by the vertical spread of the dots in the original scatter plot. I can't do anything about this: the data's the data.

The fact that you keep arguing about the weights of the formula or the statistical approach doesn't bother me (I know that you don't understand basic statistics, so there's no point in arguing further).

What kind of bugs me, however, is when you keep implying that the data were somehow tampered with to justify some theory (what's my interest in cheating, here?). To be honest, I also don't care much for your recurrent use of ridiculously strong words like "fabricated", "speculation", "meaningless", "useless".

This was done for fun. I thought it was interesting. Many others also thought it was interesting. If you don't think it is interesting, that's OK too. Just don't be an ass about it.
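
For readers unfamiliar with the mean squared error comparison mentioned in this post, here is a minimal sketch of how two prediction rules can be scored against known FIDE ratings. The ratings below are toy numbers invented purely for illustration, not taken from the Lichess dataset, and the two rules compared are "online minus 100" and "no adjustment at all", not the formula from the original post. The function and variable names are made up for this example.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and true values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Toy data, invented for illustration only (not from the real dataset).
online = [1500, 1800, 2100]
fide = [1520, 1700, 1950]

# Rule A: subtract 100 points from the online rating.
minus_100 = [r - 100 for r in online]
# Rule B: assume the FIDE rating equals the online rating.
unchanged = list(online)

mse_minus_100 = mean_squared_error(minus_100, fide)
mse_unchanged = mean_squared_error(unchanged, fide)

# The rule with the lower value fits these particular numbers better.
print(f"minus 100: {mse_minus_100:.1f}")
print(f"unchanged: {mse_unchanged:.1f}")
```

The same function can score any other rule: build its list of predictions and compare the resulting number with the ones above. A lower value means smaller errors on average, with large misses penalized more heavily than small ones.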

Answer to your question #1 by @mdinnerspace:

The ideal research design for me would be to know the real names of all Lichess players, and to match them against the FIDE database to know their real ratings. The fact that they're first-time OTB players is irrelevant.

But yeah, the research design is not absolutely perfect here (as I acknowledged clearly in the original post). I just did the best possible with the available data.

At an online rating of about 1790, the formula predicts approximately the same FIDE rating.
As ratings go lower (refer to previous posts for calculations):
A 1600 online rating results in an approx. 1650 FIDE (+50)
A 1200 online rating results in an approx. 1300 FIDE (+100)
A 600 online rating results in an approx. 750 FIDE (+150)

Ratings above 1790 go in the other direction:
A 1900 online rating results in an approx. 1850 FIDE (-50)
A 2200 online rating results in an approx. 2100 FIDE (-100)
A 2700 online rating results in an approx. 2550 FIDE (-150)

The "formula" is flawed.
It is telling us your FIDE rating will be higher if you're lower rated and lower if you're higher rated.
What evidence supports this?
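
One way to read the figures in this post: they are what a straight-line fit with a slope below 1 looks like. The sketch below fits an ordinary least-squares line through the six rounded (online, FIDE) pairs quoted above. It is only an illustration built from those rounded numbers; it is not the formula from the original post, and `fit_line` and `crossover` are names made up here.

```python
# (online rating, approximate predicted FIDE rating) pairs quoted in the post.
points = [
    (600, 750), (1200, 1300), (1600, 1650),
    (1900, 1850), (2200, 2100), (2700, 2550),
]

def fit_line(pairs):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(points)

# Rating at which the fitted line predicts FIDE == online (solve x = slope*x + intercept).
crossover = intercept / (1 - slope)

print(f"slope={slope:.3f} intercept={intercept:.1f} crossover={crossover:.0f}")
```

On these rounded pairs the slope comes out below 1, so the fitted line sits above the online rating at the low end and below it at the high end, with a single crossover point in between. That is the pattern listed in the post.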

Answer to your question #2 by @mdinnerspace:

You are worried that my formula will act funny at very high or very low ratings. This is easy to assess empirically.

  1. Consider the profiles for all 95,730 players who played at least 50 classical and at least 50 blitz games on Lichess.
  2. Calculate the predicted Fide rating for each of those players using my formula.
  3. Plot those predictions against the players’ actual/observed Classical ratings.
  4. Do the predictions look crazy or pretty stable?

Results: https://imgur.com/a/jdXMs

To my eye, the predictions look remarkably stable across the sample of Lichess profiles. Sure, you can make up crazy examples. But when you consider actual combinations of ratings from actual players on Lichess, the predicted ratings look non-crazy and pretty stable across skill levels.
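
The four steps in this post can be sketched roughly as follows. Everything here is an assumption made for illustration: the record fields (`classical`, `blitz`, `classical_games`, `blitz_games`), the function names, and the three-line toy player list are invented, and the `predict` argument is a placeholder for whatever formula is being checked. The example call uses "blitz minus 100" as a stand-in, not the formula from the original post. Instead of plotting, the sketch averages the predictions within bands of classical rating.

```python
from collections import defaultdict

def summarize_predictions(players, predict, min_games=50, band_width=200):
    """Average predicted FIDE rating per band of observed classical rating.

    Only players with at least `min_games` classical AND blitz games are kept.
    """
    bands = defaultdict(list)
    for p in players:
        if p["classical_games"] >= min_games and p["blitz_games"] >= min_games:
            band = (p["classical"] // band_width) * band_width
            bands[band].append(predict(p["classical"], p["blitz"]))
    return {band: sum(vals) / len(vals) for band, vals in sorted(bands.items())}

def blitz_minus_100(classical, blitz):
    """Stand-in predictor for the demo; NOT the formula from the original post."""
    return blitz - 100

# Toy records, invented for illustration only.
players = [
    {"classical": 1650, "blitz": 1600, "classical_games": 80, "blitz_games": 300},
    {"classical": 1700, "blitz": 1500, "classical_games": 60, "blitz_games": 55},
    {"classical": 2100, "blitz": 2050, "classical_games": 120, "blitz_games": 900},
    {"classical": 1900, "blitz": 1850, "classical_games": 10, "blitz_games": 400},  # filtered out
]

summary = summarize_predictions(players, blitz_minus_100)
for band, avg in summary.items():
    print(f"classical {band}-{band + 199}: mean predicted FIDE {avg:.0f}")
```

If the per-band averages rise smoothly with the classical rating, the predictions are behaving in the "stable" way described above; wild jumps between neighbouring bands would point to a problem.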

And by the way, nothing says that the gap between Lichess ratings and Fide ratings should be exactly constant at every level of skill. It may be that the different pools of players at OTB tournaments and online make it so that ratings are more de/inflated for high/low-rated players. So the fact that the formula produces different gaps in different parts of the distribution is not a problem at all. It just reflects the empirical/observed features of the dataset at hand.

@dudeski_robinson
I in no way suggest any tampering with the evidence.
I only suggest that the evidence presented, factual as it is, is not the entire evidence and should not be construed as the only criterion supporting your premise.

@mdinnerspace Good, now please look at my last 2 posts, look at the graph. You'll see that your concern about extreme values is empirically unfounded.

By the way, we had the same discussion in our German forum: Elo vs. DWZ (the German "Elo"), with 20,000 entries considered. A quadratic fit turned out to be best, although the relationship looked linear at first glance.

Best fit was ELO = 0.0000886⋅DWZ^2 + 0.515⋅DWZ + 674

So the same procedure, with more pros than cons.

PS: here is one post with the raw data plots:

https://www.schachfeld.de/threads/6246-dwz-wieviel-elo/page3?p=486589&viewfull=1#post486589

(not sure if the link works; it is post #54, 24.10.2017, 17:44)
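
To make the quadratic fit quoted in this post concrete, here is a small sketch that evaluates it for a few DWZ values. The coefficients are the ones given above, written with decimal points; the function name `dwz_to_elo` and the sample inputs are chosen here for illustration.

```python
def dwz_to_elo(dwz: float) -> float:
    """Quadratic best fit quoted from the German forum thread."""
    return 0.0000886 * dwz ** 2 + 0.515 * dwz + 674

for dwz in (1000, 1500, 2000):
    print(f"DWZ {dwz} -> Elo {round(dwz_to_elo(dwz))}")
```

As with the Lichess formula, the output is an average estimate for a given rating, not a guarantee for any individual player.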

This topic has been archived and can no longer be replied to.