How to estimate your FIDE rating (conversion formula inside)

@krasnaya

I'm not exactly sure which standard error you would like me to calculate.

What I can tell you now is that if you only look at profiles for people who have played at least 50 blitz and 50 classical games (over 95,000 observations), the correlation between blitz and classical ratings is 0.89.

If I regress blitz on classical ratings using ordinary least squares, the residual standard error is about 122.
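For readers who want to reproduce this kind of summary on their own data, here is a minimal sketch of both calculations in plain Python. The function names and the five rating pairs are invented for illustration; they are not the Lichess sample, so the printed values will not match the 0.89 and 122 quoted above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ols_residual_se(xs, ys):
    """Fit y = a + b*x by ordinary least squares.

    Returns (a, b, residual standard error), where the residual standard
    error is sqrt(RSS / (n - 2)).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, math.sqrt(rss / (n - 2))


# Made-up toy ratings, NOT the data discussed in the thread.
classical = [1500, 1650, 1800, 1950, 2100]
blitz = [1380, 1470, 1690, 1760, 1980]

r = pearson_r(classical, blitz)
# "Regress blitz on classical": blitz is the outcome, classical the predictor.
intercept, slope, resid_se = ols_residual_se(classical, blitz)
print(f"r = {r:.3f}, residual SE = {resid_se:.1f}")
```

The residual standard error is in rating points, so it can be read as the typical miss when predicting blitz from classical alone.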

dudeski_robinson said:

I'm not exactly sure which standard error
you would like me to calculate.

OK, I could have said it more clearly: if blitz playing strength and classical playing strength are completely coupled, the difference between the two ratings should be either (near) zero or at least some constant (like "blitz rating + X = classical rating").

If you calculate the real difference of these two values and the standard deviation of these differences, you get a measure of how scattered the distribution is, and a verification or falsification of the underlying concept of "strength uniformity" versus "strength disparity". No?

Note that if you take the normal (quadratic) standard deviation you'd lose the fact that some players have a higher blitz rating than classical rating and for some it is the other way round.

E.g. A: b=1900, c=2000 ; B: b=1900, c=1800
Quadratic \sigma would be 100, but in fact the distribution is in the range of 200 points around the blitz rating.

krasnaya
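The two-player example in the post above can be checked directly. This is only a sketch of one reading of it: it takes the signed difference classical minus blitz for players A and B, then computes the mean, the population standard deviation, and the total spread. The variable names are made up for illustration.

```python
import statistics

# (blitz, classical) for the two players in the example above.
players = {"A": (1900, 2000), "B": (1900, 1800)}

# Signed differences keep the direction: +100 for A, -100 for B.
diffs = [classical - blitz for blitz, classical in players.values()]

mean_diff = statistics.fmean(diffs)   # average signed gap
sd_pop = statistics.pstdev(diffs)     # population standard deviation
spread = max(diffs) - min(diffs)      # distance between the two extremes

print(diffs, mean_diff, sd_pop, spread)
```

On this reading the population standard deviation is 100 and the distance between the two players' differences is 200, which matches both numbers in the post. The sign information is carried by the individual differences and their mean, not by the standard deviation itself.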

@krasnaya Here it is:

average_gap = mean(classical_rating - blitz_rating) = 150.38
std_dev(classical_rating - (blitz_rating + average_gap)) = 125.29

So you're right, there's still a fair amount of variation. The relationship between the two is not perfect.

However, I work with social science data all the time, and I can tell you without a doubt that correlations of more than 0.89 rarely pop up in real life.

Is the relationship between blitz and classical skills perfect? No, obviously not. Is it very strong? Absolutely.
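The two kinds of summary quoted in this exchange, a correlation and a standard deviation of the rating difference, are linked by a standard identity for the variance of a difference. As a sketch, write C for the classical rating, B for the blitz rating, and \rho for their correlation. These symbols are introduced here only for illustration; they do not appear in the original posts.

```latex
\operatorname{Var}(C - B) = \sigma_C^2 + \sigma_B^2 - 2\rho\,\sigma_C\,\sigma_B

% Special case, assuming both rating pools have the same spread
% (\sigma_C = \sigma_B = s); the thread does not report either value:
\sigma_{C-B} = s\,\sqrt{2\,(1 - \rho)}
\qquad\text{so for } \rho = 0.89:\quad \sigma_{C-B} \approx 0.47\,s
```

So a high correlation and a three-digit standard deviation of the gap are compatible: how large the scatter looks in rating points depends on how spread out the ratings themselves are.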

Blitz and classical ratings cannot be compared on Lichess without knowing the games played. Classical is defined as any game of 8+ minutes. Blitz games can be 5+increment, often lasting longer.

krasnaya stated the point better than I:
Any formula that pretends to predict an outcome whereby all the data is not represented is false.
The OP even suggests his formula is not accurate within a wide margin. At other times he proclaims its accuracy within a few points, in defense of all his time and trouble.
The question begs to be answered: why the effort, if not to make an accurate prediction?
The premise that an online rating correlates with the expected rating of a player who is NEW to OTB play is false.
Statistics can show a "median" of online ratings and ESTABLISHED OTB ratings, but the statistics prove nothing when attempting to predict what a player's OTB rating will be based on fast time control games online.

The OP has yet to address the issue regarding his "formula".

That issue is:

Its predicted OTB rating is (relatively) higher for lower-rated online players, and at the same time the formula predicts a (relatively) lower rating for higher-rated online players. The problem is that the effect is progressive: the lower the online rating, the higher the predicted OTB rating relative to it. The same applies at the other end: the higher the online rating, the lower the predicted OTB rating relative to it.

This discrepancy starts at approx. 1800.

Clearly the fault lies in the +187 constant. A fact that few seem to realize. A faulty mathematical formula.

@mdinnerspace

I have addressed this specific question in these two posts:

https://lichess.org/forum/general-chess-discussion/how-to-estimate-your-fide-rating-conversion-formula-inside?page=20#196

https://lichess.org/forum/general-chess-discussion/how-to-estimate-your-fide-rating-conversion-formula-inside?page=20#197

Does not address the question regarding the formula.
A constant number (+187) and its application.
A poor understanding of mathematics for a statistics instructor.

It is evident that, for ratings above or below an online rating of 1800, the "formula's" prediction of an OTB rating takes a steady course: higher (for lower-rated online players) and lower (for higher-rated online players). The formula's predictions are progressive: as the ratings approach 0 and 3000, the effect grows.

@mdinnerspace

Once more unto the breach.

There are good questions in your posts, but my answers in #196 and #197 did specifically address your points about the formula. Let me try to clarify.

You raised a perfectly valid point/worry in your posts: If the relationship between Fide and the classical+blitz ratings is non-linear, then the model could produce "crazy" results for very high or very low ratings. To illustrate this, you created extreme examples of combinations of Lichess ratings for blitz and classical, and showed that the predicted results were strange.

This is a perfectly good point to raise, as it could potentially cause important problems. After all, the regression model (i.e., formula) is a linear approximation. Thankfully, it's a criticism we can assess empirically.

In post #196, I addressed your numerical counter-examples. Specifically, I showed that your extreme examples were not realistic, and therefore not relevant to the evaluation of the model. For example, there are no observed Lichess ratings below 700 in the sample. Moreover, I calculated predicted Fide ratings for all actually observed rating combinations on Lichess, and showed that the formula did not behave in a crazy fashion, as you claim. For most of the actually observed combinations of blitz and classical ratings, the average gap between classical and Fide remains quite stable (though not totally constant; more on this below). For real-world players (not made-up examples), there is essentially no "flipping" going on like in your extreme, contrived, and unrealistic examples.

Your second critique was that the average gap between observed Classical and predicted Fide was not constant. In post #197, I addressed this non-constant gap argument explicitly. In short, when you have a regression model with TWO independent variables (i.e., blitz and classical ratings), the gap between the dependent variable (i.e., self-reported Fide rating) and ONE of the explanators (i.e., classical ratings) need not be constant across the sample. This is a feature of the underlying mathematical model, and it is not a problem. It is totally expected in this case (this was the answer I already gave you in post #197).

This graph suggests that:

  • At the 2000 classical level, Lichess ratings are inflated by about 200 points relative to Fide
  • At the 1500 classical level, Lichess ratings are inflated by about 100 points relative to Fide

https://imgur.com/a/jdXMs

That's interesting, no?

I feel like we're making progress :)
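The non-constant gap described in the post above follows from the algebra of a two-predictor linear model, and can be sketched in a few lines. Only the intercept of 187 comes from this thread (the "+187 constant"). The two slope values below are placeholders chosen for illustration, not the coefficients of the original formula, and the function names are hypothetical. The blitz offset of 150 is the average classical-minus-blitz gap of 150.38 reported earlier, rounded.

```python
def predicted_fide(classical, blitz, intercept=187.0,
                   w_classical=0.4, w_blitz=0.5):
    """Linear prediction: intercept + w_classical*classical + w_blitz*blitz.

    The default slopes are illustrative placeholders, not fitted values.
    """
    return intercept + w_classical * classical + w_blitz * blitz


def gap_to_classical(classical, blitz, **coefficients):
    """Classical rating minus the predicted FIDE rating."""
    return classical - predicted_fide(classical, blitz, **coefficients)


# Assume a typical player whose blitz rating sits 150 below classical.
BLITZ_OFFSET = 150

for classical in (1500, 2000):
    blitz = classical - BLITZ_OFFSET
    gap = gap_to_classical(classical, blitz)
    print(classical, round(gap, 1))
```

With blitz fixed at classical minus a constant, the gap works out to (1 - w_classical - w_blitz) * classical plus a constant, so it stays flat only when the two slopes sum to exactly 1. Any other pair of slopes makes the gap drift with rating, which is the behaviour both sides of this exchange are describing.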

A given formula of...
x and y (averaged) + z (any number used as a constant) = D is invalid.

This topic has been archived and can no longer be replied to.