lichess.org

How to estimate your FIDE rating (conversion formula inside)

Bias is the tendency of a statistic to overestimate or underestimate a parameter. (A statistic describes a sample; a parameter describes the whole population.) Bias can seep into your results for a slew of reasons, including sampling or measurement errors, or unrepresentative samples.


Sampling error is the tendency for a statistic not to exactly match the population. Error doesn’t necessarily mean that a mistake was made in your sampling; Sampling Variability could be a more accurate name.


Alright. Instead of actually addressing my response, you're now down to quoting some Wikipedia article?

All the critiques you have marshalled so far have been debunked, so you fall back on vague assertions about "a slew of reasons", but you can't point to even one actual reason (beyond what was already acknowledged in the original post).

That rhetorical strategy is intellectually lazy.


Addressing your response:

Please give actual evidence (verifiable facts) that an online blitz player rated approximately 670 should expect a FIDE rating of 805.

This is not in line with known results. It is based on your "statistics", which are "biased": information given in profiles is voluntarily provided, not verified, and prone to exaggeration.


@mdinnerspace

As I explained in post #210, there is no need to provide this evidence, because it is completely irrelevant to model evaluation.

As I already made clear in the FIRST post, I agree that self-reports may not always be accurate. 200+ posts later, you still can't come up with an original and relevant critique.


No need? Irrelevant?
Observable evidence disputes the premise; the formula is shown to be inconsistent...

Quite the "scientific method" being used here.
CYA. Good luck in your endeavors.


@mdinnerspace

Read the post: It is irrelevant because there is not a single observed data point in the sample with a 670 Lichess rating. You are asking the formula to do something it was not designed to do.


In response to this, posted by dudeski_robinson:

You have presented no evidence to support your claim that
blitz and tournament play skills only show a "weak correlation".
In contrast, all the available evidence suggests that the
relationship is in fact very strong (with the usual "these are
self-reports" caveat).

First off, I have shown at least one publicly available counterexample for my claim. Yes, I don't have access to FIDE's rating database, and I therefore can't present any data generated from it.

I can, though, use myself as a test case: in my chess club I am at about 20th place in the club's rating list (DWZ, the German national Elo). Because we organise a monthly blitz tournament, we have a year-long championship where the best 6 results out of a possible 12 decide. Right now, after 10 of the 12 events this year, I am third, far better than my standing in the rating list would suggest.

And I can use what you said yourself about the precision of your formula: it could be 100 points off. Even if your formula works (which I neither contest nor confirm; I simply have no opinion about that), that means there is a 200-point range wherein the "real" rating lies.

Now, just suppose for a moment your formula gives "2000" as someone's projected rating. He plays against someone with an actual FIDE rating of 1900. The expectation to win will be (calculated using http://www.bobnewell.net/cgi/elop.pl):

1900: 0.50 // 0.50
2000: 0.64 // 0.36
2100: 0.76 // 0.24

As you see, the win expectancy varies quite a lot, depending on how far and in which direction your predicted rating is off from the real rating. What does it mean for a tournament result to end with either 4.5 points or 7.0 points in the usual 9-round event, hm?
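These expectancies follow from the standard Elo expected-score formula, which can be checked without the linked calculator; a minimal sketch:

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score for a player rated r_a
    against an opponent rated r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

# Reproduce the table: projected ratings against an opponent rated 1900.
for projected in (1900, 2000, 2100):
    e = expected_score(projected, 1900)
    print(f"{projected}: {e:.2f} // {1 - e:.2f}")
```

Running this prints the same three rows as above (0.50, 0.64, and 0.76 for the projected player).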

So, again: I think you put in a lot of work, and kudos for that. But the result is not really meaningful, because of, first, a lot of factors contributing to a player's OTB tournament strength which are simply not measured here and therefore do not influence your statistics, and, second, the fact that the resulting variation is too big.

Finally: you yourself made some equally unfounded assumptions, e.g. the decision to delete all published FIDE ratings divisible by 500 from your set. Why not the numbers divisible by 300? Do you think "2100" is any more exact a value than "2000"?

And at last: if you want to give me (as opposed to "the public") any answer, don't bother. I don't care for a discussion fought as aggressively as here (no, that is not solely your fault), and after writing this I am out of here. I come here for fun, and reading this isn't any.

krasnaya


@krasnaya

I agree with pretty much everything you say here. The formula isn't super precise. As I stated in the last sentence of my original post, it should only be taken as an "informed guess", based on the best available evidence.

Sure, the "divisible by 500" is completely arbitrary. I just noticed that there were suspiciously many of those, so I tried removing them in case that made a difference. I also tried many alternative approaches to remove outliers, but the results didn't seem super sensitive to the choices made there. But again, you're right: that's an obvious weakness of the dataset...
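The "divisible by 500" cleaning step amounts to something like the following sketch (the actual cleaning code is not shown in the thread, so the function name and structure here are illustrative assumptions):

```python
def drop_round_numbers(ratings, step=500):
    """Drop self-reported ratings that are exact multiples of `step`
    (e.g. 1500, 2000, 2500), which were over-represented in the sample
    and are likely rough guesses rather than real published FIDE ratings."""
    return [r for r in ratings if r % step != 0]

sample = [1843, 2000, 1512, 2500, 1977]
print(drop_round_numbers(sample))  # [1843, 1512, 1977]
```

The same pattern works for any alternative outlier rule (e.g. a different `step`), which is how one would test whether the fit is sensitive to this choice.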


After some hesitation I will be untrue to my word and add something here. I hope that, given the long time passed since the "discussion" here, the discussants have moved elsewhere and fight their fights there instead of here.

To find out whether longer and shorter time controls are as closely related regarding strength, you might do the following, given your access to the Lichess database: get all members with both a blitz and a classical rating and more than X (to be determined) played games at either time control. Calculate the standard deviation from this dataset. I guess it will not be as small as you might expect.
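The proposed check could be sketched like this (toy records only; the real query would run against the Lichess database, and `min_games` stands in for the "X to be determined"):

```python
import statistics

def rating_gap_spread(players, min_games=50):
    """For players with both a blitz and a classical rating and at least
    `min_games` rated games in each pool, return the sample standard
    deviation of the per-player (classical - blitz) rating difference."""
    gaps = [
        p["classical"] - p["blitz"]
        for p in players
        if p["blitz_games"] >= min_games and p["classical_games"] >= min_games
    ]
    return statistics.stdev(gaps)

# Toy data; field names are illustrative, not the Lichess schema.
players = [
    {"blitz": 1700, "classical": 1850, "blitz_games": 300, "classical_games": 120},
    {"blitz": 2100, "classical": 2050, "blitz_games": 800, "classical_games": 200},
    {"blitz": 1450, "classical": 1700, "blitz_games": 150, "classical_games": 60},
    {"blitz": 1900, "classical": 1905, "blitz_games": 40, "classical_games": 90},  # too few blitz games, filtered out
]
print(round(rating_gap_spread(players), 1))
```

A large standard deviation of this gap would support the objection that blitz rating alone pins down classical strength only loosely.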

And, I may reiterate, 90-120 min for 40 moves (typical OTB time controls in tournaments) is completely different from G/10 (which is regarded as "classical" here).

krasnaya


This topic has been archived and can no longer be replied to.