
Rating System Broken.

#69 Your concern was "Why does Stockfish make random moves which cause mistakes?" and I am developing automated tests which can detect mistakes in case you have some fear of me releasing low-quality software.

To be clear, I am not attempting to make SF play like a human, even at low levels. If players want to play against an opponent which makes human-like mistakes, there are human opponents available. I have spent decades trying to make AIs which emulate human behavior and made zero progress, while ChessBase co-founder Frederic Friedel claims (on the Perpetual Chess Podcast) that this is one of the most difficult problems in chess AI development. There is no need to hijack this forum thread, but I'll do so anyway:

The official SF team focuses on improving the performance of Stockfish at its strongest Skill Level setting. Doing so requires that moves be pseudo-randomly selected (else a sufficiently powerful AI could "read SF's mind" by running or emulating the same deterministic code). To produce the lower Skill Level values without degrading the strongest level, this randomization is simply increased. These are likely the factors the official SF team considered when producing lower-skill AI levels, although if you want to ask them about their design choices, which I had no part in, they have a mailing list, a Google Group, a GitHub issue tracker (including comments on both closed issues and pull requests), and a Discord server.
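To make "randomization is simply increased" concrete, here is a minimal sketch of the general idea. It is not Stockfish's actual code: the function name `pick_move`, the `noise_cp` scale, and the linear weakness formula are all assumptions chosen for illustration.

```python
import random

def pick_move(candidates, skill_level, max_level=20, noise_cp=200, rng=None):
    """Pick a move from (move, score_cp) pairs, higher score = better.

    Hypothetical illustration only: a random bonus, scaled by how far the
    skill level is below the maximum, is added to each score before the
    best-scoring candidate is chosen.
    """
    rng = rng or random.Random()
    # 0.0 at full strength, 1.0 at skill level 0 (assumed linear scale).
    weakness = (max_level - skill_level) / max_level

    def noisy_score(item):
        _move, score_cp = item
        return score_cp + rng.uniform(0, noise_cp * weakness)

    return max(candidates, key=noisy_score)[0]

# At the top level the noise range is zero, so the best-scored move wins.
print(pick_move([("e4", 30), ("a3", -40)], skill_level=20))
```

At lower levels the weaker candidate is chosen some of the time, which is the kind of "random move which causes mistakes" discussed above.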

@HellevatorOperator I love how you edited your post after I responded in an attempt to pre-empt and invalidate my response hahahah. "That's garbage maths" you say without any evidence. It is very precise math. It's easy to work out percentages when your wins, losses, and point gains or losses are listed clearly in the activity on your profile.

The math is simple: I played ten games with 8 wins, and lost points. That's an 80% win rate. I was having to hit 90% to start winning points; 85% was not enough. And in 50% of the games I started with a 2 point disadvantage in the evaluation, which any player who knows 1 opening can extend to 4 points, even against Stockfish. How can you possibly win 90% when you start half your games with a 2-4 point disadvantage in the evaluation?!

You say: "There are plenty of people above you who play longer than bullet time controls in 3 check, and play the same time controls as you." A few of them might play me, but they get the vast majority of their points from hyperbullet to 3 min games. Can you tell me one rapid 3 check player in the top 50 who mostly plays rapid games? I would be glad to challenge them.
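For readers following the percentages, a rough worked example may help. Lichess rates games with Glicko-2, which also tracks rating deviation; the sketch below uses the simpler classical Elo expected-score formula as an approximation, so the numbers are indicative only. The function names are made up for this example.

```python
import math

def expected_score(rating_diff):
    """Elo expected score for a player rated `rating_diff` above the opponent."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400))

def rating_diff_for(expected):
    """Inverse: the rating gap at which `expected` is the break-even score."""
    return 400 * math.log10(expected / (1 - expected))

# Break-even score of 90% corresponds to a gap of roughly 382 points,
# and 80% to roughly 241 points (Elo approximation, not Glicko-2).
print(round(rating_diff_for(0.90)), round(rating_diff_for(0.80)))
```

Under this approximation, a player who averages about 380 points more than their opponents needs to score around 90% just to hold their rating, so an 80% result against that field would lose points.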

#53 Many years ago I made the same blunder. Rating estimates measure performance, not skill.

#82 No high-rated player, yourself included, plays Three-Check at a rapid time control. Why are you asking us a question which you can answer?
https://lichess.org/games/search?ratingMin=2000&perf=15&durationMin=600&dateMin=2020-06-01&sort.field=d&sort.order=desc#results

Hey AceN00b, keep complaining that you have to play half your games as black. If you can't accept your rating is what it is, nobody can do it for you.

@Toadofsky 5 minutes + 17 seconds is not a rapid time control? In fact, it is. Almost all my 3000+ games were at least 5+15. The nearest comparable time control played in 3 check on the list is 5 minute or 4+4.

Here is a whole conversation of Acenoob complaining about the same things (unfair rating system, not getting the results he wants) with a different audience.

https://lichess.org/forum/lichess-feedback/unacceptable-time-out

He will not stop until he gets what he wants out of people - the recognition that he is this unspeakably misunderstood and brilliant 3 check player, who is unfairly and harshly undone by the rating system, who would be 2100 or better if it wasn't for those darn calculations. Instead of just a good 3 check player who loses sometimes, and doesn't like losing.

That's all that he is. And this will never stop.

Let's just go back to playing and this guy can go back to thinking that life is unfair and he doesn't have the rating he deserves. The system may be imperfect but it's not imperfect in the way you want it to be. It never was.

#87 For the majority of players that's fine; but as a staff member I try to respond to issues, even when players have ill-formed arguments (and do the common "customer" thing of conflating complaints together in the hope of drawing attention and acting all surprised when I provide critical feedback).

#86 I'll try this query again (for 5+5 and slower) later:
https://lichess.org/games/search?ratingMin=2000&perf=15&clock.initMin=300&clock.incMin=5&dateMin=2020-06-01&sort.field=d&sort.order=desc#results

There may be some merit to creating separate rating categories per TC for variants, but I've already asked about this several times and gotten a very hard "pass" from the entire team each time due to lack of interest. I've already shared my proposed improvements to the rating system numerous times in fora (including one to account for first-player advantage), and irrespective of how valid or invalid the claims are, I don't think there's anything further to discuss here (other than retrying the 5+5 and slower query when it finds those 3000+ games and hopefully, but doubtfully, some others).
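For anyone who wants to retry these searches, the two query URLs in this thread differ only in their time-control filter. A small sketch that rebuilds them from the parameters shown above follows; the helper name `search_url` is hypothetical, and the meaning of each parameter (for example `perf=15` selecting Three-Check) is inferred from the posts, not from any documentation.

```python
from urllib.parse import urlencode

BASE = "https://lichess.org/games/search"

def search_url(time_filter):
    """Build a game-search URL; `time_filter` holds the time-control parameters."""
    params = {"ratingMin": 2000, "perf": 15}
    params.update(time_filter)
    params.update({"dateMin": "2020-06-01", "sort.field": "d", "sort.order": "desc"})
    return f"{BASE}?{urlencode(params)}#results"

# Query from #82: games lasting at least 600 seconds.
print(search_url({"durationMin": 600}))
# Query from #86: clock of 5+5 or slower.
print(search_url({"clock.initMin": 300, "clock.incMin": 5}))
```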

@Toadofsky #81

Why not try another approach?

For instance, ignore SF's suggested move if it is actually a blunder; a second call will then return another, "normal" move for the desired level. This way nothing in the SF repository requires a change, only the front-end app.
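A minimal sketch of that retry idea, assuming the front end can ask the engine for a move and can also get an evaluation of the position after a given move. Both callables, the 300-centipawn threshold, and the retry limit are hypothetical placeholders, not part of any real Stockfish or Lichess API.

```python
def choose_non_blunder(get_engine_move, evaluate_after, best_eval_cp,
                       max_drop_cp=300, max_tries=5):
    """Ask for moves until one does not lose more than `max_drop_cp`.

    get_engine_move: hypothetical callable returning the engine's (randomized) move.
    evaluate_after:  hypothetical callable returning the evaluation in centipawns
                     (from the mover's point of view) after playing that move.
    best_eval_cp:    evaluation of the best available move, for comparison.
    """
    last_move = None
    for _ in range(max_tries):
        last_move = get_engine_move()
        if best_eval_cp - evaluate_after(last_move) <= max_drop_cp:
            return last_move  # acceptable: not a blunder by this threshold
    return last_move  # give up and play the last suggestion

# Example with canned data: the first suggestion hangs a queen, the second is fine.
suggestions = iter(["Qxh7", "Nf3"])
evals = {"Qxh7": -850, "Nf3": 20}
print(choose_non_blunder(lambda: next(suggestions), evals.get, best_eval_cp=40))
```

One likely side effect is that filtering out the worst moves would make a level play somewhat stronger than its label suggests, so the levels might need recalibrating.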

@Toadofsky There is a thriving 10 minute + 10 seconds league in crazyhouse organized by Okei for 7 seasons with 30-40 players. I'm sure they would appreciate separate rating categories per TC for variants too. If the categories exist, more people will be attracted to play. I did not conflate complaints, as I told you before. There are several different complaints to be made. Forming an overall critique of disparate strands is not conflation.

This topic has been archived and can no longer be replied to.