Rating System Broken.

#89 Because I don't want to create extra work for myself?

The Skill Level parameter should do roughly the same thing for Multi-Variant Stockfish that it does in the official-stockfish project... if you want them to change what the Skill Level parameter does, ask them about it.

I don't know how the Lichess UI would know (without consulting Stockfish) whether a move is actually a blunder.

Toadofsky

#92

#90 That's good to hear, however this is a battle which cannot be won... there's far too much complexity in supporting the feature.

n321 edited

#93

@Toadofsky #91

That would not require any extra work from you, just a small tweak to the front-end app.

If SF's score drops by at least 2 or 3, that is certainly a minor-piece blunder. I believe the front-end app (Lichess) could detect this by monitoring the previously calculated lines. After that, it would only need to call SF again to calculate a new move, and that should be all. The new move almost certainly will not be a blunder.

I'm not aware of any parameter that can disable this random-blunder "feature" inside SF.
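
For readers following the proposal in #93, here is a minimal sketch of the "re-ask on a big score drop" idea. Everything in it is an assumption made for illustration: the 2-pawn threshold comes from the "2 or 3" mentioned above, `request_move` is a hypothetical callable standing in for whatever asks the engine for a move and its evaluation, and the retry limit is invented. It is not how Lichess talks to Stockfish.

```python
from typing import Callable, Tuple

# Assumed threshold in pawns, taken from the "2 or 3" suggested in the post.
BLUNDER_DROP_PAWNS = 2.0


def is_blunder(eval_before: float, eval_after: float,
               threshold: float = BLUNDER_DROP_PAWNS) -> bool:
    """Return True if the evaluation (in pawns, from the engine's side) fell by at least `threshold`."""
    return (eval_before - eval_after) >= threshold


def choose_move(request_move: Callable[[], Tuple[str, float]],
                eval_before: float,
                max_retries: int = 3) -> str:
    """Ask for a move and re-ask while the reply looks like a blunder.

    `request_move` is a hypothetical stand-in that returns (move, eval_after).
    After `max_retries` extra attempts the last reply is returned regardless.
    """
    move, eval_after = request_move()
    for _ in range(max_retries):
        if not is_blunder(eval_before, eval_after):
            break
        move, eval_after = request_move()
    return move
```

Even in this toy form the wrapper needs a threshold, a retry limit, and a decision about what to do when every retry also looks bad, which gives a sense of the testing effort discussed in the replies that follow.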

Toadofsky

#94

#93 Of course there is a parameter to disable blunders... it is called "Skill Level" (it's what separates AI levels 1-7 from AI level 8):
https://github.com/official-stockfish/Stockfish#skill-level
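
As a small illustration of the parameter mentioned above: Stockfish exposes "Skill Level" as a UCI option with values from 0 to 20, where 20 is full strength, and it is set with a `setoption` command. The helper name below is made up for this sketch, and how Lichess maps its AI levels onto Skill Level values is not shown here.

```python
def skill_level_command(level: int) -> str:
    """Build the UCI command that sets Stockfish's "Skill Level" option.

    Stockfish accepts values 0..20; 20 means no deliberate weakening.
    The function name is hypothetical and exists only for this sketch.
    """
    if not 0 <= level <= 20:
        raise ValueError("Skill Level must be between 0 and 20")
    return f"setoption name Skill Level value {level}"
```

The resulting string would be written to the engine's standard input before sending `position` and `go`.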

"I believe that is possible by front end app (Lichess) to monitor from previous calculated lines. After that is necessary just to repeat call to SF to calculate new move and that should be all. That new move almost certainly will not be a blunder."

Like I said, extra work to develop and test based upon your assumptions about how Stockfish works, and the slightest coding mistake could break everything. My bot @GodelEscherBot is broken at the moment due to some communication error which I have been troubleshooting for the past month, while at the same time my other automated tests are broken and bugs are being filed against my project:
https://travis-ci.org/github/ddugovic/Stockfish/builds
https://github.com/ddugovic/Stockfish/issues/574

All of my tests are simultaneously failing and I am demonstrating that Stockfish is too complicated for me to understand, and now you want me, against the wishes of other Lichess staff, to make the Lichess code even more complicated, in an effort to solve a problem which even ChessBase's Friedel (on the Perpetual Chess Podcast) acknowledges could take decades of research effort (for an AI to quickly play human-like moves and differentiate between obvious blunders and non-obvious mistakes)?

n321 edited

#95

@Toadofsky

I will definitely look closely at SF's parameter documentation. So far I'm aware that some parameters override the settings of others, so all Stockfish parameters need to be kept in sync. What I know for certain is that when I force Lichess to take back a poor SF move and then play the same move again, its reply is no longer a blunder but a much more expected move.

This interesting subject is far from this thread's topic, so I will stop commenting on this "random blunder feature" here. Perhaps I will open another thread after running more detailed tests.

Toadofsky

#96

Thanks... yeah, definitely more testing and research is required.

https://lichess.org/forum/lichess-feedback/proposal-for-lichess-to-develop-a-complexity-metric-feature is still in my backlog, which seems more clearly defined (but also a hard problem). In complex positions a human is more prone to blunder... if there were some oracle who could tell Stockfish "this is a complex position" or "this is a simple position" then Stockfish's skill level could be adjusted just before requesting a move, although there's a real chance even that wouldn't solve the "obvious blunder" problem.
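
To make the "adjust the skill level just before requesting a move" idea concrete, here is a purely hypothetical sketch. It assumes an oracle that returns a complexity score between 0.0 (simple) and 1.0 (complex), which is exactly the hard part that does not exist yet. The function name and the `max_drop` parameter are invented for illustration.

```python
def adjusted_skill(base_skill: int, complexity: float, max_drop: int = 5) -> int:
    """Lower the engine's Skill Level in complex positions.

    `complexity` is assumed to come from a hypothetical oracle and to lie
    in [0.0, 1.0]. `max_drop` is an invented tuning knob. The result is
    clamped to Stockfish's 0..20 Skill Level range.
    """
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity must be between 0.0 and 1.0")
    drop = round(complexity * max_drop)
    return max(0, min(20, base_skill - drop))
```

As noted above, even a good complexity score only makes mistakes more likely where humans tend to make them. It does nothing to prevent an "obvious blunder" in a simple position.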

n321 edited

#97

Back on topic.

I'm still playing against the opponents the computer gives me, whose ratings are similar to my current one, but only a few games a day. So far I have a winning streak of 11 games, still gaining a meager 4-6 points per game. Rating deviation is still about 45%.

So, how many wins are necessary for the rating deviation to change significantly after such a winning streak, so that the current rating becomes provisional again (if that is allowed at all) and each win gives many more points? If 10 games at the beginning are enough to drop a rating by 700 points, it is logical that 10 wins in a row should take it back.

In other words, over how many recent games is the rating deviation calculated, and when does the current rating become provisional again?

Anyway, another inconsistency in the rating logic proves everything previously mentioned about this rating system's implementation on Lichess.

sheckley666

#98

#97 "If 10 games at beginning is enough to drop rating for 700 points, it is logical that 10 wins in a row take it back."

No. Why?

Btw, this is also not true for FIDE Elo. If a player wins 10 games in a row, and manages with his last win to reach 2400, then his k will be reduced - and it will stay reduced even if he falls back below 2400. If he now loses 10 games in a row, then he will not be back to the value where he started his winning streak.
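
The asymmetry described above can be checked with a rough simulation. This is a simplified sketch, not FIDE's actual procedure: it uses the standard logistic expected-score formula, assumes K=20 below 2400 and K=10 permanently once 2400 has been reached, and applies the K change immediately after each game (real ratings are updated per rating period). The fixed 2400-rated opponent and the 2330 starting rating in the usage note are made-up inputs.

```python
from dataclasses import dataclass
from typing import Tuple


def expected_score(rating: float, opponent: float) -> float:
    """Logistic Elo expectation for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


@dataclass
class Player:
    rating: float
    reached_2400: bool = False  # once True, K stays reduced for good

    @property
    def k(self) -> int:
        return 10 if self.reached_2400 else 20

    def play(self, opponent: float, score: float) -> None:
        """Apply one game result (1.0 win, 0.5 draw, 0.0 loss)."""
        self.rating += self.k * (score - expected_score(self.rating, opponent))
        if self.rating >= 2400:
            self.reached_2400 = True


def streak_then_slump(start: float, opponent: float = 2400.0,
                      games: int = 10) -> Tuple[float, float]:
    """Win `games` in a row, then lose `games` in a row; return (peak, final)."""
    player = Player(start)
    for _ in range(games):
        player.play(opponent, 1.0)
    peak = player.rating
    for _ in range(games):
        player.play(opponent, 0.0)
    return peak, player.rating
```

Calling `streak_then_slump(2330.0)` gives a peak above 2400 and a final rating well above 2330, because the losses are applied with the smaller K.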

doom12384

#99

@n321 #97
The reason your rating fluctuates so much at the start is that the rating system is not confident of where you are. As you play more games, the system becomes more confident in its estimation of your rating (and rightfully so!). If you're losing rating points as quickly as you're gaining them, it's because your rating is truly a good reflection of your playing strength. If it weren't, you'd either be gaining more than you're losing (if your rating is low relative to your playing strength) or losing more than you're gaining (if your rating is high relative to your playing strength).
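
The effect of "confidence" on points per game can be sketched with the single-game Glicko-1 update. Note the assumptions: Lichess uses Glicko-2, which adds a volatility term, so this simpler Glicko-1 formula is only an approximation of the behaviour, and the ratings and deviations in the usage note are made-up inputs.

```python
import math
from typing import Tuple

Q = math.log(10) / 400.0


def g(rd: float) -> float:
    """Glicko-1 attenuation factor for an opponent's rating deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q ** 2 * rd ** 2 / math.pi ** 2)


def glicko1_update(r: float, rd: float, opp_r: float, opp_rd: float,
                   score: float) -> Tuple[float, float]:
    """Return (new_rating, new_rd) after one game (score: 1, 0.5 or 0)."""
    g_j = g(opp_rd)
    expected = 1.0 / (1.0 + 10 ** (-g_j * (r - opp_r) / 400.0))
    d_squared = 1.0 / (Q ** 2 * g_j ** 2 * expected * (1.0 - expected))
    precision = 1.0 / rd ** 2 + 1.0 / d_squared
    new_r = r + (Q / precision) * g_j * (score - expected)
    new_rd = math.sqrt(1.0 / precision)
    return new_r, new_rd
```

For example, comparing `glicko1_update(1500, 350, 1500, 50, 1.0)` with `glicko1_update(1500, 45, 1500, 50, 1.0)` shows a far larger gain for the player with the high deviation. A low deviation means single-digit changes per game, and the deviation shrinks a little further with every game played.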

n321 edited

#100

Well, the FIDE rating system is much, much clearer, and there is no reason to be suspicious about what will happen. But here everything is unclear, deep in a "gray area", always with some uncertainty...

Perhaps when I break the threshold of a 50% win/loss ratio, things may start to change a bit...

This topic has been archived and can no longer be replied to.