Thank you very much. I enjoy mathematical content on games like this a lot, and you are onto something here. Maybe you could find a way to also factor in the time each player has left; that might give a much more precise evaluation of the odds within a game. You could feed game data with timestamps into the neural net and let it learn how time affects the outcome, because we often see situations where one player is clearly winning but ends up losing because they are low on time.
This would be especially great for watching live games.
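Something like this might be a starting point for pulling the clock data out of the games before training (a rough sketch, assuming python-chess and a Lichess PGN export that includes [%clk ...] annotations; the filename is a placeholder):

```python
import chess.pgn

# Rough sketch: extract remaining-clock annotations from a Lichess PGN export
# so they can be used as features alongside the position.
# "games.pgn" is just a placeholder filename.
with open("games.pgn") as pgn:
    while (game := chess.pgn.read_game(pgn)) is not None:
        node = game
        while node.variations:
            node = node.variation(0)
            clock = node.clock()  # seconds left for the player who just moved, or None
            print(node.ply(), node.move.uci(), clock)
```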
Hi - thanks for the great post.
Here's an idea that might be some work to implement but is perhaps worth trying:
1. From the Lichess database, take Stockfish-analyzed rapid games between players in the 2100-2400 rating range.
2. For every position before a move marked as a blunder or mistake, see if your ease rating marks it as a difficult position. (Perhaps leave out blunders and mistakes made in time trouble by looking at how much time the player had before making the move.)
3. Fiddle with the parameters so that the ease rating better predicts blunders or mistakes.
4. Cross-validate and/or test on previously unseen positions.
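A rough sketch of what step 2 could look like, assuming python-chess and a Lichess export with [%eval ...] annotations (the 100-centipawn threshold is an arbitrary stand-in for however the annotator actually flags mistakes):

```python
import chess.pgn

MISTAKE_CP = 100  # arbitrary threshold; Lichess uses its own annotator rules

def positions_before_errors(pgn_path):
    """Yield (fen, cp_lost) for positions where the move played lost at least
    MISTAKE_CP according to the stored [%eval] annotations."""
    with open(pgn_path) as pgn:
        while (game := chess.pgn.read_game(pgn)) is not None:
            node = game
            while node.variations:
                child = node.variation(0)
                before, after = node.eval(), child.eval()
                if before is not None and after is not None:
                    mover = node.board().turn
                    drop = (before.pov(mover).score(mate_score=10000)
                            - after.pov(mover).score(mate_score=10000))
                    if drop >= MISTAKE_CP:
                        yield node.board().fen(), drop  # feed this FEN to the ease rating
                node = child
```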
@AnlamK said in #12:
> For every position before a move marked as a blunder or mistake, see if your ease rating marks it as a difficult position. (Perhaps leave out blunders and mistakes made in time trouble by looking at how much time the player had before making the move.)
I think the more interesting study is: how well do tournament players deal with complexity, how do they manage their time, and how does that correlate with player strength (in a game, in a tournament, and over a lifetime)? I suspect GM Komarov is right: if you make consecutive threats, your opponent will blunder (either in a complex position or after they relax).
Hey @matstc, this is great. Have you tried looking into Maia's policy, and how would Maia + the ease metric work compared to Leela? I would love to integrate your formula into my project, and I think basing it on Maia would make it more human-like, because it was trained on human games at specific Elo levels. Also, would you be open to me using your calculations? My work is FOSS anyway!
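Roughly what I have in mind for the backend swap (just a sketch, untested: the lc0 binary path, the maia-1500.pb.gz weights file, and the nodes=1 limit that the Maia authors recommend so the raw policy picks the move are all assumptions to adapt):

```python
import chess
import chess.engine

# Sketch: drive lc0 loaded with Maia weights through python-chess instead of a
# standard Leela net. Paths are placeholders.
maia = chess.engine.SimpleEngine.popen_uci(["lc0", "--weights=maia-1500.pb.gz"])

board = chess.Board()
result = maia.play(board, chess.engine.Limit(nodes=1))
print("Maia's most human-like move here:", result.move)
maia.quit()
```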
@Toadofsky said in #9:
> > I would love an evaluation that will actually reflect how easy or hard to play a position is, for human beings.
>
> It's refreshing to see critical thinking in action! I hope that players start realizing that while "AI" often hallucinates, tools designed by humans provide useful information.
Yep! Only when AI relies on well-made human tools will it know what is right or wrong.
@Noobmasterplayer123 said in #14:
> Hey @matstc, this is great. Have you tried looking into Maia's policy, and how would Maia + the ease metric work compared to Leela? I would love to integrate your formula into my project, and I think basing it on Maia would make it more human-like, because it was trained on human games at specific Elo levels. Also, would you be open to me using your calculations? My work is FOSS anyway!
I have not tried with Maia but that would be a really helpful follow-up.
Of course, feel free to use anything from the notebook in a FOSS project.
Please share whatever results you get!
@AnlamK said in #12:
> Hi - thanks for the great post.
>
> Here's an idea that might be some work to implement but is perhaps worth trying:
>
> 1. From the Lichess database, take Stockfish-analyzed rapid games between players in the 2100-2400 rating range.
> 2. For every position before a move marked as a blunder or mistake, see if your ease rating marks it as a difficult position. (Perhaps leave out blunders and mistakes made in time trouble by looking at how much time the player had before making the move.)
> 3. Fiddle with the parameters so that the ease rating better predicts blunders or mistakes.
> 4. Cross-validate and/or test on previously unseen positions.
That would be really interesting! Thinking more about this... I would love to:
- Look at not only mistakes and blunders but also good moves that keep the objective evaluation flat.
- Look at the objective evaluation drop or gain in arbitrarily many positions.
- Look at the corresponding ease for all of those positions.
And then look at the correlation between the ease and the evaluation change. That would be a fantastic way to calibrate the metric.
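The correlation check itself is simple; a tiny sketch, where ease_scores and eval_changes are placeholders for whatever the pipeline above actually produces:

```python
# Tiny sketch of the calibration step. The two lists stand in for the ease
# metric and the centipawn change of the move actually played.
from scipy.stats import spearmanr

ease_scores = [0.8, 0.3, 0.9, 0.2, 0.5]
eval_changes = [-10, -180, 5, -250, -60]

rho, p_value = spearmanr(ease_scores, eval_changes)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```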
woa!!!!!!!!!!!!
This is a great idea! I'd love to see this integrated into Lichess.
Another way I assess how easy a position is for humans is to check how many moves continue to maintain the draw according to Stockfish. I always analyze my games with at least 5 candidate moves from Stockfish. If there is ONLY ONE MOVE that maintains the draw AND that move isn't a check, a capture, or a threat, I consider the position difficult for humans.
If, on the contrary, all 5 candidate moves maintain the draw, then I consider it easy.
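In engine terms, that rule of thumb could be sketched like this (assuming python-chess and a local Stockfish binary; the 50-centipawn "draw band", depth 18, and multipv=5 are arbitrary choices, and threat detection is left out):

```python
import chess
import chess.engine

DRAW_BAND = 50  # anything within +/-50 cp still counts as "holding the draw"

def holding_moves(board, engine, depth=18):
    """Top-5 candidate moves whose evaluation stays inside the draw band."""
    infos = engine.analyse(board, chess.engine.Limit(depth=depth), multipv=5)
    moves = []
    for info in infos:
        cp = info["score"].relative.score(mate_score=10000)
        if abs(cp) <= DRAW_BAND:
            moves.append(info["pv"][0])
    return moves

def looks_hard(board, engine):
    """'Difficult': exactly one holding move, and it is neither a check nor a
    capture. (Deciding whether it is a threat would need an extra engine probe.)"""
    holding = holding_moves(board, engine)
    if len(holding) != 1:
        return False
    move = holding[0]
    return not (board.gives_check(move) or board.is_capture(move))

with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
    board = chess.Board()  # substitute the position to test
    print(looks_hard(board, engine))
```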