I would like to propose that lichess develops a measure of chess complexity or sharpness. Such a feature has the potential to revolutionize chess and would be invaluable to any chess website. Some specific applications include generating non-tactical puzzles (imagine tactics trainer for positional chess puzzles), creating chess computers that play with human personalities, and identifying concepts that are key to improvement at any rating level.
Here (https://imgur.com/a/6pYfDcV) you can find 16 positions, selected from 1,000 random positions from the 2019 World Cup and sorted by our algorithm, with the least complex positions in the far-left column and the most complex in the far-right column. While we have achieved good results, our algorithm could improve significantly with more data. Our model was trained on only 25,000 games, but lichess has over a billion games, a significant portion of them already analyzed by users. With the incredible amount of data and technical expertise at lichess, its help would be invaluable in creating these revolutionary features.
Please find more details about the algorithm in my full proposal here: (https://github.com/Amethyst-Cat/ChessComplexity/blob/master/A%20Metric%20of%20Chess%20Complexity.pdf).
@Amethyst-Cat lichess games are available as public data; you can find them here: https://database.lichess.org/
@pendru Thanks for the response! I have seen the database, but the difficulty with the raw database is processing the games. You would first have to store about 286 GB of chess games, and then find the analyzed games among all played games (you cannot train a model on games that aren't analyzed). I think the staff at lichess may have better tools for searching through the database.
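For illustration, here is a rough sketch of how one might filter the dump with python-chess (the file name below is a placeholder):

```python
import chess.pgn

# Rough sketch: stream games from a (decompressed) Lichess database dump
# and keep only those carrying server analysis, i.e. [%eval ...] comments.
def analyzed_games(pgn_path):
    with open(pgn_path) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:
                break
            # A node with an engine evaluation marks the game as analyzed.
            if any(node.eval() is not None for node in game.mainline()):
                yield game

for game in analyzed_games("lichess_db_standard_rated_2019-09.pgn"):
    print(game.headers.get("Site"))
```

Even so, fully parsing 286 GB of PGN in Python is slow; scanning each game's raw text for the "[%eval" substring before parsing would be much faster, which is why better tooling on lichess' side would help.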
#1 @Amethyst-Cat I see a great deal of effort went into producing this PDF (although it confused me because it is not structured or peer reviewed). The key concept of using a classifier (ML-based or otherwise) does have my curiosity since prior research in this area (Lucas Chess) lacks generality:
http://lucaschess.blogspot.com/2013/04/version-8-step-2-degree-of-complexity.html
Section 2.7 is inaccurate (BKT is based upon time, not depth); a reference implementation to which I contributed (and added support for Lichess variants) can be found at
https://github.com/niklasf/python-chess/blob/master/examples/bratko_kopec/bratko_kopec.py
... and it is referenced here: https://www.chessprogramming.org/Bratko-Kopec_Test
and further explained here: http://www.sci.brooklyn.cuny.edu/~kopec/Publications/Publications/O_11_C.pdf
I am curious how ML-based engines such as Leela and Phoenix could validate your ML-based classifier, although validation sounds expensive so perhaps not worth pursuing at this time:
Leela: https://lczero.org
Phoenix (I couldn't find the source code): https://arxiv.org/pdf/1603.09051.pdf
I meant to say "structured like an academic paper" but hadn't had my morning tea yet.
#5 @Toadofsky Thank you so much for the detailed response! I apologize for the fact that my paper is not properly structured. I have not had much experience writing academic papers but I will look into fixing it (as well as section 2.7).
While I agree that there should be more validation of this work, I will mention that Goldammer (who has much more academic credibility than I do) has implemented a very similar metric (https://github.com/cgoldammer/chess-analysis/blob/master/position_sharpness.ipynb) that you can test out online here (https://chessinsights.org/analysis/).
As of now, I'm not sure how ML-based engines can validate this work, as their target is the objective evaluation of the position, not the likelihood that a human player will miss the best move. Please let me know if you have any further questions. Thanks again!
#7 @Amethyst-Cat And thanks for the submission! I want to be supportive because I do enjoy science, but I'm also trying to weigh the practical costs of the proposal against whatever benefits are gained from it...
With that said, I find the concept fascinating and will research and try to validate it, but given what we know so far, this proposal falls short. Honestly, I'm not trying to publicly embarrass or discourage anyone; it's just that when I imagine what this proposal might look like as a research paper, questions about the research itself (its validity and general applicability) surface. I apologize for raining on your parade; this is an idea I myself had but didn't know how to implement, and the paper gives me more ideas.
As for ML-based engines... supposing that there's some function f(position, rating) which produces the odds of a blunder:
https://chessinsights.org/analysis/ -- see the "Elo" slider
... then it should be possible to define a reward function based upon accurately simulating function f:
https://en.wikipedia.org/wiki/Adversarial_machine_learning
I can't say I've done that before, but it sounds interesting at any rate.
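To make that concrete, a toy sketch (every name here is hypothetical; f is the blunder-odds function above, and `model` is whatever we train to simulate it):

```python
# Toy sketch only -- all names are hypothetical stand-ins, not real APIs.
# f(position, rating) is the blunder-odds function; `model` is the learner
# being trained to simulate it. Reward peaks when the simulation matches f.
def reward(model, f, position_fen: str, rating: int) -> float:
    predicted = model(position_fen, rating)  # model's estimate of blunder odds
    target = f(position_fen, rating)         # "ground truth" blunder odds
    return -(predicted - target) ** 2        # negative squared error
```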
#8 @Toadofsky No worries! I really appreciate the feedback. I wanted to put the idea out there in case anyone with more machine learning expertise wanted to take it up. I thought you might also be interested in this peer reviewed paper (https://www.rug.nl/research/portal/files/65618477/ICPRAM_CHESS_DNN_2018.pdf) where researchers showed that neural networks can accurately approximate Stockfish evaluations in positions. I think that using neural networks to approximate errors from positions is a very similar task, replacing one number (Stockfish evaluation of position) with another (error made in position). While I do think my proposal alone may fall short for such a computationally expensive undertaking, there does seem to be evidence suggesting that it is possible to approximate the difficulty of positions, and I hope that I've shown that an accurate measure of complexity can have significant applications in chess.
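To illustrate what I mean, here is a minimal sketch in PyTorch; the feature encoding and training data are placeholders, and only the regression target changes:

```python
import torch
import torch.nn as nn

# Toy illustration of swapping the regression target: instead of predicting
# Stockfish's evaluation, predict the error (centipawn loss) made in the
# position. Feature encoding and training data are placeholder assumptions.
N_FEATURES = 768  # e.g. 12 piece types x 64 squares, one-hot encoded

model = nn.Sequential(
    nn.Linear(N_FEATURES, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 1),  # predicted error for this position
)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(features: torch.Tensor, observed_error: torch.Tensor) -> float:
    """One gradient step toward predicting the error a human made."""
    optimizer.zero_grad()
    loss = loss_fn(model(features).squeeze(-1), observed_error)
    loss.backward()
    optimizer.step()
    return loss.item()
```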
#9 Whoa... I've seen hobbyists do hack-a-thon prototypes, but have never seen that concept carried to fruition. I'll pass along that information to other developers (while I still research your concept).
Indeed, difficulty of positions (puzzles, even) is measurable (using a variety of engines and settings):
https://github.com/niklasf/python-chess/blob/master/examples/bratko_kopec/bratko_kopec.py
... I'm still curious and will consider things your way: despite the large up-front CPU cost of proving the concept, once it is proven (or sufficiently demonstrated that Lichess is interested; I don't know how better to describe this, and it can be challenging), it could prove quite useful.
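For instance, a rough sketch of measuring difficulty that way with python-chess (the time budgets and engine path are arbitrary placeholders; as noted above, BKT proper scores by time):

```python
import chess
import chess.engine

# Rough sketch: probe a position at increasing time budgets and record when
# the engine first settles on the known best move. Budgets and engine path
# are placeholder assumptions, not a calibrated protocol.
def difficulty(fen: str, best_move_uci: str, engine_path: str = "stockfish"):
    board = chess.Board(fen)
    best = chess.Move.from_uci(best_move_uci)
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    try:
        for seconds in (0.1, 0.5, 1.0, 2.0, 4.0):
            result = engine.play(board, chess.engine.Limit(time=seconds))
            if result.move == best:
                return seconds   # found quickly -> easier position
        return float("inf")      # never found within budget -> hard
    finally:
        engine.quit()
```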