
Positive Reinforcement During Computer Analysis

I really appreciate the discussion!

Sybotes, Toadofsky, if you have links to previous posts here or elsewhere regarding this I'd love to take a look.

jperkins, that article is definitely closer to what I'm thinking about. I started looking into scientific supporting arguments, but I'm not a cognitive psychologist or learning expert. This Wiki article was fun to start reading: https://en.wikipedia.org/wiki/Operant_conditioning
My brief understanding of that and a few published article summaries seems to suggest that a mix of both positive and negative reinforcement is ideal for building behavior/learning.

I'm not sure that it would be more resource-intensive than what is already being computed. A "best (least bad) move" is already computed during analysis, so I think just displaying something like "(#total_moves - (#inaccuracies + #mistakes + #blunders))/(#total_moves)" could be simple positive feedback.
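To make the formula above concrete, here is a minimal sketch in Python. The class and field names (`AnalysisSummary`, `total_moves`, and so on) are hypothetical and only for illustration; they are not taken from the lichess codebase, and the real analysis data may be shaped differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisSummary:
    """Per-player counts from one analysed game (hypothetical field names)."""
    total_moves: int
    inaccuracies: int
    mistakes: int
    blunders: int


def clean_move_ratio(summary: AnalysisSummary) -> float:
    """Return the fraction of moves that were not flagged by the engine.

    Implements (total - (inaccuracies + mistakes + blunders)) / total.
    """
    if summary.total_moves <= 0:
        # Nothing was played, so there is nothing to praise or criticise.
        return 0.0
    flagged = summary.inaccuracies + summary.mistakes + summary.blunders
    # Guard against inconsistent input where flagged exceeds total.
    clean = max(summary.total_moves - flagged, 0)
    return clean / summary.total_moves


summary = AnalysisSummary(total_moves=40, inaccuracies=3, mistakes=2, blunders=1)
print(f"{clean_move_ratio(summary):.0%} of your moves were not flagged")
```

Since it only reuses counts that the analysis already produces, a metric like this should need no extra engine time, at least under the assumption that those counts are available wherever the summary is rendered.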

There's definitely an interesting discussion about where the line is between "what lichess should do" and "what you should get elsewhere". I would argue that if mistakes/centipawn loss/etc. are provided, then any positive metrics that can be computed in a similar manner fall into the same category.

I'll attempt to spend some time looking at the front-end code and see if there's anything simple I can come up with. Again the discussion is much appreciated!

What about some kind of "criticism sandwich" algorithm? Wedge the criticism between two loaves of "compliment" bread.

[compliment] [criticism] [compliment2]

I'm not a programmer, but more of an amateur software architect, and that was me diagramming my idea.

Perhaps a random selection of a compliment from a compliment database? Then just insert the normal Lichess Stockfish commentary, followed by another compliment, or maybe even a remark from a separate database of encouraging remarks?
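As a rough sketch of that diagram, assuming the "database" is just an in-memory list: the phrases and the names `COMPLIMENTS`, `ENCOURAGEMENTS`, and `feedback_sandwich` below are made up for illustration, and the criticism string stands in for whatever commentary the analysis already shows.

```python
import random
from typing import List, Optional

# Hypothetical stand-ins for the compliment and encouragement "databases".
COMPLIMENTS = [
    "Nice development in the opening.",
    "You kept your king safe.",
    "Good piece activity.",
]
ENCOURAGEMENTS = [
    "You are getting closer to the engine's choice.",
    "Keep going, the rest of the game was solid.",
]


def feedback_sandwich(criticism: str, rng: Optional[random.Random] = None) -> List[str]:
    """Return [compliment, criticism, compliment2], picking the bread at random."""
    rng = rng or random.Random()
    return [rng.choice(COMPLIMENTS), criticism, rng.choice(ENCOURAGEMENTS)]


for line in feedback_sandwich("Inaccuracy. Nf3 was best."):
    print(line)
```

Picking phrases at random is exactly where the false-positive worry comes from: nothing here checks that the compliment is actually true of the game.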

But in all seriousness, I really appreciate the positive thinking in your post. I understand that it's a bit of a computer science/cognitive science challenge to code this without sending out too many false positives (no pun intended). But it's great ideas like this that bring change. Thank you.

Jeff

#12 "But it's great ideas like this that bring change."

Unfortunately the world is not quite that simple... sorry. Great ideas which can't be executed can't be used; and in software details matter quite a bit.
http://xyproblem.info/

Do you really need Stockfish to stoke your ego? When I analyse with a computer what I want to know first and foremost is where I can improve.

№ 13,

Well, the entire Modern era is an example of nice-sounding ideas that have already been proven not to work at all, under any conditions whatsoever, still being tested ad nauseam anyway by those in power. 😑 (100 million deaths … 200 million deaths … can we get 300 million?? … SOLD to the highest voter!)

I know it’s a pessimistic thing to say, but … just saying.

PS: I wish those in power in “the real world” were all programmers; or at least intelligent, honest and reasonable like you. 🧡

I regret not being more direct about my implementation ideas in the first post, regarding simply showing inverse statistics, etc. In hindsight I understand it comes off a bit like $OBLIVIOUS_USER (#13).

I do not think that comments like #14 (and the jokes in #12) are very productive. Is your argument that "people shouldn't be babies"? If this is the majority opinion I'd be disappointed; I very much love lichess and its vision, but I would not be nearly as excited to engage if suggestions for beginner accessibility and improvement were dismissed with "don't be a snowflake" at every turn. The engine is already telling you where you went wrong; I see no logical reason why telling you where you went right is any different.

From my brief research, there seems to be a decent amount of scientific evidence that positive reinforcement is valuable for learning. Skimming these articles suggests a mix may be best for maximizing reward (in this case learning).
https://pubs.aeaweb.org/doi/pdf/10.1257/000282803322157142 (direct PDF)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006973 (journal article landing page)

If someone has evidence to the contrary I would love to see it, and am open to my opinion on this being swayed. I'm less likely to be convinced by coding difficulty arguments; but if that's the case I'll work on the dev onboarding and see what I can come up with.

This topic has been archived and can no longer be replied to.