> I believe everyone can agree that for chess rules and principles to be useful in decision-making, they must influence us to play moves we wouldn't have chosen without consulting them.
More random chunk readings. I am actually looking for where my understanding of the learning question might depart from yours. Our respective unspoken premises remain, even after the blog 1 to blog 3 precautions, which I have followed, and the set of axioms, which still leaves room for my own interpretation. It might go back to axiom 1 being unspoken or underdefined (my current working hypothesis). I am also distracted by your YouTube video about the Woodpecker method; jomega has had a look at it, not me. It seems that might be where you are headed, or have already been through, in the blog 4 parts I have not yet read, or in previous post-blog discussions.
About the quote above first: it seems you may have restricted the purpose of chess theory to the in-game decision process of an already learned, or expert enough, player. That might be a hint about the scope of axiom 1, which is not spelled out. For me, chess theory is about communication, learning, and teaching first, not about performance at a high level. I think we all agree that once internalized, chess concepts are better left, in execution mode, to the fast-acting subconscious brain (which is, by the way, slow at learning).
I have a wider interpretation of the axiom 1 problem: for once, not conflating the expert decision model with the learning of it, starting from any stage of learnedness, in its more general form. So, can you specify axiom 1 a bit more with respect to the external science phenomenology about which we might not share the same common sense?
Now, going back to my thinking in relation to the parts of your proposal I have been able to read so far.
A few words in your replies to me hinted that we are departing on axiom 1, on the very question of the specifics of improving or learning. Since, as worded, there is no restriction I can point to, I might have to make hypotheses about your meaning.
It appears that all you are saying about the relation between the training set and the performance set is that the latter is bigger.
In the limit, the performance set could be the training set (the previously "digested" experience set) plus one new position never visited.
Tangent: might I add that this is itself a dynamical thing: how does new experience become part of the new training set, and does a single exposure to a seed position and one continuation from it make that position as much a part of the training set as everything in the previous training set?
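To make that tangent concrete, here is a toy sketch (my own, nothing from your blog or from A0/LC0) where exposure counts keep "seen once" and "digested" apart; the threshold is an arbitrary placeholder:

```python
# Toy model only: tracking exposure counts per position, so a single exposure
# is not forced to count the same as repeated, digested experience.
from collections import Counter

class ExperienceSet:
    def __init__(self):
        self.exposures = Counter()  # position key (e.g. FEN) -> number of exposures

    def expose(self, fen: str) -> None:
        """Record one more exposure to a position (one seed + continuation seen)."""
        self.exposures[fen] += 1

    def digested(self, threshold: int = 3) -> set:
        """Positions seen at least `threshold` times; the threshold is arbitrary here."""
        return {fen for fen, n in self.exposures.items() if n >= threshold}
```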
From my currently preferred existing machine model of learning with representation, namely the full deep NN latent space of A0 and LC0, taken as at least one model worth considering, I would say that the question is lacking definition, or specifics, that might be helpful.
The question of position similarity, contrast, and difference, or more generally the consistent and potentially complete question of a distance between positions, has been thoroughly avoided in chess culture. This, in spite of the sprawling growth of "opening theory" ever since it started being tallied for everyone to mine for secret fast-execution winning recipes, so that they would not need to think until out of book (the book growing being precisely the point).
Sometimes we don't ask because it is not helpful. Sometimes we don't ask, or stop asking, because we do not yet have the tools that would sustain the well-formed question.
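For illustration only, here is one crude but fully explicit notion of distance; I am not claiming it is the right metric, only that the word can be made concrete (this uses the python-chess package):

```python
# A deliberately crude, board-level distance between two positions:
# count the squares whose occupancy differs.
import chess

def piece_placement_distance(fen_a: str, fen_b: str) -> int:
    """Number of squares on which the two positions disagree (0..64)."""
    map_a = chess.Board(fen_a).piece_map()
    map_b = chess.Board(fen_b).piece_map()
    return sum(1 for sq in chess.SQUARES if map_a.get(sq) != map_b.get(sq))

# Example: the start position and the position after 1.e4 differ on two squares.
after_e4 = chess.Board()
after_e4.push_san("e4")
print(piece_placement_distance(chess.STARTING_FEN, after_e4.fen()))  # -> 2
```

A latent-space distance (say, between A0/LC0 embeddings of the two positions) would be the more interesting version; the point here is only that the question can be posed precisely.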
As you said, "volume" is possibly a misused word, perhaps borrowed from database or economic language; to me it still means count, that is, the cardinality of a finite countable set (ah, words, a pain).
That is, the size of a set of positions would only be about how many distinct FEN strings are in it. So that, given the counting bounds of the past, going specific in your axiom 1 is not within our grasp: the number-of-stars-in-the-universe thing.
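Just to make that notion of size literal on a small, concrete corpus (the file name is made up):

```python
# Size as cardinality: count the distinct positions reached in a PGN file,
# normalizing each position to its EPD (FEN without the move counters).
# "my_games.pgn" is a made-up file name, purely for illustration.
import chess.pgn

def distinct_positions(pgn_path: str) -> int:
    seen = set()
    with open(pgn_path) as handle:
        while (game := chess.pgn.read_game(handle)) is not None:
            board = game.board()
            seen.add(board.epd())
            for move in game.mainline_moves():
                board.push(move)
                seen.add(board.epd())
    return len(seen)

print(distinct_positions("my_games.pgn"))
```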
I should say that the chess-culture avoidance seems so strong that even LC0 and A0 were oblivious to it for some time. That may have changed recently: DeepMind produced a paper in which they realized that the A0 champion had made compromises in its exploration of chess positions in order to become stronger. (LC0, as the open-source version or off-shoot, might even be better than A0, provided it is not taking winning shortcuts by prioritizing optimization against ill-defined Elo pools of engines to beat; that is always a danger, at the core of the RL dilemma of exploration versus exploitation, which is the dynamic version of the generalization problem in supervised learning.) I would love a discussion of that paper with anyone willing to join a forum thread with some reading plan; I need the social motivation for the parts I did not read or may have skimmed too fast (the methodology of combining the repertoire-biased engines, or as they call them, the specialists, to beat the single zero-prior RL-trajectory champion).
But the mathematical model being implemented in those machines allows tools that do not require that compromise. I will leave it at that. In short: there are other ways to talk about size. You were saying that surely size is not all that matters, and of course it isn't; the whereabouts of the positions with respect to all positions in some space matter too, and that space is already formalized in the mathematical model.
And it is not blasphemy to imagine it with our internal visual support, even if that means projecting the 32 input planes into some 3D cloud of positions as points. I wonder why people trust one number from the SF magic without blinking an eye, yet have trouble with a stable, external, non-competitive, evaluation-agnostic metric system for their positions. Is it that the tools are not part of the extended chess math culture, apart from A0 and LC0, and now some of the SF activities? Note: they can use deep NNs as black boxes forever and would still have great Elo to show for it. I suggest that generalist-versus-specialist DeepMind paper; sorry, I lost the exact links and titles, thanks to my usual long-term memory compaction, which only retains associations with some logical-probability aspect.
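And a sketch of that 3D cloud, with `embed` as a hypothetical stand-in for whatever would produce the latent vector (here it is random, only so the snippet runs on its own):

```python
# Project position "embeddings" down to 3 components with PCA via SVD,
# giving one 3D point per position. The embed() function below is a
# placeholder, NOT a real A0/LC0 latent representation.
import numpy as np

def embed(fen: str, dim: int = 128) -> np.ndarray:
    """Placeholder embedding: a pseudo-random vector standing in for a real latent."""
    rng = np.random.default_rng(abs(hash(fen)) % (2**32))
    return rng.standard_normal(dim)

def project_to_3d(fens: list[str]) -> np.ndarray:
    """Return an (n, 3) array: one 3D coordinate per position."""
    X = np.stack([embed(f) for f in fens])
    X = X - X.mean(axis=0)                      # center the cloud
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:3].T                         # top-3 principal directions

fens = [
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",        # start
    "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq - 0 1",      # 1.e4
    "rnbqkbnr/ppp1pppp/8/3p4/4P3/8/PPPP1PPP/RNBQKBNR w KQkq d6 0 2",   # 1...d5
    "rnbqkbnr/ppp1pppp/8/3P4/8/8/PPPP1PPP/RNBQKBNR b KQkq - 0 2",      # 2.exd5
]
print(project_to_3d(fens).shape)  # (4, 3)
```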