@ambrona said in #25:
> Alpha 0 and all chess engines look for *approximate* solutions to the problem of "what is the best move?". In contrast, to reason about vampires, we would need mathematical proofs, not approximate solutions.
>
> That said, your idea can actually be used to prove that a position is NOT a vampire (e.g. by finding a game leading to its mirror image).
What do you think "approximation" means in the NN universal approximation theorems?
This is not like a truncated partial-tree approximation, with putative leaf-evaluation forecasters standing in for the full exhaustive tree. That kind of approximation is not even shown to converge to the full tree: the leaf evaluation can carry biases, and the early/late move reductions that use its shallow values to fast-forward search deeper rather than wider cannot be proven to converge to the complete tree.
On the other hand, NN-based approaches (not NNUE, which only tries to reproduce classical leaf evaluations and is not based on the logical chess outcomes of full games) use lossless position information, including the practical game-depth constraint rules; my personal impression of how RL works is that it would manage even without the 50-move rule, if it were allowed more sampling in endgames.
So, by the universal approximation theorems, there is in theory a big enough NN, with some parameter setting, able to approximate any complex function of the position, such as the probability of winning given perfect play.
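For reference, the statement I am leaning on is (informally) the Cybenko/Hornik one: for any continuous target on a compact set, a single hidden layer with enough units gets uniformly close.

```latex
% Universal approximation (Cybenko 1989, informal): for any continuous f on a
% compact set K and any eps > 0, some one-hidden-layer network with a
% sigmoidal activation \sigma is uniformly within eps of f.
\forall f \in C(K),\ \forall \varepsilon > 0,\ \exists N,\ \{a_i, w_i, b_i\} :
\quad \sup_{x \in K} \Bigl|\, f(x) - \sum_{i=1}^{N} a_i \,\sigma(w_i^{\top} x + b_i) \,\Bigr| < \varepsilon
```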
The thing here is that the probability of winning is possibly an infinitesimally fine model of a finite (very big, but finite) world of logical outcomes in the legal space of chess for all possible moves. Chess being finite means there is a smallest increment among all the positions being compared that is different from zero for distinct pairs. We just don't know what it is, though Stockfish seems to have some working notion of it, since it has to save evaluation cost in NNUE's input encoding of the position information. But an NN big enough, with a latent space allowed as much float precision as the mathematical model of the universal approximation theorems needs, would capture that unknown small evaluation increment: embedding the problem in a bigger continuous space contains it.
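To make the "smallest increment" point precise (a sketch; here $v$ stands for the hypothetical exact value function over the finite set $P$ of legal positions):

```latex
% P finite => the set of attained values v(P) is finite, so distinct values
% are separated by a strictly positive minimum gap:
\varepsilon \;=\; \min_{\substack{p, q \in P \\ v(p) \neq v(q)}} \bigl|\, v(p) - v(q) \,\bigr| \;>\; 0
% and any uniform approximation \hat{v} below half that gap already preserves
% the exact ordering (rounding to the nearest attained value recovers v):
\lVert \hat{v} - v \rVert_{\infty} < \varepsilon / 2
\;\Longrightarrow\;
\bigl(\, v(p) < v(q) \Rightarrow \hat{v}(p) < \hat{v}(q) \,\bigr)
```

So "approximate" and "exact" coincide once the approximation error drops below the finite problem's own resolution.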
Sorry, I got a bit carried away. I just think we ought to be careful about the notions of approximation and exactness. Mathematics does allow for the concept of the infinite.
And probability theory does contain point-mass distributions; those are also integrable functions that can be approximated as closely as needed. If the problem is finite enough, like the XOR problem, many NNs will solve it exactly. Just as in chess, one can embed the logical XOR problem in a small but deep enough Euclidean ambient space, and the sampling does not need much to find the exact truth table (see the sketch below). If the approximating function can only output 0 or 1, the finite input will eventually always give the truth table; it just solves a bigger continuous problem to get there, through an approximating sequence. Since the problem is small and finite, it does not take long to get there, exactly. I would need to write a blog post about that; this seems like a cornerstone conceptual misconception in chess. Perhaps it explains why it took so long before NNs were actually considered.
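Here is a minimal numpy sketch of that XOR claim (architecture, seed, and hyperparameters are arbitrary illustrative choices, not a recipe): a tiny continuous network is trained by gradient descent, and rounding its output recovers the exact truth table.

```python
# Minimal sketch: a tiny MLP whose continuous approximation of XOR becomes
# exact after thresholding. Pure numpy; seed and hyperparameters arbitrary.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR truth table

# 2 -> 4 -> 1 network: embed the discrete problem in a continuous ambient space.
W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)

lr = 0.5
for step in range(5000):
    h = np.tanh(X @ W1 + b1)                    # hidden layer
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))    # sigmoid output in (0, 1)
    dz2 = (p - y) / len(X)                      # gradient of BCE loss + sigmoid
    dW2, db2 = h.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (1 - h ** 2)           # backprop through tanh
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# The continuous approximation sequence lands on the *exact* truth table once
# we round: the finite problem leaves a nonzero margin that absorbs the error.
h = np.tanh(X @ W1 + b1)
p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
print((p > 0.5).astype(int).ravel())  # -> [0 1 1 0] with this seed
```

The network never stops being an approximator; exactness appears only because the finite problem's margin lets rounding absorb the residual error, which is exactly the point above.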
Approximation theory is simply not a topic of the discrete mathematics more often associated with chess.
In short, even if approximative, it could learn to propose good candidates for us to polish and build more theory about: finding patterns that we could then prove exactly from this more intuitive machine learning. At some point, inductive reasoning, if the input data comes without measurement error, will match deductive reasoning, having run out of wrong hypotheses to posit.