@ambrona said in #25:
> Alpha 0 and all chess engines look for *approximate* solutions to the problem of "what is the best move?". In contrast, to reason about vampires, we would need mathematical proofs, not approximate solutions.
>
> That said, your idea can actually be used to prove that a position is NOT a vampire (e.g. by finding a game leading to its mirror image).
What do you think "approximation" means in the NN universal approximation theorems?
This is not like a truncated partial-tree approximation, with putative leaf-evaluation forecasters standing in for the full exhaustive tree. That kind of approximation is not even shown to converge to the full tree: the leaf evaluation can carry biases, and the early/late move reductions that use its shallow values to fast-forward search deeper rather than wider cannot be proven to converge to the complete tree.
On the other hand, NN-based approaches (not NNUE, which only tries to reproduce classical leaf evaluations and is not based on the logical chess outcomes of full games) use lossless position information, including the practical game-depth constraint rules; my personal impression of how RL works is that it would manage even without the 50-move rule, if it were allowed more sampling in endgames.
So, by the universal approximation theorems, there is in theory a big enough NN, with some parameter setting, able to approximate any complex function of the position, such as the probability of winning given perfect play.
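For reference, the statement I am leaning on is (informally) the Cybenko/Hornik one: for any continuous target on a compact set, a single hidden layer with enough units gets uniformly close.

```latex
% Universal approximation (Cybenko 1989, informal): for any continuous f on a
% compact set K and any eps > 0, some one-hidden-layer network with a
% sigmoidal activation \sigma is uniformly within eps of f.
\forall f \in C(K),\ \forall \varepsilon > 0,\ \exists N,\ \{a_i, w_i, b_i\} :
\quad \sup_{x \in K} \Bigl|\, f(x) - \sum_{i=1}^{N} a_i \,\sigma(w_i^{\top} x + b_i) \,\Bigr| < \varepsilon
```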
The thing here is that the probability of winning is possibly an infinitesimally fine model of a finite (very big, but finite) world of logical outcomes in the legal space of chess for all possible moves. Chess being finite means there is a smallest increment among all the positions being compared that is different from zero for distinct pairs. We just don't know what it is, though Stockfish seems to have some working notion of it, since it has to save evaluation cost in NNUE's input encoding of the position information. But an NN big enough, with a latent space allowed as much float precision as the mathematical model of the universal approximation theorems needs, would capture that unknown small evaluation increment: embedding the problem in a bigger continuous space contains it.
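To make the "smallest increment" point precise (a sketch; here $v$ stands for the hypothetical exact value function over the finite set $P$ of legal positions):

```latex
% P finite => the set of attained values v(P) is finite, so distinct values
% are separated by a strictly positive minimum gap:
\varepsilon \;=\; \min_{\substack{p, q \in P \\ v(p) \neq v(q)}} \bigl|\, v(p) - v(q) \,\bigr| \;>\; 0
% and any uniform approximation \hat{v} below half that gap already preserves
% the exact ordering (rounding to the nearest attained value recovers v):
\lVert \hat{v} - v \rVert_{\infty} < \varepsilon / 2
\;\Longrightarrow\;
\bigl(\, v(p) < v(q) \Rightarrow \hat{v}(p) < \hat{v}(q) \,\bigr)
```

So "approximate" and "exact" coincide once the approximation error drops below the finite problem's own resolution.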
Sorry, I got a bit carried away. I just think we ought to be careful about the notions of approximation and exactness. Mathematics does allow for the concept of the infinite.
And probability theory does contain point-mass distributions; those are also integrable functions that can be approximated as closely as needed. If the problem is finite enough, like the XOR problem, many NNs will solve it exactly. Just as in chess, one can embed the logical XOR problem in a small but deep enough Euclidean ambient space, and the sampling does not need much to find the exact truth table (see the sketch below). If the approximating function can only output 0 or 1, the finite input will eventually always give the truth table; it just solves a bigger continuous problem to get there, through an approximating sequence. Since the problem is small and finite, it does not take long to get there, exactly. I would need to write a blog post about that; this seems like a cornerstone conceptual misconception in chess. Perhaps it explains why it took so long before NNs were actually considered.
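Here is a minimal numpy sketch of that XOR claim (architecture, seed, and hyperparameters are arbitrary illustrative choices, not a recipe): a tiny continuous network is trained by gradient descent, and rounding its output recovers the exact truth table.

```python
# Minimal sketch: a tiny MLP whose continuous approximation of XOR becomes
# exact after thresholding. Pure numpy; seed and hyperparameters arbitrary.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR truth table

# 2 -> 4 -> 1 network: embed the discrete problem in a continuous ambient space.
W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)

lr = 0.5
for step in range(5000):
    h = np.tanh(X @ W1 + b1)                    # hidden layer
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))    # sigmoid output in (0, 1)
    dz2 = (p - y) / len(X)                      # gradient of BCE loss + sigmoid
    dW2, db2 = h.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (1 - h ** 2)           # backprop through tanh
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# The continuous approximation sequence lands on the *exact* truth table once
# we round: the finite problem leaves a nonzero margin that absorbs the error.
h = np.tanh(X @ W1 + b1)
p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
print((p > 0.5).astype(int).ravel())  # -> [0 1 1 0] with this seed
```

The network never stops being an approximator; exactness appears only because the finite problem's margin lets rounding absorb the residual error, which is exactly the point above.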
Approximation theory is simply not a topic of the discrete mathematics more often associated with chess.
In short, even if approximative, it could learn to propose good candidates for us to polish and build more theory about: finding patterns that we could then prove exactly from this more intuitive machine learning. At some point, inductive reasoning, if the input data comes without measurement error, will match deductive reasoning, having run out of wrong hypotheses to posit.