
GM Andrew Tang vs Leela Chess Zero

Why does Leela play much better on a GPU than a CPU? Can Stockfish take advantage of GPUs as well?
Sigh.

Wikipedia, anyone?

A couple of search terms relevant to "GPU" vs. "CPU": "RISC" and "CISC" (Reduced and Complex Instruction Set Computing) and "FP" (Floating Point).

So here, let me ask you a critical question about handling fractions (i.e., "centipawn" values): what's the difference between doing a floating-point calculation of numerator over denominator, versus holding the numerator and denominator separately in memory (given the vast amounts of RAM at our disposal, whether on the GPU or on the CPU's motherboard)?

Hint/Answer: Nothing.

GPUs were originally designed without FP instructions (they're generally a RISC/ASIC hybrid). That aside, modern versions of CUDA have FP instructions built in at certain pre-compiler and ASIC points. Regardless, all you really need is to do the INTEGER calculations in memory, keeping track of the numerator and denominator of each fraction. After running a computational cycle (across however many GPU cores are available, at a given bit depth, not that bit depth matters too much), you have whatever computational cycles are available (i.e., your GHz, be it GPU or CPU) handle the numerator and the denominator, then bring them together for a final output, such as a "centipawn" (a hundredth of a pawn). That last step can be done in software, in memory, possibly faster than doing it on-chip at a specified CPU floating-point bit depth. (That is, off-load the computation to a series of WORDs computed in memory via software, using only INT routines, instead of trying to do everything floating-point from the ground up and being limited by the bit depth of the available registers.)
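
To make that concrete, here's a minimal sketch in Python (the function and the piece values are mine, purely illustrative; the standard library's fractions.Fraction packages up the same idea): every score stays an exact integer pair until the final conversion to centipawns.

    from math import gcd

    def add_frac(a, b):
        """Add two fractions held as (numerator, denominator) pairs,
        using only integer arithmetic."""
        num = a[0] * b[1] + b[0] * a[1]
        den = a[1] * b[1]
        g = gcd(num, den)
        return (num // g, den // g)

    # e.g. a pawn is (1, 1); a knight valued at 3.05 pawns is (61, 20)
    score = add_frac((1, 1), (61, 20))        # -> (81, 20)
    print(score, 100 * score[0] // score[1])  # -> (81, 20) 405 centipawns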

Otherwise, in older CISC terms, you have registers and pathways capable of passing a certain number of bits. If you chain these together to handle more bits of (floating-point) precision than the registers provide, you incur a certain amount of overhead to ensure that everything you're computing stays properly chained together. That, versus just saying: "the numerator is in this register as this series of WORDs, and the denominator starts over there in that register as that series of WORDs," and handling both of them with the same INTEGER operations ... (blah, blah, blah).
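
Here's a toy picture of that chaining overhead (Python, simulating 64-bit registers by masking; the word size and names are my assumptions):

    MASK = (1 << 64) - 1  # one simulated 64-bit "register"

    def add128(a_hi, a_lo, b_hi, b_lo):
        """Add two 128-bit values held as pairs of 64-bit words. The
        carry out of the low word must be threaded into the high word:
        that's the bookkeeping that keeps the chain consistent."""
        lo = (a_lo + b_lo) & MASK
        carry = (a_lo + b_lo) >> 64           # 0 or 1
        hi = (a_hi + b_hi + carry) & MASK
        return hi, lo

    print(add128(0, MASK, 0, 1))  # (1, 0): the carry propagated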

Oh, yeah, and when you're done, give me a decimal (FP) approximation to XYZ bit-depth or degree of precision.
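
And that final step is nothing but integer long division, carried to whatever precision you ask for (again a sketch; the function name is mine):

    def to_decimal(num, den, digits):
        """Decimal approximation of num/den using only integer ops."""
        whole, rem = divmod(num, den)
        out = [str(whole), "."]
        for _ in range(digits):
            rem *= 10
            d, rem = divmod(rem, den)
            out.append(str(d))
        return "".join(out)

    print(to_decimal(81, 20, 4))  # -> 4.0500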

Sorry, I know, someone else already told me earlier this evening that I'm a bit of an asshole. So, not to be an asshole again, but all of this is very trivial and basic. (Well, not so "trivial" or "basic" the way I've explained it -- but here you have it in a nutshell.)

GPUs go faster because they work well with INTEGER numbers, and they pack in a greater number of cores because they don't need all of the gate-level overhead of CISC processors, with fancy-nifty instructions such as the various MMX-series or AES extensions, just to get data into the core and out again, etc.

Hey, what about that big rectangle sitting in front of your face ... you know, your monitor ... that's just X pixels by Y pixels, right? And shading the depth of pixels in the Z dimension is just computed relative to X and Y, right? These are all very basic questions and equations (learned in geometry, advanced algebra, and trigonometry) that operate over fractional parts (i.e., lighting up an RGB LED a certain way).
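
Even that fractional shading is routinely done in integers, e.g. classic 8.8 fixed point (a sketch; the format choice and names are mine):

    def lerp_channel(a, b, t_fixed):
        """Blend two 8-bit color channels, with the blend factor t in
        8.8 fixed point (0..256 standing for 0.0..1.0). Integer-only."""
        return (a * (256 - t_fixed) + b * t_fixed) >> 8

    print(lerp_channel(0, 255, 128))  # halfway between: 127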

What number can't be turned into a two-part numerator/denominator representation? An irrational number. Can we approximate an irrational number? Sure! How often do we actually need more than an approximation of one? Always, the odd case out being "i" (the imaginary unit that shows up when taking roots of negatives). So ... let's just do things the "I" way ... that is ... Integers, and if we need "imaginary" units and numbers, we'll add those as an overhead matter after the fact.
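
Approximating an irrational with integers is the same trick again: scale up, take an integer root, and remember the scale (a sketch; math.isqrt is standard-library Python):

    from math import isqrt

    def sqrt_scaled(n, digits):
        """Integer approximation of sqrt(n), scaled by 10**digits."""
        scale = 10 ** digits
        return isqrt(n * scale * scale)

    print(sqrt_scaled(2, 6))  # 1414213, i.e. sqrt(2) ~ 1.414213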

Happy now? We've dumbed math down to ... well ... now you understand why all that basic math you hated in grade school is important. (Can't get your gaming or chess fix without maths.)
@FireWorks Mostly because it uses a different approach to select moves. In very simplistic terms: instead of the variant of alpha-beta pruning that Stockfish uses, it uses a neural network with Monte Carlo tree search. And a neural network performs significantly better on a GPU (10x-50x faster).
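
The heart of that move selection is the PUCT rule from the AlphaZero paper: the network supplies a prior P for each move, the search keeps visit counts N and total value W, and at each step it descends into the child maximizing Q + U. A toy sketch (the node layout and the constant's value are my assumptions):

    from math import sqrt

    C_PUCT = 1.5  # exploration constant (value chosen for illustration)

    def select_child(children):
        """children: list of dicts with prior P, visit count N, total
        value W. Returns the child maximizing Q + U (the PUCT rule)."""
        total_n = sum(c["N"] for c in children)
        def score(c):
            q = c["W"] / c["N"] if c["N"] else 0.0
            u = C_PUCT * c["P"] * sqrt(total_n) / (1 + c["N"])
            return q + u
        return max(children, key=score)

    kids = [{"P": 0.6, "N": 10, "W": 4.0}, {"P": 0.4, "N": 2, "W": 1.5}]
    print(select_child(kids))  # the less-visited child wins out here

The GPU earns its keep elsewhere: evaluating the network's priors and values is big batched matrix arithmetic, which is exactly what GPUs parallelize well.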

Chess engines on GPUs are possible (chessprogramming.wikispaces.com/GPU), but so far the attempts to build a GPU engine have not been successful (before AlphaZero). Leela looks very promising, but so far its rating of ~2700 is really, really weak in comparison to Stockfish at 3444. It would be super lucky to win 1 game out of 1000 with these numbers.
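
For scale, the standard Elo expected-score formula backs that estimate up (a quick sanity check, not a rigorous model of engine matches):

    def expected_score(r_a, r_b):
        """Elo expected score for player A against player B."""
        return 1 / (1 + 10 ** ((r_b - r_a) / 400))

    print(expected_score(2700, 3444))  # ~0.014, about 1.4 points per 100 games
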
Well, I barely know what a computer is, so all this is very new to me. I'd love to have Leela compete with Stockfish one day though, but I guess that's a couple of years away.
@FireWorks -- The whole "Leela" BOT thing descends directly from Leela being based on Google's AlphaZero (cough-BS-cough) competing with Stockfish.

Note: the "cough-BS-cough" is directly because of Google's self-serving paper, in which it pitted its "AlphaZero" system (or engine) against Stockfish and was specifically unclear about computational equivalency.

Let me restate that and recall some history: Google designed "AlphaZero" to be trained in 4 hours on a number of chess games, and then put it head-to-head with a standard (if not specifically lesser-rated) Stockfish instance. BUT Google's paper authors explicitly and very specifically left out computational equivalence; they used measures from their own theory of what defines computational power. (Even though Google had far more computational power it could have given to a standard Stockfish instance than it provided for the purposes of its self-promotional paper.)

To my knowledge, Leela is an attempt to replicate what Google did, using the same kind of resources Google used to train its "AlphaZero" instance to play against Stockfish (i.e., neural-network training to learn and advance).

In simpler terms: Google published a self-promotional paper without providing or citing computational equivalence, and this has led to the whole "Leela" phenomenon. (In a nutshell: Google may be right or wrong, but it's one big ploy unless or until computational scientists come down on one side or the other to say: "Winner-winner, chicken-dinner.")
@nikolajtesla

It sounds like we agree more than our initial comments might have let on. Yes, there are problems, and one of them is that DeepMind doesn't often (ever?) release implementations of its most famous algorithms. And yes, in my initial post I did misspeak: I meant that in 9 hours of training it outperformed Stockfish after 4 hours (this is the result in the paper you cited), using very high-end computational resources, but 4 hours nonetheless using those resources. I also agree that the main point is AlphaZero (as discussed in the Nature paper), not its applications to chess.

I am a little less cynical than you are about being able to reproduce results. Results must be reproduced, but at the forefront of any STEM field that necessarily requires significant devotion of time and resources, and can almost never be done by amateurs. One job of the AlphaZero referees at Nature (no peer review on the chess paper yet?) should be to ensure that the authors are specific enough that a forefront group could reproduce the results with sufficient time and resources. Perhaps we disagree on whether this was done.

The one addition I would make, in response to one of your other posts, is not to chalk this up to just neural nets and Monte Carlo: it is deep reinforcement learning that is regarded as the backbone of AlphaZero, with the neural nets and Monte Carlo playing a central role within that deep RL backbone.
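
For anyone who wants the deep-RL piece in one line: the AlphaZero paper trains the network to minimize (z - v)^2 - pi * log(p) plus L2 regularization, where z is the self-play game outcome, v the value head's prediction, pi the MCTS visit distribution, and p the policy head's output. A numpy sketch on dummy data (the shapes and numbers are mine):

    import numpy as np

    def alphazero_loss(z, v, pi, p, theta, c=1e-4):
        """(z - v)^2 - pi . log(p) + c * ||theta||^2 for one position."""
        value_loss = (z - v) ** 2
        policy_loss = -np.sum(pi * np.log(p + 1e-12))  # cross-entropy
        reg = c * np.sum(theta ** 2)                   # L2 on the weights
        return value_loss + policy_loss + reg

    pi = np.array([0.7, 0.2, 0.1])  # MCTS visit distribution (dummy)
    p = np.array([0.5, 0.3, 0.2])   # network policy output (dummy)
    print(alphazero_loss(z=1.0, v=0.6, pi=pi, p=p, theta=np.zeros(4)))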

Regardless of our quibbling about hours and computational comparisons, both of our posts make a key point: there is a literature on this, and readers should decide for themselves.

To those who are interested in what makes AlphaZero and the associated chess results possible: you should read up specifically on reinforcement learning, which plays a more central role than CPU vs. GPU vs. TPU or neural nets, though those are also important. See, e.g., David Silver's excellent RL lectures: http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html.
[Event "Casual Blitz game"]
[Site "lichess.org/Nir3SuQN"]
[Date "2018.04.20"]
[Round "-"]
[White "Namakando"]
[Black "LeelaChess"]
[Result "1-0"]
[UTCDate "2018.04.20"]
[UTCTime "06:31:34"]
[WhiteElo "2007"]
[BlackElo "1785"]
[BlackTitle "BOT"]
[Variant "Standard"]
[TimeControl "300+0"]
[ECO "C11"]
[Opening "French Defense: Burn Variation"]
[Termination "Normal"]
[Annotator "lichess.org"]

1. e4 e6 2. d4 d5 3. Nc3 Nf6 4. Bg5 { C11 French Defense: Burn Variation } dxe4 5. Nxe4 Nbd7 6. Nf3 Be7 7. Nxf6+ Nxf6 8. Bd3 b6 9. O-O Bb7 10. c3 O-O 11. Qe2 c5 12. dxc5 Bxc5 13. Rad1 Qc7
14. Ne5 h6?? { (0.43 → 6.33) Blunder. Best move was Rfd8. } (14... Rfd8 15. Kh1 h6 16. Bh4 Rac8 17. Rfe1 Bd6 18. f4 a5 19. a3 Bc5 20. f5 exf5 21. Bxf5)
15. Bxf6 gxf6?? { (5.89 → 11.72) Blunder. Best move was Be7. } (15... Be7)
16. Qg4+ Kh8 17. Qf4 Kg8 18. Qg3+ Kh8 19. Ng6+ fxg6 20. Qxc7 Rae8 21. Qxb7 Re7 22. Qe4 f5 23. Qe5+ Kh7 24. b4 Bxf2+ 25. Rxf2 Rd8
26. Rfd2 Rd5?! { (22.27 → Mate in 10) Checkmate is now unavoidable. Best move was h5. } (26... h5 27. Qf6)
27. Qf6 Red7 28. Qxe6 R5d6
29. Qe2?! { (Mate in 6 → 70.85) Lost forced checkmate sequence. Best move was Bxf5. } (29. Bxf5 Rxe6 30. Rxd7+ Kg8 31. Bxe6+ Kf8 32. Rxa7 Ke8 33. Rdd7 Kf8 34. Ra8#)
29... Kg7?! { (70.85 → Mate in 7) Checkmate is now unavoidable. Best move was h5. } (29... h5 30. Bxf5)
30. Bc4 Rxd2 31. Rxd2 Rxd2
32. Qxd2?! { (Mate in 3 → Mate in 5) Not the best checkmate sequence. Best move was Qe7+. } (32. Qe7+ Kh8 33. Qf8+ Kh7 34. Qg8#)
32... b5
33. Bxb5?! { (Mate in 5 → Mate in 6) Not the best checkmate sequence. Best move was Qd7+. } (33. Qd7+ Kf6 34. Qd8+ Kg7 35. Qe7+ Kh8 36. Qf8+ Kh7 37. Qg8#)
33... Kf6
34. h4?! { (Mate in 5 → Mate in 5) Not the best checkmate sequence. Best move was Qd8+. } (34. Qd8+ Ke5 35. Qe7+ Kd5 36. Qf6 h5 37. Bc6+ Kc4 38. Qd4#)
34... g5
35. h5?! { (Mate in 5 → Mate in 5) Not the best checkmate sequence. Best move was Qd6+. } (35. Qd6+ Kg7 36. Be8 gxh4 37. Qg6+ Kh8 38. Bf7 h5 39. Qg8#)
35... f4
36. Bd3?! { (Mate in 5 → Mate in 8) Not the best checkmate sequence. Best move was Qd6+. } (36. Qd6+ Kf7 37. Bc4+ Ke8 38. Be6 a6 39. Qd7+ Kf8 40. Qf7#)
36... Ke5
37. Qe2+?! { (Mate in 5 → Mate in 6) Not the best checkmate sequence. Best move was Bc4. } (37. Bc4 g4 38. Qd5+ Kf6 39. Qe6+ Kg7 40. Qg6+ Kf8 41. Qf7#)
37... Kf6
38. Qe4?! { (Mate in 4 → Mate in 5) Not the best checkmate sequence. Best move was Bc4. } (38. Bc4 f3 39. Qe6+ Kg7 40. Qg6+ Kf8 41. Qf7#)
38... g4 39. Bc4 Kg5
40. Qe5+?! { (Mate in 2 → Mate in 2) Not the best checkmate sequence. Best move was Qe7+. } (40. Qe7+ Kf5 41. Bd3#)
40... Kh4
41. Qxf4?! { (Mate in 2 → Mate in 2) Not the best checkmate sequence. Best move was Qe7+. } (41. Qe7+ Kxh5 42. Bf7#)
41... Kxh5
42. Bf7+?! { (Mate in 2 → Mate in 2) Not the best checkmate sequence. Best move was Qf6. } (42. Qf6 g3 43. Be2#)
42... Kh4
43. g3+?! { (Mate in 2 → Mate in 2) Not the best checkmate sequence. Best move was Qxh6+. } (43. Qxh6+ Kg3 44. Qh2#)
43... Kh3 44. Bc4 h5 45. Bf1# { White wins by checkmate. } 1-0
