Was Alphazero beating Stockfish BS?

Both Stockfish and Leela Chess Zero are now nearly 100 Elo stronger than AlphaZero.

AlphaZero was a 20-block network. Leela Chess Zero uses a 30-block network. The best 30-block networks are 100 Elo stronger than the best 20-block network.

Stockfish uses NNUE eval now. It is a neural network that runs on the CPU. The latest Stockfish would be 200 Elo stronger than the Stockfish which played AlphaZero. 300 Elo if you take SF8.

A good article (5 months old):

"Here are the results for Stockfish NNUE with the network by Sergio compared to AlphaZero.

We know that AlphaZero is +52 Elo to Stockfish 8 according to the papers released by Deepmind. +155-6=839 (SF8) +52 Elo.

We know Stockfish 11 is +166 Elo to Stockfish 8, with Stockfish 12 Dev being 30 Elo stronger at +196 Elo.

And based off of a recent test from myself between Stockfish NNUE (Sergio 2138) vs. Stockfish 12 Dev, I found that SF NNUE is +67 Elo to SF12Dev.

And so, with that, we get these conclusions

  • Stockfish NNUE is +211 Elo to AlphaZero.

  • Leela Chess Zero ID 64341 is +151 Elo to AlphaZero

  • Stockfish 12 Dev is +144 Elo to AlphaZero"

https://www.reddit.com/r/chess/comments/i0gipx/stockfish_nnue_is_211_elo_to_alphazero/

So even Stockfish 11 was stronger than AlphaZero.
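
The arithmetic behind the quoted figures can be checked with a short sketch. It assumes the standard logistic Elo model and simply adds and subtracts the differences quoted in the Reddit post; the function and variable names are made up for illustration, and the chaining itself is the step the rest of the thread argues about.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_from_wdl(wins, draws, losses):
    """Elo difference implied by a match result (a draw counts as half a point)."""
    games = wins + draws + losses
    return elo_from_score((wins + 0.5 * draws) / games)

# AlphaZero vs. Stockfish 8: +155 -6 =839, as quoted from the DeepMind paper.
az_vs_sf8 = elo_from_wdl(155, 839, 6)  # about +52

# Differences quoted in the Reddit post.
sf11_vs_sf8 = 166
sf12dev_vs_sf8 = 196
nnue_vs_sf12dev = 67

# Chained estimates relative to AlphaZero (assumes Elo differences add up).
az = round(az_vs_sf8)
chained = {
    "Stockfish 11": sf11_vs_sf8 - az,
    "Stockfish 12 Dev": sf12dev_vs_sf8 - az,
    "Stockfish NNUE": sf12dev_vs_sf8 + nnue_vs_sf12dev - az,
}

for engine, diff in chained.items():
    print(f"{engine}: {diff:+d} Elo vs. AlphaZero")
```

This reproduces the +144 and +211 from the post and puts Stockfish 11 at +114, but only under the assumption that differences measured against Stockfish 8 carry over unchanged to a different opponent.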

The comparison has one weak point: comparing an engine to its earlier version tends to overestimate the gain because of the similar playing style. This is unlikely to be true for NNUE, since it has a completely new evaluation function, but comparing one SF to another will have this problem. But the conclusion can be that SF12 is stronger than A0 and the difference is definitely less than 200 points.

I am better than Alpha Zero. Thank you for asking.

@Megadoggah

"This including running stock fish on a single core"

Where are you getting this from? As far as I know Stockfish 8 was running on 44 cores.

Comparing Stockfish 12 to AlphaZero is, in many ways, like comparing Carlsen to Fischer: both AlphaZero and Fischer just stopped playing. In this sense comparing them is sorta silly.

That said: it supposedly took AlphaZero 4 hours to learn the game from scratch and become a stronger engine than SF8. If that's the result of 4 hours, what would be the result of 2 years?

Claiming SF12 is the stronger engine seems really speculative at best to me.

From the comments to the reddit article:

"I don't have trouble believing that a NN-enhanced Stockfish has the ability to surpass an outdated NN engine, but I think it's a bit misleading that we're considering the differential in ELO solely based on how much better SF is now from older SF, and then assuming the entirety of that translates to its strength over, say, AlphaZero. You'd need games between SF 12 and AlphaZero before concluding the full +211, no?"

This kind of hits the weakness of this whole line of reasoning in the balls. You can't determine how much A0 and SF12 differ in rating solely by considering how SF12 performs against SF8. If these were human players, we'd all agree that determining a rating in this way is complete and utter nonsense.

Determining a rating differential by having an engine play a single other engine is already very thin ice. If player x beats player y, and player y beats player z, you cannot conclude from that that player x would beat player z. Differences in playing style may well yield a rock-paper-scissors result instead.
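
A toy sketch of that rock-paper-scissors point, assuming the logistic Elo model. The three head-to-head scores are invented for illustration and do not come from any real engine match.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def score_from_elo(diff):
    """Expected score for the side that is `diff` Elo stronger."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# Hypothetical head-to-head results (invented numbers, not measured data).
x_vs_y = 0.60         # x scores 60% against y
y_vs_z = 0.60         # y scores 60% against z
x_vs_z_actual = 0.40  # but z's style happens to beat x

# Chaining the two measured differences predicts x clearly ahead of z.
chained_elo = elo_from_score(x_vs_y) + elo_from_score(y_vs_z)
predicted = score_from_elo(chained_elo)

print(f"chained estimate: {chained_elo:+.0f} Elo, predicted score {predicted:.2f}")
print(f"hypothetical actual score: {x_vs_z_actual:.2f}")
```

With these made-up numbers the chained estimate has x scoring well above 50% against z while the direct result goes the other way, which is exactly the case a chain through a common opponent cannot detect.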

Fact of the matter is: SF12 has never played AlphaZero or MuZero. So saying anything about how that would end is unreasonably speculative at best.

"That said: it supposedly took AlphaZero 4 hours to learn the game from scratch and become a stronger engine than SF8. If that's the result of 4 hours, what would be the result of 2 years?"

With the NN structure in use, it is safe to assume not much stronger. The self-play Elo had clearly flattened after 4 hours. More training would have benefited it some, but not much. Then again, making a bigger NN and training it... maybe.

@petri999

I'm entirely unsure what you are basing this on. If A0 had trained against various other strong engines over the past 2 years, it might have improved a LOT.

I'm currently having SF12 analyse one of SF8's losses (using 12 cores). But if you ask me, it really doesn't have a clue why SF8 lost that one.

https://lichess.org/TyeVmDkk

It is easy enough to see from the picture showing performance over time that further training was very unlikely to gain more strength. In Go there might have been some, but I don't think much.

https://i.imgur.com/7FgJfLE.jpg

From the peer-reviewed version of the paper:
https://science.sciencemag.org/content/362/6419/1140

Oh, and A0 did not train against other engines but only against itself. Had it trained against other engines, it would not have been nearly as strong. The whole point of A0 was to boot from scratch and not be hindered by assumed good moves.

This topic has been archived and can no longer be replied to.