ELO rating • page 3/6 • General Chess Discussion • lichess.org

tpr

#21

#20
"it takes a lot of cumulated inaccuracies"

No. It takes one mistake.

petri999

edited

#22

Actually, the gentleman was talking about games between top engines. As of now they hardly ever make a single mistake that leads to a loss. Obviously there is always one decisive mistake, but it would not happen unless the situation were already bad enough. This is evident from computer vs. computer matches, which end up as draws. As a sample, take the ICCF correspondence chess championship (which is "my engine against yours") https://www.iccf.com/event?id=100104 where the only wins and losses are due to one participant dying.

So no losses from mistakes are happening.

tpr

#23

#22
ICCF averages 5 days per move and is human grandmaster + engines. Your example is the WC Finals.
There are still decisive games in ICCF WC preliminaries, candidates, and semifinals.
That is with the same engines, but lesser humans.
https://www.iccf.com/event?id=68304
That is still 5 days average per move.
For engine vs. engine and at say 3 minutes/move there is ample opportunity for a superior engine to defeat lesser engines.
New versions get better each year and defeat earlier versions.

Naphthalin

#24

@tpr said in #23:

> For engine vs. engine and at say 3 minutes/move there is ample opportunity for a superior engine to defeat lesser engines.

3min/move on a modern 8 core CPU is plenty of time to bring you into the realm of "current SF will draw most of the games against any engine from startpos", be it current Stockfish with 5 days per move or an engine from 10 or 20 years in the future playing on the hardware of the future. Taking the TCEC rating list https://tcec-chess.com/bayeselo.txt you have to go down by >200 Elo from the top to find an opponent engine which would lose to Stockfish from a balanced position at 3min/move at least from time to time. And on top of that, the matchup isn't symmetric: the probability of winning (from startpos) against an engine rated 300 Elo lower (in a UHO rating list) is much, much higher than the probability of losing against an engine rated 300 Elo higher, for exactly the reason I already explained.

> New versions get better each year and defeat earlier versions.

They defeat earlier versions *when using intentionally biased books*. The Stockfish regression tests (which you probably are referring to) stopped testing on the older (balanced) books with the release of SF16 in June 2023 because non-draws were becoming so infrequent that it became a waste of resources.
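
For anyone who wants to put numbers on the 200 and 300 Elo gaps mentioned above, here is a minimal sketch using the standard logistic Elo expectation. Note the assumptions: the TCEC list is computed with BayesElo, which models draws differently, and the function name `expected_score` is just made up for this illustration.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for the side that is
    rated elo_diff points higher, using the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    for diff in (0, 200, 300):
        # An expected score says nothing about how it splits into wins and draws.
        print(f"{diff:>3} Elo gap -> expected score {expected_score(diff):.3f}")
```

The expected score alone does not say whether it is made up of wins or of draws, which is exactly why the same rating gap can look very different from startpos than from biased (UHO) openings.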

tpr

#25

#24
As ICCF shows: grandmaster + engine beats grandmaster + engine at 5 days average/move in qualifiers, semifinals and finals.
Openings are not imposed, but directed by the human grandmaster.
So a future engine at say 3 minutes/move that matches the present ICCF grandmaster + engine will always beat a present engine at 3 minutes/move.

Naphthalin

#26

@tpr said in #25:

> So a future engine at say 3 minutes/move that matches the present ICCF grandmaster + engine will always beat a present engine at 3 minutes/move.
Can you specify what exactly you mean here with "will always beat"? Does +2-0=998 from 1000 games already qualify?
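
To show how small such a result is in rating terms, here is a rough sketch that inverts the standard logistic Elo formula for a match score. It assumes the plain logistic model (not BayesElo or any draw-aware model), and `elo_diff_from_result` is a hypothetical helper name.

```python
import math


def elo_diff_from_result(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match result under the logistic Elo model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    if not 0.0 < score < 1.0:
        # A 100% or 0% score corresponds to an unbounded rating difference.
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    diff = elo_diff_from_result(wins=2, losses=0, draws=998)
    print(f"+2-0=998 corresponds to about {diff:.2f} Elo")
```

Under this model +2-0=998 is a 50.1% score, which comes out at less than 1 Elo point of difference.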

tpr

#27

#26
https://www.iccf.com/event?id=68304
+11=125 from 136 games.

Naphthalin

#28

Not sure if this is the wrong link or whether you intentionally sent results from games played between 2017 and 2020, as they are completely irrelevant to the question of how engines on the level of current SF would do against much stronger engines, be it through more time or other improvements.

The first SF NNUE version is from August 2020, and there has been significant progress made after that as well. SF Classical (stronger than the engines the ICCF players used in your tournament) gets trounced like this by modern engines: https://tcec-chess.com/#div=shcg&game=1&season=28

petri999

#29

The law of diminishing returns means that in computer search the difference between 3 minutes and 5 days is not as big as you think. Also, human assistance is pretty much useless now that neural nets have given chess engines an "understanding of chess" way beyond the old handcrafted evaluation functions. See for example https://lichess.org/@/LazyBot which uses the Leelachess policy head to pick the "first move that pops into mind"; that is enough for 2600 blitz. No human could just glance at the board and play like that. Obviously there is some implicit lookahead in the neural net, but that remains just an opinion, as NNs are black boxes.
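
As a rough illustration of the time gap being discussed, here is a small sketch that expresses 3 minutes vs. 5 days per move as a number of doublings of thinking time, which is how search gains are usually counted. It deliberately assumes nothing about how much strength each doubling adds (no such figure is given in this thread), and it treats the 5 days as pure engine time, which is a simplification.

```python
import math

SHORT_SECONDS = 3 * 60            # 3 minutes per move
LONG_SECONDS = 5 * 24 * 60 * 60   # 5 days per move (ICCF average, taken as engine time)

ratio = LONG_SECONDS / SHORT_SECONDS   # how many times more time per move
doublings = math.log2(ratio)           # the same gap counted in doublings

print(f"time ratio: {ratio:.0f}x, i.e. about {doublings:.1f} doublings")
```

So a 2400-fold increase in time is only about 11 doublings, and with diminishing returns each later doubling adds less than the one before it.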

RuchirNSB

#30

A user called TheFledgling has written a great blog about this a few minutes ago (This is not an advertisement).