@petri999
Like I said: calculating a rating from play against a single opponent is dubious at best. These graphs would have been a lot more interesting if AlphaZero actually competed in a broader competition with many different opponents. Unfortunately, that never happened.
There is no way of telling how much A0 would have benefited from training against different opponents. A flat curve against SF8 most certainly will tell you nothing about that.
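To make the single-opponent point concrete, here is a minimal sketch (Python, with hypothetical numbers, not figures from the match) of the standard Elo performance-rating formula. The inferred rating shifts one-for-one with whatever rating you assume for the lone opponent, which is exactly why a rating derived from play against SF8 alone is dubious:

```python
import math

def performance_rating(opp_rating: float, score: float, games: int) -> float:
    """Elo performance rating from a score against a single opponent.
    Uses the logit form: R_perf = R_opp - 400 * log10(1/p - 1)."""
    p = score / games
    return opp_rating - 400 * math.log10(1 / p - 1)

# Hypothetical 64/100 score. The answer depends entirely on the
# rating we assume for the single opponent:
print(round(performance_rating(3400, 64, 100)))  # 3500
print(round(performance_rating(3300, 64, 100)))  # 3400
```

The same match result yields a rating 100 points lower if the opponent is assumed 100 points weaker; with only one opponent there is no way to cross-check that assumption.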
@petri999
"Oh, and A0 did not train against other engines, only against itself. Had it trained against other engines, it would not have been nearly as strong."
Again, to me it's entirely unclear what you are basing this on. I'd say that training against a wide variety of opponents and playing styles would only benefit the learning process, not hinder it.
PS: If the results in diagram A are exclusively the result of self-play, they really don't make any sense. You can't measure performance or rating from self-play alone. Ratings are fundamentally relative to other players; without other players you're not measuring anything. That's like being in a windowless spaceship moving at constant speed and asking yourself 'how fast am I going?'. That question is fundamentally impossible to answer.
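The 'ratings are relative' point is visible directly in the Elo expected-score formula: it only ever consumes a rating *difference*, so a closed pool of self-play games can pin down differences between checkpoints but leaves the absolute level completely free. A sketch (my own illustration, nothing from the paper):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Elo expected score for player A against player B.
    Depends only on the rating difference r_b - r_a."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

# Shifting every rating in a closed pool by the same constant changes
# no prediction, so self-play results alone fix no absolute scale:
print(expected_score(2800, 2600) == expected_score(3800, 3600))  # True
```

Any external anchor (games against an outside, already-rated player) is what nails the pool to an absolute scale.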
Well, you can hypothesize that playing against different opponents could have helped. I do think so, and it certainly was never tried. A0 was not trained against SF8, only tested against it, so I cannot say for certain that you are wrong. But had it been trained against SF8, it would have formed a very narrow view of chess, i.e. it would have specialized in playing against SF8. Similarly, had there been several engines, it would have specialized in beating those engines.
Another big reason for self-play is that it is computationally feasible. Each training game must be short and consume only a small amount of computing resources. With self-play it is possible to run the whole pipeline on a TPU/TPU2 cluster, unlike when playing against external engines.
And I do base my opinion on having read all the published papers, going back to AlphaGo Lee. That is a good example: a Go program trained on professional players' games could beat the best humans quite well, but was a patzer compared to the fully self-trained versions. At the time there was no program even near the strength of the best humans, so it is analogous to training on games from the best traditional chess engines.
Can someone give me accepted facts please in this thread? I am interested.