lichess.org

Why Opening Statistics Are Hard

Nice work algebraically clarifying the sources of bias.

> My win rate against 1...d5 depends on how well I play against 1...Nf6.

A nice insight, and I think it's key to the difficulty of estimation: M might be played because the player is generally booked up, but it could also reveal knowledge of this particular opening. That effect is very hard to estimate.

Even though the thesis is "it's hard", your writing out the formulae helps the rest of us clarify our thoughts on the topic, so thanks for the post!

Thanks for the article; the statistics on the estimates and probabilities worked out are very interesting. Congratulations!

Another thought: players could have different ratings per opening (for example, by ECO code); then there would be no need for an adjusted estimator.
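Not something from the post itself, but a minimal sketch of what this per-opening idea could look like: a standard Elo update, keyed by (player, ECO code) instead of by player alone. The class name, base rating, and K-factor are all illustrative assumptions.

```python
# Hypothetical sketch: maintain a separate Elo rating per (player, ECO family).
# Only the keying by ECO code comes from the comment above; the update rule
# is plain Elo, and K and the base rating are made-up illustrative values.
from collections import defaultdict

K = 20  # illustrative K-factor


def expected_score(r_a, r_b):
    """Standard Elo expected score for player A against player B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))


class PerOpeningRatings:
    def __init__(self, base=1500):
        # (player, eco) -> rating, initialized lazily at the base rating
        self.ratings = defaultdict(lambda: base)

    def update(self, player_a, player_b, eco, score_a):
        """score_a is 1.0 / 0.5 / 0.0 from player A's perspective."""
        ra = self.ratings[(player_a, eco)]
        rb = self.ratings[(player_b, eco)]
        ea = expected_score(ra, rb)
        self.ratings[(player_a, eco)] = ra + K * (score_a - ea)
        self.ratings[(player_b, eco)] = rb + K * ((1 - score_a) - (1 - ea))


pool = PerOpeningRatings()
pool.update("alice", "bob", "B90", 1.0)  # Alice wins a game in the Najdorf
print(round(pool.ratings[("alice", "B90")], 1))  # 1510.0
```

Alice's Najdorf rating moves while her ratings in every other ECO code stay untouched, which is exactly why no opening-specific correction would be needed.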

I am a 24-year-old mathematics professor and I still didn't understand the math.

Looks like a standard math Olympiad question

Absolutely love this work. Lines up perfectly with my analytical investigations. Would love to see you publish a book with this, maybe analyses of the top 100 statistically significant tabiyas or something to that effect.

Yes, it’s really hard to develop a precise formula. For one, you need A LOT of data to support it.

@Donderasie said in #8:

> Yes, it’s really hard to develop a precise formula. For one, you need A LOT of data to support it.

Yeah, we all know about the branching factor, but it's still startling how quickly your sample shrinks when you want to compare positions even just a few moves in.
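A back-of-envelope check of that shrinkage, with entirely made-up numbers: suppose a 10-million-game database and that, at each ply, play splits roughly evenly among 3 popular moves.

```python
# Illustrative only: how fast a sample shrinks with depth if play splits
# evenly among a few popular moves at every ply.
games = 10_000_000
popular_moves_per_ply = 3

for ply in range(0, 13, 4):
    per_line = games / popular_moves_per_ply ** ply
    print(f"ply {ply:2d}: ~{per_line:,.0f} games in a given line")
```

Under these assumptions a given line has only ~1,500 games by ply 8 (move 4) and roughly 19 by ply 12, which is why position-vs-position comparisons run out of data so fast.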

Why use linear regression to model an outcome on {0, 0.5, 1} (or [0, 1], if we decide to encode it as a regression)? Logistic regression seems much better suited.

To be frank, a few years ago I concluded that using the rating delta of the move was the best approach, because it corrects for stronger players choosing a bad line to put a lower-rated opponent out of book. (I also concluded that using the win rate of the move was particularly bad.)

The bias towards zero doesn't bother me at all, because the estimator still probably gives correct relative ranks between the moves (the decision here is which move to choose, not to quantify something very abstract). The bias also doesn't bother me because there is a lot of uncertainty anyway from opening theory evolving over time. Say a line discovered in 2021 completely changes the evaluation of an important variation in a particular opening: then all your estimates based on data older than 2021 will be biased. Since no statistician is going to record every evolution of opening theory in every line and correct estimators accordingly (that is far too much work to be feasible), we are left analyzing inherently noisy data. If the data is known to contain some wrong observations, and we can't filter out the wrong ones (as is the case here), then it doesn't make sense to dismiss estimators just because they have some limitations.

My recommendation is to look at the rating gain of a move in a position across different skill brackets (<2000, 2000-2300, 2300-2500, 2500+, for example). The rating gain lets you see which move is better in each bracket. If the ranking changes between brackets, some moves are better or worse against strong opponents, which indicates whether a move scores well because it is inherently good or because it is difficult to meet at some levels. (And do not trust a rating gain if you have too few games with that move. Ideally you want to filter for recent games, keeping just enough history for a decent sample size on each move.)
This process gives you a lot of information for choosing the moves you want to play, and it is not too hard to follow. Even if on paper there is bias, that bias is much smaller anyway than the uncertainty coming from the data-quality issues.
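A minimal sketch of the bracketed comparison described above, under one interpretation: "rating gain" taken as (actual score minus Elo-expected score), averaged over the games where the move was played. The bracket edges match the comment; the sample games, move name, and function names are all hypothetical.

```python
# Hypothetical sketch: average (score - expected score) per move per skill
# bracket. All data below is invented for illustration.
import bisect

BRACKETS = [2000, 2300, 2500]  # edges for <2000, 2000-2300, 2300-2500, 2500+
LABELS = ["<2000", "2000-2300", "2300-2500", "2500+"]


def expected(r_player, r_opp):
    """Standard Elo expected score."""
    return 1 / (1 + 10 ** ((r_opp - r_player) / 400))


def rating_gain_by_bracket(games):
    """games: iterable of (move, player_rating, opp_rating, score).
    Returns {(move, bracket): mean score above Elo expectation}."""
    sums = {}
    for move, rp, ro, score in games:
        label = LABELS[bisect.bisect_right(BRACKETS, rp)]
        total, n = sums.get((move, label), (0.0, 0))
        sums[(move, label)] = (total + (score - expected(rp, ro)), n + 1)
    return {key: total / n for key, (total, n) in sums.items()}


# Tiny invented sample: the move over-performs in the sub-2000 games here
# and under-performs in the 2500+ game.
sample = [
    ("Qd8", 1900, 1900, 1.0),
    ("Qd8", 1950, 1900, 1.0),
    ("Qd8", 2600, 2600, 0.0),
]
gains = rating_gain_by_bracket(sample)
```

A ranking that flips between brackets, as in this toy sample, is exactly the signal the comment describes: the move's score comes from being hard to meet at some levels rather than from being inherently good.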
