Calculating the Sharpness of Different Players

How well does the sharpness score agree with human intuition about sharpness?

I have recently written about a way to use Leela Chess Zero to get a sharpness score for a chess position and used it to compare the sharpness of different openings. The results mostly agreed with my understanding of the sharpness of the openings.
Now I want to use this score to see how sharply a player plays in a single game or on average.

How to calculate the sharpness of a player?

The most obvious idea is to take the average of the sharpness scores of the positions during the game. However, the problem with that approach is that the sharpness of a position depends on both players.
To illustrate this, imagine you want to evaluate the sharpness of a player playing with the black pieces. If their opponent decides to play the King's Gambit, there will almost certainly be a very sharp position on the board, regardless of what kind of moves Black is playing.
In order to avoid that problem, I decided to look at the change of sharpness after each move.
This is simply the sharpness after a move has been played minus the sharpness before a move has been played. For example, the starting position has a sharpness of 0.468. After 1.e4 the sharpness is 0.471, so White would have a sharpness change of 0.003 for 1.e4. The sharpness after 1.d4 is 0.450, so White would have a sharpness change of -0.018.
In the end, I take the average of all the changes in sharpness after every move a player has played in order to get their sharpness change score.
Looking at that change in sharpness as opposed to the sharpness of the position itself should make it possible to isolate the play of an individual player.
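As a rough sketch of the calculation described above (not the code used for this post), the per-move changes can be computed from a list of sharpness values, one per position. The function names and the input format are assumptions made for illustration, and the game is assumed to start with White to move.

```python
from statistics import mean


def sharpness_changes(sharpness_by_ply):
    """Split the per-move sharpness changes by player.

    sharpness_by_ply[0] is the sharpness of the starting position and
    sharpness_by_ply[i] is the sharpness after the i-th half-move (ply).
    Assumption: White moves first, so White plays plies 1, 3, 5, ...
    """
    white, black = [], []
    for ply in range(1, len(sharpness_by_ply)):
        # Sharpness after the move minus sharpness before the move.
        change = sharpness_by_ply[ply] - sharpness_by_ply[ply - 1]
        (white if ply % 2 == 1 else black).append(change)
    return white, black


def sharpness_change_score(changes):
    """Average change over all moves of one player (0.0 if no moves)."""
    return mean(changes) if changes else 0.0


# Values from the text: start 0.468, after 1.e4 0.471, after 1.d4 0.450.
e4_white, _ = sharpness_changes([0.468, 0.471])
d4_white, _ = sharpness_changes([0.468, 0.450])
print(round(e4_white[0], 3), round(d4_white[0], 3))  # 0.003 -0.018
```

Because each change is credited only to the player who made the move, a sharp opening chosen by the opponent shows up in the opponent's list and not in yours.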

Sharpness for many players

The first thing I decided to test was looking at the average sharpness change per move of many players.
I did this by analysing the games of all Candidates tournaments since 2013 (not including the 2024 Candidates). This led to the following sharpness change scores for the players:
Figure: Average sharpness change per move for the players in the Candidates tournaments
Note that the players have played a different number of games and all games are only from the Candidates.
Some sharpness values seem a bit surprising, like Carlsen having the second highest sharpness or Firouzja having the lowest sharpness.
There can be many reasons for this, but the main one is probably the small sample size. Looking at every game of a player would give a much better picture of their sharpness, but it would also take much too long to analyse all the positions. Note that long draws also reduce the sharpness quite a bit, especially with a smaller sample size.
I also only looked at Candidates tournaments, so some players might have played more conservatively in them compared to their usual play. In a similar way, the tournament situation can also play a role when only looking at games from a few tournaments.
Overall I think that I would need to analyse many more games of the individual players to make the sharpness change scores more comparable. I'll certainly look more into this in the future but as I said, this takes a lot of computing time.

Sharpness in a match

In order to remove some of the problems mentioned above and make the sharpness change of different players more comparable, I decided to look at a match between players.
Looking at a match makes the sharpness of different players easier to compare since they are playing the same games. When one game is a long draw without any changes in sharpness, it affects the average sharpness change for both players, so the scores should still be comparable.
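To see why a flat draw leaves the comparison intact, here is a small sketch with invented numbers. It assumes that a player's match score pools every move they made across all games, which is one possible reading of the averaging described earlier. `match_score` is a hypothetical helper, not something from my analysis code.

```python
def match_score(games):
    """Average sharpness change over all moves in a list of games.

    Each game is a list of per-move sharpness changes for one player.
    Assumption: all moves are pooled, not averaged game by game.
    """
    changes = [change for game in games for change in game]
    return sum(changes) / len(changes) if changes else 0.0


# Invented numbers: one sharp 30-move game, then a 60-move draw
# in which neither player changes the sharpness at all.
sharp_a, sharp_b = [0.02] * 30, [0.01] * 30
flat_draw = [0.0] * 60

before = match_score([sharp_a]), match_score([sharp_b])
after = match_score([sharp_a, flat_draw]), match_score([sharp_b, flat_draw])
print(before, after)
```

In this toy example both averages shrink to a third of their previous value, so player A's score is still twice player B's.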
There is still the problem with different match situations, but I'm unsure how this can easily be solved.
The first match that came to mind was Tal-Botvinnik, 1960 since the players had such different styles and I hoped that this would be reflected in the sharpness scores. And indeed, it was:

Average sharpness change

Tal's sharpness change is significantly higher than Botvinnik's, which is a good illustration of the different styles of the two players.

2014 Magnus vs 2019 Magnus

The final thing I wanted to test for now was how the sharpness change of one player might change over time.
The first example that came to mind was Magnus Carlsen. In 2019, he had an amazing year and the most striking thing was that he changed his style quite a bit and played more for the initiative. I decided to compare his games from 2019 to the games of 2014 since he also played very well in that year and therefore the quality of his games should be similar.
I expected that Carlsen's play was much sharper in 2019 than in 2014 and this is also what the sharpness change score says:

Average sharpness change
Carlsen 2014    0.062
Carlsen 2019    0.122

Overall I'm very happy that the average sharpness change score agrees quite well with my intuitive feeling about the sharpness of the players.

Final Remarks

In my previous post, I looked at the distribution of the centipawn loss per move as opposed to the more common metric of average centipawn loss since I felt like some insight gets lost when only computing the average.
Why then am I only looking at the average of the sharpness change score?
Whenever testing something new like this, I want to keep the first tests as simple as possible. I could have looked at the distributions (I will certainly look at them in the future) and might have gotten a more nuanced perspective of the sharpness of the players. However, this would have made comparisons between the players much more difficult. My foremost goal was to make sure that the sharpness change score agrees with my human understanding of sharpness. Using a single number makes such a comparison much easier.
Another important point to consider when looking at the sharpness is its relation to the quality of play. As an example, imagine a completely drawn position in which one player makes a mistake and is much worse after that. Due to the mistake, the sharpness will be higher and therefore the player will be classified as sharper according to the measure I presented.
Now one can ask if this is a big issue with the score.
I think that it depends on how you look at it. I don't consider it a major problem, though, because engine metrics like accuracy or sharpness shouldn't be looked at in isolation. Instead, they should be looked at together to get a better picture of the game.
I'm very interested to know what you think about the average sharpness change score, so let me know!
If you've enjoyed this post, check out my Substack.