Attempt to classify openings broadly

Most low-intermediate and intermediate chess players try to learn as many openings as possible to build their opening repertoire. At this stage it is very important to know which openings are popular and widely used by experienced players. One approach is to start with individual moves and build the whole opening tree. Another is to memorize hundreds of individual openings. I wanted to create a classification somewhere in between.


Firstly, I decided to classify positions rather than moves. This reduces the number of combinations and lets us ignore the order of moves. Secondly, I decided to take the first three moves, because in these moves players have some freedom to choose their preferred opening; later on they must respond to the opponent's play. Thirdly, I decided to study the positions of white and black separately, in other words, to classify the positions and moves of white while discarding the positions and moves of black. This is, of course, a very big simplification; however, it reduces the number of categories significantly, and, as I said, in the first moves the players play more or less independently. It's up to you to decide whether the classification is useful.
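To make the "positions rather than moves" idea concrete, here is a minimal sketch (not the actual script) that turns a game into an order-independent key built from white's first three moves. The function names are hypothetical, and the sketch assumes the game is already available as a flat list of SAN plies.

```python
def white_moves(plies, n=3):
    """White's first n moves from a flat list of SAN plies (white moves first)."""
    return plies[0:2 * n:2]


def normalize(san):
    """Drop trailing check, mate and annotation marks so 'Bb5+' equals 'Bb5'."""
    return san.rstrip("+#!?")


def white_key(plies, n=3):
    """Order-independent key: the set of white's first n moves."""
    return frozenset(normalize(m) for m in white_moves(plies, n))


# Two move orders that reach the same white setup share one key.
game_a = ["d4", "d5", "c4", "e6", "Nf3", "Nf6"]
game_b = ["Nf3", "d5", "d4", "e6", "c4", "Nf6"]
print(white_key(game_a) == white_key(game_b))
```

A set of moves is only a stand-in for the real position of the white pieces, but for three moves the two mostly coincide.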

DEFINITIONS. STARTER MOVES are the 20 moves available from the starting position: a3, a4, b3, b4, c3, c4, d3, d4, e3, e4, f3, f4, g3, g4, h3, h4, Na3, Nc3, Nf3, Nh3. I will call CENTRAL MOVES a subset of 10 of them, namely c3, c4, d3, d4, e3, e4, f3, f4, Nc3, Nf3, based on their importance and frequency. By FEATURE I mean one non-starter move preceded by one starter move, or some specific position requiring two starter moves in a specific order. In most cases, the first three moves either have no features (they contain only three starter moves) or have one feature and one starter move. For example, Bb5 is a feature. If white plays Bb5 in the first three moves, he or she has to spend one of the three moves on e3 or e4, one move on Bb5 and (most probably) one other starter move. This turns the feature Bb5 into a category. (As I mentioned, I discard the order of moves, so it can be 1.e4 2.Bb5 or 1.e4 3.Bb5 or 2.e4 3.Bb5.) Games in which two features are developed within the first three moves are possible (say, 1.e4 2.Bb5 3.e5, giving the Bb5 feature and the e5 feature), but either they are left unclassified because they are infrequent, or the second feature is explicitly included in the list of allowed moves for a given category. I also treat some combinations of two starter moves as features, say c4 Nc3, because they must be played in a specific order and not independently.
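The definitions can be written down as plain sets. This is a minimal sketch with hypothetical names; the `features` helper only covers non-starter moves and does not detect the two-starter-move combinations such as c4 Nc3.

```python
FILES = "abcdefgh"

# 16 pawn moves (one or two squares for each file) plus 4 knight moves.
STARTER_MOVES = frozenset(
    [f + r for f in FILES for r in "34"] + ["Na3", "Nc3", "Nf3", "Nh3"]
)

CENTRAL_MOVES = frozenset(
    ["c3", "c4", "d3", "d4", "e3", "e4", "f3", "f4", "Nc3", "Nf3"]
)


def features(white_three):
    """Non-starter moves among white's first three moves, in the order played."""
    return [m for m in white_three if m not in STARTER_MOVES]


print(len(STARTER_MOVES), len(CENTRAL_MOVES))
print(features(["e4", "Nf3", "Bb5"]))
print(features(["d4", "c4", "Nc3"]))
```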

METHODS. I created a Python script for the classification. I used the lichess game database for March 2017 and selected the games where both players are rated at least 2000, the main time is at least 300 seconds, the game contains at least 9 plies and the game score is given. This selection gave me 125445 games, which is enough for a first test of the approach.
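A selection like this could be sketched as follows. This is an illustration, not the script itself. It assumes the usual PGN tag names WhiteElo, BlackElo, TimeControl and Result, reads "main time" as the part of TimeControl before the "+", and reads "game score is given" as a known result; all function names are hypothetical.

```python
import re

HEADER_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')
RESULTS = {"1-0", "0-1", "1/2-1/2"}


def parse_headers(pgn_text):
    """Collect PGN tag pairs such as [WhiteElo "2100"] into a dict."""
    headers = {}
    for line in pgn_text.splitlines():
        match = HEADER_RE.match(line.strip())
        if match:
            headers[match.group(1)] = match.group(2)
    return headers


def san_plies(movetext):
    """Split movetext into SAN plies, dropping comments, move numbers, results."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)  # comments such as clock times
    text = re.sub(r"\d+\.+", " ", text)         # move numbers "1." and "1..."
    return [tok for tok in text.split() if tok not in RESULTS and tok != "*"]


def base_seconds(time_control):
    """'300+3' -> 300; anything without a numeric base time -> None."""
    base = time_control.split("+")[0]
    return int(base) if base.isdigit() else None


def keep_game(headers, movetext, min_rating=2000, min_base=300, min_plies=9):
    """Apply the selection criteria to one game (assumed interpretation)."""
    try:
        white = int(headers.get("WhiteElo", ""))
        black = int(headers.get("BlackElo", ""))
    except ValueError:
        return False  # missing or unknown rating
    base = base_seconds(headers.get("TimeControl", "-"))
    if base is None or base < min_base:
        return False
    if headers.get("Result") not in RESULTS:
        return False
    return min(white, black) >= min_rating and len(san_plies(movetext)) >= min_plies
```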

Enough introduction; let us see what I got.

CATEGORY 1. CENTRAL QUIET GAMES (22.35%). Formal definition: all of white's first three moves must belong to the central moves (see definitions), no white piece is taken by black within the first three moves, and white does not develop the Nc3+c4 feature or the Nf3+f4 feature. This is a very wide and varied category; however, it contains some very frequent positions. Position 1a, d4+Nf3+c4 in some order, gives 5.6% of all games. Position 1b, d4+Nc3+e4, gives 4.56% of all games. Position 1c, e4+Nf3+Nc3, gives 2.6% of all games. The other 55 positions of this category found in the data give 9.58% of all games.

CATEGORY 2. CENTRAL SHARP (14.03%). Formal definition: all of white's first three moves must belong to the central moves, at least one white piece is taken by black within the first three moves, and white does not develop the Nc3+c4 or the Nf3+f4 feature. This category has a leading position 2a: white plays e4+Nf3+d4 in some order, while black takes the d4 pawn (the position of the other black pieces is not counted, as usual). This position gives as much as 9.36% of all games. The other 73 positions of white pieces belonging to this category found in the database give 4.66% of all games.
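The two central categories differ only in whether black captured something. A minimal sketch of that split, assuming the game is a flat list of SAN plies and that black's third move counts as "within the first three moves" (position 2a needs this, since black typically takes on d4 with the third move). Any black capture this early necessarily takes a white piece, so checking for "x" in black's SAN is enough. The function name is hypothetical.

```python
CENTRAL_MOVES = frozenset(
    ["c3", "c4", "d3", "d4", "e3", "e4", "f3", "f4", "Nc3", "Nf3"]
)


def classify_central(plies):
    """Return 'central quiet', 'central sharp', or None for other categories."""
    white = [m.rstrip("+#!?") for m in plies[0:6:2]]
    black = plies[1:6:2]
    if len(white) < 3 or any(m not in CENTRAL_MOVES for m in white):
        return None
    moves = set(white)
    # The Nc3+c4 and Nf3+f4 combinations belong to the rider categories.
    if {"c4", "Nc3"} <= moves or {"f4", "Nf3"} <= moves:
        return None
    black_captured = any("x" in m for m in black)
    return "central sharp" if black_captured else "central quiet"


print(classify_central(["e4", "c5", "Nf3", "d6", "d4", "cxd4"]))
print(classify_central(["d4", "d5", "Nf3", "Nf6", "c4", "e6"]))
```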

CATEGORY 3. QUEEN RIDER, c4+Nc3 feature (11.90%). Formal definition: the first three moves must contain c4 and Nc3, and all of white's first three moves must belong to the set [starter moves]+[cxd5]. This category also has a leader: position 3a, c4+Nc3+d4 in some order with no captures, gives 8.42% of all games. The other 29 positions of this category found in the database give 3.47%.

CATEGORY 4. KINGSIDE BISHOP, features Bc4 or Bb5 or Bd3 (10.70%). Formal definition: white's first three moves contain Bc4, Bb5 or Bd3, and all three moves must belong to the set [starter moves]+[Bc4, Bb5, Bd3]. This category has two clear leading positions. Position 4a, e4+Nf3+Bb5 in some order, gives 4.27%. Position 4b, e4+Nf3+Bc4 in some order, gives 3.77%. The other 60 positions give 2.66%.

CATEGORY 5. SCANDINAVIAN-LIKE, feature exd5 (6.67%). Strict definition: white's first three moves must include exd5, and all three moves must belong to the set [exd5, Bb5, dxc6, Bc4]+[starter moves]. This category is not limited to the strict Scandinavian countergambit; it just says that white takes the d5 pawn within the first three moves. Say, the game 1. e4 d6 2. d4 d5 3. exd5 falls into this category. This category does not have very pronounced leading positions. In total, 35 positions were found in the database.

CATEGORY 6. QUEENSIDE BISHOP, features Bg5 or Bf4 (6.24%). Strict definition: the first three moves must include Bg5 or Bf4, and all three moves must belong to the set [Bg5, Bf4, Bh4, Bxf6]+[starter moves]. This category has two clear leading positions. Position 6a, d4+Nf3+Bf4, gives 2.01%. Position 6b, d4+Bf4+e3, gives 1.75%. The other 53 positions give 2.48%.

CATEGORY 7. FIANCHETTO (5.57%). Strict definition: white must play either the combination g3+Nf3 or the combination g3+Bg2 in the first three moves, and all three moves must belong to the set [g3, Nf3, Bg2]+[starter moves]. This category has only moderate leading positions; in total, 41 positions were found in the database.

CATEGORY 8. KINGSIDE PROMOTION, e5 feature (5.00%). Strict definition: the first three moves must include e5, and all three moves must belong to the set [e5]+[starter moves]. This category was a moderate surprise for me. Position 8a, e4+e5+d4, gives 3.08%; the other 25 positions in this category give 1.91%.

=== The categories above account for 82.45% in total ===

CATEGORY 9. SIDE GAMES (2.96%). Strict definition: all three initial moves must belong to the starter moves, and the game must not fall into any of the categories central quiet, central sharp, fianchetto, queen rider or king rider. Obviously, at least one side move must be played (otherwise the game would fall into the central games, queen rider or king rider categories), but not the combination g3+Nf3 (that would fall into fianchetto). All three moves are starter moves, so no feature is developed. It is a very broad category with 259 positions and no clear leading positions; moves like a3, b3, h3 and b4 are common.

CATEGORY 10. KING RIDER (2.42%). Strict definition: the combination f4+Nf3 is developed within the first three moves, and all three moves must be starter moves.

CATEGORY 11. THREE-QUARTER-INDIAN, Nbd2 feature (2.34%). Strict definition: Nbd2 must be played within the first three moves, and all of white's first three moves must belong to the set [Nbd2]+[starter moves]. Another surprise for me.

CATEGORY 12. QUEENSIDE PROMOTION, feature d5 (2.12%). Strict definition: d5 must be played within white's first three moves, and all three moves must belong to the set [d5]+[starter moves].

CATEGORY 13. QUEENSIDE FIANCHETTO, feature Bb2 (1.54%). Strict definition: Bb2 must be played within the first three moves, and all three moves must belong to the set [Bb2, Bxe5, Bxf6]+[starter moves].

CATEGORY 14. QUEENSIDE ATTACKED, feature cxd5 without knight (1.06%). Strict definition: cxd5 must be played within the first three moves, all three moves must belong to the set [cxd5]+[starter moves], and the game must not fall into the queen rider category. The latter condition excludes the move Nc3.

CATEGORY 15. KINGSIDE ATTACKED, feature dxe5 (1.05%). Strict definition: dxe5 must be played within the first three moves, and all three moves must belong to the set [dxe5]+[starter moves].

CATEGORY 16. KNIGHT ATTACK, feature Nxe5 (0.68%). Strict definition: Nxe5 must be played within the first three moves, and all three moves must belong to the set [Nxe5]+[starter moves].
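Most of the feature categories share one pattern: at least one required move, plus a list of allowed non-starter moves. A minimal table-driven sketch of that pattern, with hypothetical names, covering categories 4, 5, 6, 8, 11, 12, 13, 15 and 16. Categories with extra conditions (the rider combinations, fianchetto, and the queen rider exclusion in category 14) are left out. The function returns a list so that an accidental overlap would be visible.

```python
FILES = "abcdefgh"
STARTER_MOVES = frozenset(
    [f + r for f in FILES for r in "34"] + ["Na3", "Nc3", "Nf3", "Nh3"]
)

# (name, required: at least one of these, allowed non-starter moves)
RULES = [
    ("kingside bishop", {"Bc4", "Bb5", "Bd3"}, {"Bc4", "Bb5", "Bd3"}),
    ("Scandinavian-like", {"exd5"}, {"exd5", "Bb5", "dxc6", "Bc4"}),
    ("queenside bishop", {"Bg5", "Bf4"}, {"Bg5", "Bf4", "Bh4", "Bxf6"}),
    ("kingside promotion", {"e5"}, {"e5"}),
    ("three-quarter-Indian", {"Nbd2"}, {"Nbd2"}),
    ("queenside promotion", {"d5"}, {"d5"}),
    ("queenside fianchetto", {"Bb2"}, {"Bb2", "Bxe5", "Bxf6"}),
    ("kingside attacked", {"dxe5"}, {"dxe5"}),
    ("knight attack", {"Nxe5"}, {"Nxe5"}),
]


def classify_feature(white_three):
    """Names of all rules matched by white's first three moves."""
    moves = set(white_three)
    return [
        name
        for name, required, allowed in RULES
        if (moves & required) and moves <= (STARTER_MOVES | allowed)
    ]


print(classify_feature(["e4", "Nf3", "Bb5"]))
print(classify_feature(["e4", "exd5", "Bb5"]))
print(classify_feature(["e4", "e5", "Bc4"]))
```

The last call returns an empty list: a game with two features from different categories matches none of the rules.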

UNCLASSIFIED (3.38%). The classification could be continued either by allowing combinations of the above non-initial moves, or by adding more non-initial moves. The former approach would classify games like e4+e5+Bc4. The Grob opening g4+Bg2 is not uncommon among the unclassified games; it does not fall into the side games category because Bg2 is not a starter move, and it does not fall into the fianchetto category because of the g3 requirement. The next non-initial move is dxc5, giving no more than 0.2%. So the classification stops here.

What is the point of these categories?
Are you sure no categories overlap?

I'm sure that the categories do not overlap, because I checked it programmatically (if they overlapped due to an error, the script would show the percentage of overlap).
The purpose is to build a classification broader than a list of separate positions (although some positions are very common). For example, category 5, which includes exd5, says that white plays this move quite often within the first three moves. One can study why this is so, what white can do if black plays d5, what the benefits and disadvantages of this move are, which files and ranks it opens, etc. On the other hand, dxc5 is not worth learning, because it occurs infrequently.
The broader answer: the categories help with learning (for low-intermediate and intermediate players, of course).
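For the overlap question, a check of this kind can be sketched in a few lines. This is an illustration with hypothetical names, not the actual script: each category is a predicate over white's first three moves, and the function reports the percentage of games matched by more than one category.

```python
def overlap_percentage(games, predicates):
    """Percentage of games that satisfy more than one category predicate.

    games: list of white three-move lists
    predicates: dict mapping category name -> function(moves) -> bool
    """
    if not games:
        return 0.0
    overlapping = 0
    for moves in games:
        hits = [name for name, pred in predicates.items() if pred(moves)]
        if len(hits) > 1:
            overlapping += 1
    return 100.0 * overlapping / len(games)


# Toy predicates that deliberately overlap on games containing both e4 and d4.
toy_predicates = {
    "has e4": lambda moves: "e4" in moves,
    "has d4": lambda moves: "d4" in moves,
}
toy_games = [
    ["e4", "Nf3", "Bb5"],
    ["e4", "d4", "Nf3"],
    ["c4", "Nc3", "g3"],
    ["d4", "c4", "Nf3"],
]
print(overlap_percentage(toy_games, toy_predicates))
```

With correctly defined categories the reported value should be 0.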

By the way, tpr, you have a rating around 2000. How did you learn openings? Do you see moves like e5 (5%) or Nbd2 (2%) in the very beginning? Do you use them, and how do you answer them when playing black?
How would you improve the classification?

You don't really need to know all the openings. It's good to know the main lines and main variations of about 6 popular openings, but not in too much depth.

If you know what to play against 1. e4, 1. d4, 1. c4 and 1. Nf3, you will probably be fine as black, and for white I recommend one opening to play (although if your opponent plays a different defense that you know, then you could play it from the white side).

Even if you don't know the opening (there are several dubious or unpopular openings), you should be fine just playing by the opening principles, etc.

Knowing how to play every opening is not as important as knowing the most basic ones at standard level (such as the Ruy Lopez) and being able to play good moves, since having a response for every single move in chess (even in the opening) is impossible.

Good Luck with your openings :)

"How did you learn openings?"
Mainly by experience: just play what seems logical and after the game look it up.

Openings are overrated. Tactics and endgames are far more important.

@achava_06 This is exactly what I've read in many places and statistics: how to play against e4, c4, d4 and Nf3. But that is only the first move. Isn't it nicer to know how to play against, say, the c4+d4+Nc3 position of white (it seems to be the English opening, although my terminology may be wrong), knowing that white develops this position in as many as 8% of games?
Is it worth it for black to prevent white from developing c4+d4+Nc3 at any cost, or can black find weaknesses in this position later?

By the way, how should one play against the fianchetto? Nf3 may lead to g3+Bg2 with the bishop on the main diagonal, or it may lead to Bc4 or Bb5.

My goal was to develop a classification somewhere in between knowing and classifying only the first move and knowing hundreds of openings.

By the way, I started to think about the move e4. It seems it can be answered very effectively by black's d5. However, it is the top move among grandmasters.

Thanks everyone for comments.

My approach still seems somewhat useful to me.

@MrPushwood ECO is too detailed and contains many permutations. I'm just a beginner and wish to have a panoramic view. (And I expect many other beginners want it too.)
