jomega
Strategic Test Suite (STS): Introduction
The Strategic Test Suite (STS) is an interesting collection of 1,500 chess positions organized by strategic theme. It is a series of themed test suites by Dann Corbit and Swaminathan Natarajan, initially created circa 2008. Ferdinand Mosca later updated the suite and wrote an STS-rating system; that code was last updated in 2019.
The positions were designed to help chess engine programmers evaluate their programs' understanding of strategic concepts. The positions are also interesting for humans to evaluate!
The EPD file from Mosca has 1,500 positions organized into the following themes, with 100 positions per theme:
| N | Theme |
| 1 | Undermining |
| 2 | Open Files and Diagonals |
| 3 | Knight Outposts |
| 4 | Square Vacancy |
| 5 | Bishop vs Knight |
| 6 | Re-Capturing |
| 7 | Offer of Simplification |
| 8 | Advancement of f/g/h pawns |
| 9 | Advancement of a/b/c Pawns |
| 10 | Simplification |
| 11 | Activity of the King |
| 12 | Center Control |
| 13 | Pawn Play in the Center |
| 14 | Queens and Rooks to the 7th Rank |
| 15 | Avoid Pointless Exchange |
I recently ran the STS-rating system on Stockfish, Lc0, and Komodo, using the versions listed in the table. The results were:
| Engine | Version | Score (%) | Rating |
| Stockfish | 14 | 84.7% | 3530 |
| Komodo | 12.1.1 | 81.4% | 3382 |
| Lc0 | v0.27.0 | 61.8% | 2507 |
In the EPD file, each position has a best move (scored 10 on an arbitrary scale) and several alternative moves with lower scores. I modified the STS-rating code so that it also reports the number of positions in which the engine failed to find the best move. I then ran Stockfish again, this time giving it 3 seconds per position. That is almost 40 times the time per position that STS-rating allows on my machine! With that setting, Stockfish missed 67 positions entirely and failed to find the supposed best move in 220 positions.
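To make the counting concrete, here is a minimal Python sketch of that kind of tally. It is not the STS-rating code. It assumes each EPD record lists its scored moves in a `c0` comment of the form `"move=points, move=points"`, that the engine's move has already been converted to the same notation, and that "missed entirely" means the chosen move earns no points. The sample record, its scores, and the names `parse_move_scores` and `summarize` are invented for illustration.

```python
import re

# Hypothetical record: the position and the scores are made up,
# and the "c0" layout is an assumption about the EPD file.
SAMPLE = (
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - '
    'bm e4; id "Example.001"; c0 "e4=10, d4=8, Nf3=5";'
)


def parse_move_scores(epd_line):
    """Return {move: points} read from the c0 comment of one EPD record."""
    match = re.search(r'c0\s+"([^"]*)"', epd_line)
    if match is None:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        # Split on the first "=" so moves such as "Be5+" stay intact.
        move, _, points = item.strip().partition("=")
        if move and points.isdigit():
            scores[move] = int(points)
    return scores


def summarize(results):
    """Tally (epd_line, engine_move) pairs.

    Returns the percentage score, the number of positions where the
    engine did not play the top-scored move, and the number where its
    move earned no points at all.
    """
    earned = possible = not_best = zero_points = 0
    for epd_line, engine_move in results:
        scores = parse_move_scores(epd_line)
        if not scores:
            continue  # skip records without scored moves
        best = max(scores.values())
        points = scores.get(engine_move, 0)  # unlisted move earns nothing
        earned += points
        possible += best
        if points < best:
            not_best += 1
        if points == 0:
            zero_points += 1
    percent = 100.0 * earned / possible if possible else 0.0
    return {
        "percent": percent,
        "not_best": not_best,
        "zero_points": zero_points,
    }
```

Calling `summarize` with a list of `(epd_line, engine_move)` pairs gives both counts in one pass. Under the assumptions above, the positions counted in `zero_points` are a subset of those counted in `not_best`.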
I decided that the positions Stockfish failed would be interesting to investigate from a human perspective. I'll describe my findings in coming blog posts.
Links
- The Strategic Test Suite (STS) home page.
https://sites.google.com/site/strategictestsuite/
- The STS-rating code.
https://github.com/fsmosca/STS-Rating
- STS discussion and links on the chessprogramming.org site.
https://www.chessprogramming.org/Strategic_Test_Suite