Developer update: 275% improved game compression

SebVC

#51

@Vetinari_Computer There is always un upper bound ;)

Usolando

#52

With such brilliant developers, lichess is the best chess site and improving everyday. You guys are awesome!

revoof

#53

@residualinsight Thanks, I'll take you up on that offer. It might take a while: Now that the first version of the game compression is live I'll have to finish some other stuff. Then I'll go back to your paper and actually try to understand it. (So far I've only read the introduction).

@Vetinari_Computer I would say it can't be much slower than the current version. I made a GitHub issue to gather non-backwards-compatible optimization ideas, so we can pick the best ones when we revisit the compression at some point in the future: github.com/ornicar/lila/issues/4129

@heroku Embarrassingly I made the graphs in the blog post with LibreOffice. If you mean speed measurements @isaacly did them with JMH.

bigAI

#54

Impressive.

BOT Vetinari_Computer

#55

In that case my better compression idea probably isn't worth it as it's connected to quite a computational effort. (also, clicking through the already given suggestions in the issue, those simply look better as well)
The case for sacrificing speed would be that in the long run the CPU usage would only grow in O(N) with (hopefully) more people coming to lichess, while the memory would grow in O(N^2). But given this already very good compression the constants might make it not worth it anyway.

The only thing I could see giving considerable gains still is generating the Huffman tree for each move based on some generated probabilities for each move to be played there specifically. But that sounds like a considerable slowdown and it's not even obvious whether there's a lot to gain with that.

MrCharles edited

#56

@residualinsight #44

> "it passed peer review -- and then they asked for money to publish so I just put it on Arxiv"

Thanks for giving them the "pay me finger." (I'm reading now.)

@revoof

After scanning and reading through a bit of the paper, it might be applicable in this case. (Maybe.)

"There is also a hypothetical motivation behind this approach; which is the possibility of creating generic alphabets for specific data domains allowing very rapid on-the-fly compression and decompression while maintaining reasonable rates of compression."

And since the alphabet for encoding the moves in a chess game is limited, the "dictionary" approach, as mentioned in Section 6 (Discussion) might help further refine compression. (It would actually be interesting to see if another run through SPSA after a refinement like this finds better compression ratios, or, if things come out about the same.)

grelf

#57

Amazing work, thanks for sharing it. Were you inspired by this old "Guru of the Week" solution: http://www.gotw.ca/gotw/072.htm? Or do great minds think alike?

revoof

#58

@grelf Intresting read, and no, I wasn't aware of that article. The concept of using entropy encoding on legal move indexes isn't too original, though, so I am not surprised it has been mentioned before. (In fact I linked another article in the blog that did so: triplehappy.wordpress.com/2015/10/26/chess-move-compression/). It's all about picking and tuning known ideas so that it gives good compression with reasonable performance.

residualinsight

#59

@revoof Your implementation sounds great. I have no specific suggestion on any changes at all.

--- It's just an idea about a potential of Huffman encoding which I think in general has not been explored. The reason for suggesting to talk, is because I don't generally consider accademic papers so easy to read (and I certainly wouldn't get bogged down in the maths) -- but the key concept is quite nice, and I could probably explain it much easier in person.

Anyway I love playing on Lichess, and thank you for making that more feasible for everyone, and for sharing the process.

I do like Huffman encoding, so it's nice to see it being used in a practical way on Lichess.

This topic has been archived and can no longer be replied to.