@ant_artic
Thank you for taking an in-depth look at the code. I will have a look at all your suggestions when I get the time.
Also, having large jumps seems to make it converge faster. I guess overshooting is useful for quickly moving good and bad players away from the initial 1500. It might not actually be useful when you seed the ratings with previous ratings. I actually ran Bayesian optimization on the hyper-parameters for fun. I also computed the optimal "K" values for 1000 steps of optimization on the blitz dataset. Here is a graph of ln(K) over 1000 steps: