lichess.org

Rating Inflation

Do bots improve from playing? I think rating systems might be best studied that way (given enough data over long periods), for their purely statistical transient properties. The individual-improvement component could be absent (unless a bot has had parameter tuning or other changes while keeping the same name; one might need to figure out what defines a bot).

My understanding is that there is no online learning for bots, but maybe I am wrong. If I am right, they have a set skill level, and any evolution or fluctuation would come from the randomness of pool variations and the rating system mechanics (hardwired stochastic processes). I have not read all the arguments yet, only responding to #13 ("arguments" not in the dispute sense).
@StingerPuzzles Rating inflation means that at any given rating level you will naturally be in a lower percentile. Think of it like this: you are 1700, and the average rating rises from 1500 to 1600. This lowers your percentile as a 1700 even though your rating did not change. You can think of this as a shift in the distribution curve; that's what I think I observed here.
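To make the percentile argument concrete, here is a minimal sketch. It assumes ratings are roughly normally distributed with a spread of 300 points; both the normal shape and the 300 are illustrative assumptions, not measured Lichess figures.

```python
from statistics import NormalDist

def percentile(rating, mean, sd=300.0):
    """Percent of the pool rated below `rating`.

    Assumes a normal rating distribution; sd=300 is a made-up spread.
    """
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(rating)

# Same 1700 rating, before and after the pool average shifts upward.
before = percentile(1700, mean=1500)
after = percentile(1700, mean=1600)
print(f"percentile of 1700 with mean 1500: {before:.1f}")
print(f"percentile of 1700 with mean 1600: {after:.1f}")
```

The rating stays at 1700 in both calls; only the assumed pool average moves, and the percentile drops with it.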
There is no online learning on bots, for various reasons:
- if there were loads of games against the same opponents, the bot would learn to play against them specifically, i.e. it could start to make bad moves
- the point of a bot is to provide a stable opponent
- learning takes a huge number of games in computer terms, so learning is better done offline

So dboing's argument stands: a change in a bot's rating is either deflation/inflation or noise inherent in the rating process. Some of the noise could come from a biased sample of opponents, but I don't see that as overly likely.
@petri999 I think it is likely. From personal experience, people who play rated games against other people tend to stick to that, and people who rarely play against humans online generally prefer bots. I have no data, but that is what my own experience tells me. Hence inflation may not affect bot ratings, since in my opinion they will always be playing people with "?" ratings or no ratings.
@egeus

Stinger is right about the percentile thing in the sense that you can't judge whether the absolute skill level of a 1700 has changed between then and now, because you are working with incomplete information.
What you can say for sure, though, is that there are people who had too high a rating after the Netflix effect, because a lot of new 1500s would flood points into the 1500-1600 region. With enough games being played, everyone above average would carry those points somewhere up into the rating system, where they eventually "vanish".

Glicko-2 is the best system to decrease this effect (e.g. I win 5 points vs. a "?" player and he loses 100), but if millions of new accounts suddenly appear it will still make a slight impact.
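The asymmetry described above (small gain for the established player, large loss for the "?" player) can be sketched with a single-game update. Note that this uses the simpler Glicko-1 formulas rather than Glicko-2, which additionally tracks a volatility term; the ratings and rating deviations (RD 50 for the established player, RD 350 for the newcomer) are assumed values chosen for illustration.

```python
import math

Q = math.log(10) / 400.0

def g(rd):
    """Glicko-1 attenuation factor: discounts results against uncertain opponents."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)

def rating_change(r, rd, r_opp, rd_opp, score):
    """Glicko-1 rating change for one game (score: 1 win, 0.5 draw, 0 loss)."""
    g_opp = g(rd_opp)
    expected = 1.0 / (1.0 + 10 ** (-g_opp * (r - r_opp) / 400.0))
    d2 = 1.0 / (Q**2 * g_opp**2 * expected * (1.0 - expected))
    return Q / (1.0 / rd**2 + 1.0 / d2) * g_opp * (score - expected)

# Established 1700 (low RD) beats a brand-new 1500 (high RD); values are assumptions.
gain = rating_change(1700, 50, 1500, 350, 1.0)
loss = rating_change(1500, 350, 1700, 50, 0.0)
print(f"established player: {gain:+.1f}, newcomer: {loss:+.1f}")
```

With these inputs the established player gains only a few points while the newcomer loses close to a hundred, so a single game is not zero-sum and most of the uncertainty is absorbed by the new account.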

Now why is 1700 suddenly a different percentile even though Glicko is supposed to balance itself out?

The reason, I assume, is that some people improved a lot during Corona and new strong players showed up on lichess.
"Too high a rating" is an expression that has little use in a rating pool. Points "flooded" into the 1400-1600 region by newcomers do not stay there. Those players get extra points, and stronger and weaker players get more points from beating them, so the extra points soon spread across the pool and everyone on average ends up a little higher. And that is not being overrated: ratings are only relative, and there is no correct absolute value for any specific skill level.
Are there simulations of Glicko or other rating systems on artificial populations out there, with various input scenarios for population, flow in and flow out? For example, a perturbation from steady state, with a "COVID" drop into the tank of many newcomers (themselves with some hypothetical distribution). I mean purely controllable systems. Game outcomes would be driven by the rating pair, i.e. by the probability that rating1 wins given opponent rating2 (how to write that in English...).

I know it would not be needed for a mathematical proof, but for pedagogical purposes and hands-on interactive exploration, it might make the dependencies and notions considered here more tangible. Push here, get a protrusion there (or not), etc.
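A toy version of such a controllable system might look like the sketch below. It swaps in plain zero-sum Elo with a fixed K instead of Glicko, pairs players completely at random, and decides each game from a hidden "true skill" on the same scale. The pool sizes, skill distributions, K=20 and the function names are all made up for illustration.

```python
import random

def expected(r_a, r_b):
    """Elo win probability of A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def play_round(skill, rating, rng, k=20.0):
    """Random pairing; results follow hidden skill, updates follow ratings."""
    order = list(range(len(skill)))
    rng.shuffle(order)
    for a, b in zip(order[::2], order[1::2]):
        score = 1.0 if rng.random() < expected(skill[a], skill[b]) else 0.0
        delta = k * (score - expected(rating[a], rating[b]))
        rating[a] += delta  # zero-sum: what A gains, B loses
        rating[b] -= delta

def mean(xs):
    return sum(xs) / len(xs)

def simulate(n_old=400, n_new=400, rounds=300, seed=1):
    rng = random.Random(seed)
    # Veterans: hidden skill around 1500, everyone enters at 1500.
    skill = [rng.gauss(1500, 200) for _ in range(n_old)]
    rating = [1500.0] * n_old
    for _ in range(rounds):
        play_round(skill, rating, rng)
    before = mean(rating)
    # The "drop in the tank": weaker newcomers, also entering at 1500.
    skill += [rng.gauss(1200, 200) for _ in range(n_new)]
    rating += [1500.0] * n_new
    for _ in range(rounds):
        play_round(skill, rating, rng)
    return before, mean(rating[:n_old]), mean(rating)

before, after_old, pool_mean = simulate()
print(f"veteran mean before influx: {before:.1f}")
print(f"veteran mean after influx:  {after_old:.1f}")
print(f"whole-pool mean after:      {pool_mean:.1f}")
```

In this setup the whole-pool mean stays pinned at the starting rating, while the veterans' average drifts upward after the influx even though their hidden skill never changes. Swapping in other newcomer distributions or adding players who leave is a matter of changing a few lines.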
@dboing there is a sort of actual experiment, i.e. Lichess itself. If you look at the median rating in any pool on Lichess, it is about 1500, which really cannot be anything but an artifact of the rating system, and in particular of the starting rating of newcomers.

Just googling "bradley terry model simulation inflation" did not produce anything. I guess coming up with another set of keywords could do better.
