Working out rating inflation

Rating inflation has always existed. Over the past year and a half, there have been far more newcomers to chess and, by extension, to lichess as well. This means that rating inflation is (or should be) more pronounced than normal. With every spike in rating I've become wary of celebrating it, as it could easily just be rating inflation and not chess knowledge. I'm wondering if there's a way to figure out how much inflation there is so I (and others as well, I guess) can accurately measure improvement?
P.S. (Improvement comes from learning new ideas/being disciplined, but rating is a far more tangible way to measure this).
@borisspasskyfan no... :( We have very few tournaments over here, also only playing fellow South Africans kind of makes it a closed system, so even getting a rating would be inaccurate somewhat. Also few have fide rating to begin with...
Let's compare the dirt on shoes to the inflation added to an account. The rating you want if it is to represent true strength is relative to the cleanliness of the shoe (not necessarily the expense, i.e. just because you play a KG line or difficult KID doesn't make you a better player).

If people come into a store with all that dirt, it muddies the floor. But, just like the ratings getting more accurate, the shoes will also get cleaner as you brush off the dirt while walking around the store. The goal should be hopefully not to have a dirty store.

Professional shoppers are trained to brush off their shoes before entering, leaving less dirt to be distributed around the store. This is where the Glicko system "shines". It is a great formula for higher rated players, like members at a country club or VIP area.

It doesn't bode so well with your average commuter who has to walk in puddles, commute on the train, or walk on the streets. Their shoes get dirty.

So how do we dish out points to lower level players? I'll use 2000 as the threshold point and try to be closer to a FIDE or USCF rating of 2000 (not an online 2000).

My view is that the 200-400 (yes I got 400 once on an account) jump in rating points is worse than a bull in a china shop. You're spraying mud on the walls and splattering the ceiling with dirt. It doesn't help the store at all. Saying you will come in with less dirt next time because your rating will go down doesn't change the possibility of you creating a new account and doing the same thing.
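To put a rough number on that initial jump, here is a minimal sketch of a single-game update using the published Glicko-1 formulas. Lichess actually uses Glicko-2, so this is a simplification and not the site's real calculation; the function names and the example values (a 1500 rating with a deviation of 350 for a new account, a deviation of 60 for a settled one) are assumptions for illustration.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Dampening factor for an opponent's rating deviation."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def glicko1_update(r, rd, opp_r, opp_rd, score):
    """Return (new_rating, new_rd) after one game.

    score is 1 for a win, 0.5 for a draw, 0 for a loss.
    """
    g_opp = g(opp_rd)
    expected = 1 / (1 + 10 ** (-g_opp * (r - opp_r) / 400))
    d2 = 1 / (Q**2 * g_opp**2 * expected * (1 - expected))
    denom = 1 / rd**2 + 1 / d2
    new_r = r + (Q / denom) * g_opp * (score - expected)
    new_rd = math.sqrt(1 / denom)
    return new_r, new_rd


# Assumed example: two brand-new accounts (deviation 350) play, one wins.
new_r, new_rd = glicko1_update(1500, 350, 1500, 350, 1.0)
gain_new = new_r - 1500

# Assumed example: two settled accounts (deviation 60) play, one wins.
settled_r, _ = glicko1_update(1500, 60, 1500, 60, 1.0)
gain_settled = settled_r - 1500

print(round(gain_new), round(gain_settled))
```

Under these assumptions the new account gains on the order of 160 points from one win, while the settled account gains roughly 10, which is why the early games swing so hard.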

This is why titled players who are known are not going to be that "dirty". So what is the solution? Ross Perot said if government is like an inner tube and it has holes in it, then you need a new inner tube. Or maybe that was just Dana Carvey on SNL saying that. But the truth is there.

You need a new inner tube. Reward players for wins on a more consistent basis and with more points. Instead of inflating a win initially with 200-400 points, give 10 points per win. If someone is really higher rated, all they have to do is win 10 games, or post a record that nets 10 wins (like 11:1 or 14:4). This will give them a 100 point increase.
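The arithmetic of that proposal can be sketched in a few lines. This assumes a loss costs the same 10 points a win earns, which the post implies with its 11:1 and 14:4 examples but does not state outright; the function name is made up for illustration.

```python
def flat_rating(start, wins, losses, step=10):
    """Rating after a run of games with a fixed +/- step per result.

    Assumption: a loss subtracts exactly what a win adds.
    """
    return start + step * (wins - losses)


# Both example records net 10 wins, so both add 100 points.
after_11_1 = flat_rating(1500, wins=11, losses=1)
after_14_4 = flat_rating(1500, wins=14, losses=4)
print(after_11_1, after_14_4)  # 1600 1600
```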

Those that feel they should be 1800 instead of 1500 could start out at 1800 then instead. Instead of 10 point rewards, you give 5 point rewards (just as an example). If you really think you are hot stuff on the board, then ok, you then use the Glicko System as it is now for 2000+ opponents. However, if you lose you lose just like it is now. You go down 200 initially.

This will allow players like myself in the 1500-2000 range a chance to move up faster with wins instead of playing ad nauseam games when my opponent blunders their queen but refuses to resign at the 1600 level. When you are making that kind of error you shouldn't be paired with less dirty shoe wearers.

But hey, I know someone will now reply with the Glicko sermon and state how it supposedly works wonders and we can reach Nirvana if we just put up with the numerous useless games. I am sorry but I don't want to be like some of my opponents who have been members since 2016 and only got to 1650 in 5 years, still blundering away major pieces.

The only solution I see for the current system is to close out your account, create a new one. Jump to the top and work on staying there. I wish it wasn't that way. I wish I could earn 10 points until I reached a certain level, and then it would taper off. The inflation factor to get to say 1700 or 1900 would diminish greatly unless you won more games at those levels.

It will NEVER help a 1300 rated player jump to 1600 unless they know why that 1600 player lost. If the 1600 player lost because they goofed up at the 1600 level, then how would a 1300 know any better? If the 1600 player is actually a 1300 player also but has a 300 point inflated rating, then it's a useless game. You might as well flip coins and decide a winner that way.

And another point which highlights a "stampede" effect is what happens when a 2000 rated player is paired with another 2000 rated player and they are both starting at 1500? One of those players is probably going to lose. They will end up under 1500 and be like 1350? next game. Will they get paired with another 2000 or someone closer to 1350? If closer to 1350, then why on earth do we want a 2000 rated player playing someone 650 points lower? That is insanely stupid.

It doesn't matter if it is just initially that way. New accounts can be created, so the flow of traffic is 2000 rated players trampling over 1350 players. Of course they will win. Of course the 1350 will lose. There is nothing to learn when you get pummeled against someone 650 higher rated than you.

Long story short, reward lower rated players fewer points initially and make them earn the status. Use something like 1500, 1700, 1900 as markers for when to give fewer points, to sort out the inflation. I would rather be given 2 points for a win at the 1850 level if I knew I was 1850 and not yet 1900.
@GuessMyRating I appreciate the comparison and the suggestions, but I was more looking for a way to measure it rather than prevent it.
For precise measurement, you need a never-changing calibration tool. The best that comes to my mind is a kind of frozen standard engine. Same software, same hardware for eternity. Let this engine play regularly in the pool, ideally without other players knowing about its special status. If the rating of this calibration engine goes up, this would be a sign of general inflation.
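As a rough sketch of how that reading could be taken: sample the frozen engine's rating in each period and compare each period's average with the first one. The function name and the sample numbers below are invented purely for illustration; no such data exists in this thread.

```python
from statistics import mean


def anchor_drift(ratings_by_period):
    """Drift of the anchor's mean rating per period, relative to period 0.

    ratings_by_period: list of lists of the anchor engine's rating samples,
    oldest period first. Since the engine's strength never changes, a
    positive drift suggests pool-wide inflation, a negative one deflation.
    """
    baseline = mean(ratings_by_period[0])
    return [mean(period) - baseline for period in ratings_by_period]


# Hypothetical samples for three consecutive periods.
samples = [
    [2000, 2010, 1990],
    [2015, 2025, 2020],
    [2040, 2050, 2030],
]
drift = anchor_drift(samples)
print(drift)
```

Averaging over a whole period matters here because any single rating reading bounces around with the engine's latest results.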
The issue with measuring rating inflation will be finding a stable point of comparison. With currency systems, inflation is often measured against certain commodities like grain, milk, etc. If a loaf of bread is $1 one year and $2 the next, then we can say that 100% inflation has occurred. With money, inflation is intuitive, but still fairly difficult to measure because many things can affect prices. But still, let's keep using our money comparison.

What do rating points purchase? They buy us a percentile. My own lowly rating puts me at the 38th percentile of the Lichess community; 1450 points is around the 50th percentile; and 3246.78 points will buy you that sweet, sweet 100th percentile (GM DrNykterstein). "Purchasing power" varies by game mode too of course, just like currency strength varies by country. So if you want to measure rating inflation, just look at how the percentiles change over time. If you're growing as a chess player faster than inflation, then your percentile will increase. If inflation outpaces your growth, then you'll gain points but lose percentile.

EDIT: This assumes that the distribution of strength in the community remains constant, which will not necessarily be the case. The bot suggestion by sheckley is a much better solution so long as it plays in the same queue as everyone else.
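The percentile comparison can be sketched like this, with the same caveat as in the EDIT: it only isolates inflation if the spread of real strength in the pool stays put. The function name and both rating pools are made up for illustration; real numbers would have to come from snapshots of the rating distribution.

```python
from bisect import bisect_left


def percentile(rating, pool):
    """Percentage of ratings in the pool that are strictly below `rating`."""
    ordered = sorted(pool)
    return 100 * bisect_left(ordered, rating) / len(ordered)


# Hypothetical snapshot of a small pool, then the same pool inflated by 50.
pool_then = [1200, 1350, 1450, 1500, 1600, 1700, 1800, 1900, 2000, 2200]
pool_now = [r + 50 for r in pool_then]

before = percentile(1620, pool_then)            # where 1620 stood originally
same_rating_later = percentile(1620, pool_now)  # rating unchanged, pool inflated
kept_pace = percentile(1670, pool_now)          # gained exactly the 50 inflation points
print(before, same_rating_later, kept_pace)
```

In this toy pool, holding the same rating loses percentile, and gaining exactly the inflation amount only holds position, so a real improvement shows up as a rising percentile.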
"My own lowly rating puts me at the 38th percentile of the Lichess community"

"3246.78 points will buy you that sweet, sweet 100th percentile (GM DrNykterstein)"

But realistically speaking, we are on Lichess because we are not titled players. I know titled players do play on here, AGAINST OTHER QUALIFIED TITLED PLAYERS!!

That is a huge difference. A titled player playing under 2000 rated players usually is done through something like a Twitch stream where they can earn some cash. They aren't just playing for the heck of it.

To adequately determine under 1500 rated players (as in USCF or FIDE) you need to see an overall assessment of playing strength. Inflating a rating on day one because it fits some formula for another group doesn't help in this case. Using your money comparison, we don't take currency from say China or India and use a 1:1 correspondence to transfer RMB or Rupees. You need ₹75.31 to make a dollar. You need 6.52 RMB to make a dollar. A McDonald's worker in India would make ₹40/hour and a Chinese employee would earn 20 RMB/hour. At those rates the Chinese worker earns about $3 an hour (a dollar takes roughly 20 minutes), while the Indian worker needs just under 2 hours to earn a dollar.

Purchasing power should then be more analogous to centipawn loss than rating number. What can you do given your initial mistakes and blunders? If you gain a significant advantage, then the engine doesn't count a 5 point loss as a blunder when you are up 15 points.

Can you imagine going into a 5 star hotel for a job in their restaurant or banquet area to only be demoted to the McDonald's across the street? That is what these inflated ratings are like initially. If you know people are going to lose after getting a 200-400 hike, it only feeds their ego. You might as well start everyone at 3000 then because the numbers don't matter. You could just lose and lose and lose and lose and lose, etc...

If you just award 10 points per game and two equally rated players play 10 games, that's 100 points between them. If one of them wins 60% or more, then it is an indication there is something the losing side needs to work on. I would be more than happy to go down ~60+ points knowing my new opponents were more closely in sync with my rating.

However, if I lose to a player who has a true strength of 2000 or more, I don't need to bounce down to 1500. What I need is to play 1600-1800 instead. The way it is now, I have no idea if that 1600-1800 player is actually that or a 2000 Lichess wolf in sheep's clothing.

This topic has been archived and can no longer be replied to.