
Estimating playing strength

Have you ever felt like your chess rating doesn't represent your actual playing strength? Sometimes we want to be able to estimate playing strength based on individual games rather than rating (which changes more slowly).

During the past few months, I've been taking a number of online courses and learning Python for data analysis. In one of the courses, the final project allowed me to choose my own dataset. So surprise surprise! I chose something chess related. (Not really surprised, are you?)

When we play games online, getting a computer evaluation is just a few clicks away. A commonly used statistic is the average centipawn loss (CPL): how far, on average, your moves deviate from the computer's best move, measured in hundredths of a pawn. Many of us tend to think that CPL is a good estimate of playing strength. And, of course, it gives some indication, but it's far from a perfect predictor.

Fellow chess/statistics blogger Patrick Coulombe has investigated the correlation between rating and CPL and found that it is not very strong. I therefore concluded that other factors need to be taken into consideration when trying to estimate playing strength.

My initial plan was to download all the games played during April from the lichess database, but when I realized that the file was about 160GB, I changed my mind. I chose a smaller dataset, and built my analysis on about 5000 games from the lichess yearly classical arena, played two weeks ago. The advantage of choosing this as a dataset is that all the games have the same time control. Sure, using millions of games would have been fun, but the amount of data would just be too impractical for a normal laptop computer.

A simple plot of rating vs CPL produces a result similar to what Patrick found in his analysis. However, the large number of data points makes a normal scatterplot difficult to read, so I chose a different kind of plot.

In this plot, the shading indicates the "concentration" of data points. A darker color means more games. The plot has a blob-like shape, which suggests that the correlation between the variables is not very strong. But there is a clear orientation to the plot, and the green line indicates the main relationship between rating and CPL. I was of course tempted to use the slope of this line to try to predict playing strength, and at the end of this post, you can see how that turned out.

Another attempt at understanding the data is to add the opponent's rating to the analysis. In the diagram below, the players' ratings are given on each axis, and the average CPL is indicated by colors. Just a reminder, an average CPL of 300 means that a player, on average, blunders the equivalent of a piece on every move.

As the diagram shows, the red end of the color spectrum is concentrated around the lower rating levels, and the darker shades of blue are mostly found at the higher rating levels. However, there are many red spots scattered around the entire plot, which shows that even strong players can make horrible blunders.

Another statistic that could be a predictor is the blunder rate. Here, I have defined a blunder as a move with a CPL of 150 (1.5 pawns) or more. I counted the number of blunders and the number of moves in each game, and the blunder rate is simply the number of blunders divided by the number of moves.
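For readers who want to try this on their own games, here is a minimal sketch of how the average CPL and the blunder rate could be computed from a list of per-move centipawn losses. The function name `game_stats` and the sample numbers are made up for illustration; the per-move losses are assumed to come from an engine analysis that has already been run.

```python
BLUNDER_THRESHOLD = 150  # centipawns (1.5 pawns), per the definition above


def game_stats(move_losses):
    """Return (average CPL, number of blunders, blunder rate) for one player.

    move_losses is a list with the centipawn loss of each move the player
    made. This input format is an assumption for the sketch.
    """
    if not move_losses:
        raise ValueError("need at least one move")
    n_moves = len(move_losses)
    # A move counts as a blunder when it loses 150 centipawns or more
    n_blunders = sum(1 for loss in move_losses if loss >= BLUNDER_THRESHOLD)
    avg_cpl = sum(move_losses) / n_moves
    return avg_cpl, n_blunders, n_blunders / n_moves


# Made-up losses for a ten-move game, not taken from the dataset
losses = [0, 10, 25, 0, 160, 5, 300, 0, 40, 60]
avg_cpl, n_blunders, blunder_rate = game_stats(losses)
print(avg_cpl, n_blunders, blunder_rate)
```

In this made-up game, two of the ten moves cross the 150 threshold, so the blunder rate is 0.2.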

As you can see from the plot, the scale goes up to 0.5, which means that every other move is a blunder. Here, we see a slightly different picture. Strong players are almost exclusively in the blue zone, which indicates blunder rates of 10-20%. Players below 1500 are mostly in the yellow and red parts.

This reminds me of a quote from Garry Kasparov:
"Masters blunder three times per game, amateurs blunder three times per move."

In the final part of my project, I did a multiple regression analysis to see how well the playing strength can be predicted with more variables. I won't go into details here, but the final formula is as follows:
Rating = 1655 - 0.20*CPL - 0.45*RatingDiff + 8.55*nmoves - 22*nblunders

RatingDiff is the difference in rating between the players, nmoves is the number of moves, and nblunders is the number of blunders. This means that 1655 is a baseline: for each move that is played, your estimated strength increases by roughly 8.5 points, and for each blunder it drops by 22 points.
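The formula is easy to turn into a small function. This is only a sketch with the coefficients copied from the formula above; the function name `estimate_rating` is made up, and since the sign convention for RatingDiff is not spelled out here, the example simply uses evenly matched players (a difference of 0).

```python
def estimate_rating(cpl, rating_diff, n_moves, n_blunders):
    """Estimate playing strength from one game using the regression formula.

    cpl         -- the player's average centipawn loss in the game
    rating_diff -- rating difference between the players (the direction of
                   the difference is an assumption left to the caller)
    n_moves     -- number of moves played
    n_blunders  -- number of moves with a CPL of 150 or more
    """
    return (1655
            - 0.20 * cpl
            - 0.45 * rating_diff
            + 8.55 * n_moves
            - 22 * n_blunders)


# Made-up game: average CPL 60, equal ratings, 40 moves, 2 blunders
estimate = estimate_rating(cpl=60, rating_diff=0, n_moves=40, n_blunders=2)
print(round(estimate))  # 1655 - 12 - 0 + 342 - 44 = 1941
```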

I tested this model on a number of my own games, and found that it is fairly good (from a statistical point of view).

The diagram below shows how the rating varies in my own games (observed), in the regression estimate, and in the prediction based on the green line in the first diagram. The boxes indicate where the majority of games are located.

 

We can see that the regression estimate gives a somewhat higher result than my actual ratings, but approximately the same variation. However, the estimates that are based on CPL alone give quite extreme values, which suggests that they have very poor accuracy.

So the model has an acceptable accuracy, but there is a downside: The unexplained variation is so large that the estimate from one game has an uncertainty of +/- 400 rating points. This makes the estimate quite useless for individual games. A larger sample will improve the precision, but in order to reduce the uncertainty to +/- 50 rating points, you need about 40 games. From a statistical point of view, this is not problematic, but from a practical point of view, this would be rather pointless. Over 40 games, your rating would adjust properly, and you'll have a good estimate of playing strength right there.
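As a rough cross-check of how the uncertainty shrinks with more games, here is a back-of-the-envelope sketch. It assumes that the per-game errors are independent and that the estimates are simply averaged, in which case the uncertainty falls with the square root of the number of games. That is a textbook simplification, not the calculation behind the numbers above, and the function names are made up.

```python
import math


def uncertainty_after(n_games, single_game_uncertainty=400):
    """Uncertainty of the average estimate over n_games.

    Assumes independent errors, so the uncertainty scales with 1/sqrt(n).
    """
    return single_game_uncertainty / math.sqrt(n_games)


def games_needed(target, single_game_uncertainty=400):
    """Smallest number of games that brings the uncertainty down to target."""
    return math.ceil((single_game_uncertainty / target) ** 2)


print(round(uncertainty_after(40)))  # 400 / sqrt(40) is about 63
print(games_needed(50))              # (400 / 50) squared = 64
```

Under this simplification, the answer lands in the same ballpark as the figure quoted above: several dozen games before the estimate gets within about 50 points. The exact number depends on how the uncertainty is defined in the fitted model.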

So to round off this long and complicated post, I have come to the conclusion that estimating playing strength from game statistics is possible, but not very useful.




