
Estimating playing strength

Have you ever felt like your chess rating doesn't represent your actual playing strength? Sometimes we want to be able to estimate playing strength based on individual games rather than rating (which changes more slowly).

During the past few months, I've been taking a number of online courses and learning Python for data analysis. In one of the courses, the final project allowed me to choose my own dataset. So surprise, surprise! I chose something chess related. (Not really surprised, are you?)

When we play games online, getting a computer evaluation is just a few clicks away. And a commonly used statistic is the average centipawn loss, or simply the average deviation from the computer's best move. Many of us tend to think that centipawn loss (CPL) is a good estimate of playing strength. And, of course, it gives some indication, but it's far from a perfect predictor.
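To make the definition concrete, here is a minimal sketch of how an average CPL could be computed for one player in one game. It assumes you already have, for every move, the engine evaluation of the best move and of the move actually played, both in centipawns from the mover's point of view. The function name and the input format are my own illustration, not something lichess exports in this form.

```python
def average_cpl(best_evals, played_evals):
    """Average centipawn loss over one player's moves.

    Both lists hold engine evaluations in centipawns from the mover's
    point of view (hypothetical input format).
    """
    if len(best_evals) != len(played_evals):
        raise ValueError("need one evaluation pair per move")
    if not best_evals:
        raise ValueError("need at least one move")
    # Clamp at 0: a played move should not score better than the engine's
    # best move, but small search inconsistencies can make it look that way.
    losses = [max(0, best - played)
              for best, played in zip(best_evals, played_evals)]
    return sum(losses) / len(losses)
```

For example, losses of 0, 30 and 90 centipawns over three moves give an average CPL of 40.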

Fellow chess/statistics blogger Patrick Coulombe has investigated the correlation between rating and CPL and concluded that the correlation is not very strong. I therefore concluded that other factors need to be taken into consideration when trying to estimate playing strength.

My initial plan was to download all the games played during April from the lichess database, but when I realized that the file was about 160GB, I changed my mind. I chose a smaller dataset, and built my analysis on about 5000 games from the lichess yearly classical arena, played two weeks ago. The advantage of choosing this as a dataset is that all the games have the same time control. Sure, using millions of games would have been fun, but the amount of data would just be too impractical for a normal laptop computer.

A simple plot of rating vs CPL produces a result similar to what Patrick found in his analysis. However, the large number of data points makes a normal scatterplot difficult to read, so I chose a different kind of plot.

In this plot, the shading indicates the "concentration" of data points. A darker color means more games. The plot has a blob-like shape, which suggests that the correlation between the variables is not very strong. But there is a clear orientation to the plot, and the green line indicates the main relationship between rating and CPL. I was of course tempted to use the slope of this line to try to predict playing strength, and at the end of this post, you can see how that turned out.
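As a rough sketch of what "using the line to predict playing strength" means, the snippet below fits a straight line to (CPL, rating) pairs and reads a rating off it. I am assuming an ordinary least-squares fit here, which may not be exactly how the green line was produced, and the numbers are made-up placeholders, not values from the arena games.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder numbers for illustration only, not data from the tournament.
cpl = [20, 40, 60, 80]
rating = [2000, 1800, 1600, 1400]

slope, intercept = fit_line(cpl, rating)
predicted = intercept + slope * 50  # rating read off the line at CPL 50
```

With real games the points scatter widely around the line, so a prediction from a single CPL value carries a large error.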

Another attempt at understanding the data is to add the opponent's rating to the analysis. In the diagram below, the players' ratings are given on each axis, and the average CPL is indicated by colors. Just a reminder, an average CPL of 300 means that a player, on average, blunders the equivalent of a piece on every move.

As the diagram shows, the red end of the color spectrum is concentrated around the lower rating levels, and the darker shades of blue are mostly found at the higher rating levels. However, there are many red spots scattered around the entire plot, which shows that even strong players can make horrible blunders.

Another statistic that could be a predictor is the blunder rate. In this case, I have defined a blunder as a move that gives a CPL of 150 (1.5 pawns) or more. I have counted the number of blunders and the number of moves for each player, and the blunder rate is simply the number of blunders divided by the number of moves.
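In code, the definition above could look like this. It is a minimal sketch that assumes the per-move centipawn losses for one player are already available as a list; the function name is hypothetical.

```python
BLUNDER_THRESHOLD = 150  # centipawns (1.5 pawns), the cut-off defined above


def blunder_rate(move_losses, threshold=BLUNDER_THRESHOLD):
    """Return (number of blunders, blunders per move) for one player's game."""
    if not move_losses:
        raise ValueError("need at least one move")
    # A move counts as a blunder when its loss reaches the threshold.
    nblunders = sum(1 for loss in move_losses if loss >= threshold)
    return nblunders, nblunders / len(move_losses)
```

A game with losses of 0, 10, 150, 300 and 40 centipawns contains two blunders in five moves, so the blunder rate is 0.4.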

As you can see from the plot, the scale goes up to 0.5, which means that every other move is a blunder. Here, we see a slightly different picture. Strong players are almost exclusively in the blue zone, which indicates blunder rates of 10-20%. Players below 1500 are mostly in the yellow and red parts.

This reminds me of a quote from Garry Kasparov:
"Masters blunder three times per game, amateurs blunder three times per move."

In the final part of my project, I did a multiple regression analysis to see how well the playing strength can be predicted with more variables. I won't go into details here, but the final formula is as follows:
Rating = 1655 - 0.20*CPL - 0.45*RatingDiff + 8.55*nmoves - 22*nblunders

RatingDiff is the difference in rating between the players, nmoves is the number of moves, and nblunders is the number of blunders. This means that 1655 is a baseline, and for each move that is played, your estimated strength increases by roughly 8.5 points, while each blunder lowers it by 22 points.
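If you want to try the formula on your own games, it translates directly into a small function. The coefficients are the ones given above; the input values in the example are hypothetical. Note that the formula does not say in which direction RatingDiff is measured, so the function simply takes the value as given.

```python
def estimate_rating(cpl, rating_diff, nmoves, nblunders):
    """Rating estimate for one game from the regression formula above.

    rating_diff is used as-is: its sign convention is not specified here.
    """
    return (1655
            - 0.20 * cpl
            - 0.45 * rating_diff
            + 8.55 * nmoves
            - 22 * nblunders)


# Hypothetical game: average CPL 50, equally rated opponent,
# 40 moves, 2 blunders.
example = estimate_rating(cpl=50, rating_diff=0, nmoves=40, nblunders=2)
```

For this made-up game the estimate comes out at about 1943 (1655 - 10 - 0 + 342 - 44).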

I tested this model on a number of my own games, and found that it is fairly good (from a statistical point of view).

The diagram below shows how the rating varies in my own games (observed), in the regression estimate, and in the prediction based on the green line in the first diagram (see above). The boxes indicate where the majority of games are located.

 

We can see that the regression estimate gives a somewhat higher result than my actual ratings, but approximately the same variation. However, the estimates that are based on CPL alone give quite extreme values, which suggests that they have very poor accuracy.

So the model has an acceptable accuracy, but there is a downside: The unexplained variation is so large that the estimate from one game has an uncertainty of +/- 400 rating points. This makes the estimate quite useless for individual games. A larger sample will improve the precision, but in order to reduce the uncertainty to +/- 50 rating points, you need about 40 games. From a statistical point of view, this is not problematic, but from a practical point of view, this would be rather pointless. Over 40 games, your rating would adjust properly, and you'll have a good estimate of playing strength right there.
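A rough way to see how averaging over more games helps is the usual rule of thumb that the uncertainty of an average shrinks with the square root of the number of games. The sketch below assumes the games are independent and uses the rounded +/- 400 figure as input; it is not necessarily the exact calculation behind the numbers above. With that rounded input it gives a somewhat higher count than the roughly 40 games quoted, but the conclusion is the same: you need dozens of games.

```python
import math


def uncertainty_after(n_games, single_game_uncertainty=400.0):
    """Uncertainty of the average estimate over n independent games."""
    if n_games < 1:
        raise ValueError("need at least one game")
    return single_game_uncertainty / math.sqrt(n_games)


def games_needed(target_uncertainty, single_game_uncertainty=400.0):
    """Smallest number of games that reaches the target uncertainty."""
    if target_uncertainty <= 0:
        raise ValueError("target uncertainty must be positive")
    return math.ceil((single_game_uncertainty / target_uncertainty) ** 2)
```

Under these assumptions, games_needed(50) returns 64, and uncertainty_after(40) is a little over 63 rating points.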

So to round off this long and complicated post, I have come to the conclusion that estimating playing strength from game statistics is possible, but not very useful.




