
Estimating playing strength

Have you ever felt like your chess rating doesn't represent your actual playing strength? Sometimes we want to be able to estimate playing strength based on individual games rather than rating (which changes more slowly).

During the past few months, I've been taking a number of online courses and learning Python for data analysis. In one of the courses, the final project allowed me to choose my own dataset. So surprise surprise! I chose something chess related. (Not really surprised, are you?)

When we play games online, getting a computer evaluation is just a few clicks away. And a commonly used statistic is the average centipawn loss, or simply the average deviation from the computer's best move. Many of us tend to think that centipawn loss (CPL) is a good estimate of playing strength. And, of course, it gives some indication, but it's far from a perfect predictor.

Fellow chess/statistics blogger Patrick Coulombe has investigated the correlation between rating and CPL and concluded that the correlation is not very strong. I therefore concluded that other factors need to be taken into consideration when trying to estimate playing strength.

My initial plan was to download all the games played during April from the lichess database, but when I realized that the file was about 160GB, I changed my mind. I chose a smaller dataset, and built my analysis on about 5000 games from the lichess yearly classical arena, played two weeks ago. The advantage of choosing this as a dataset is that all the games have the same time control. Sure, using millions of games would have been fun, but the amount of data would just be too impractical for a normal laptop computer.

A simple plot of rating vs CPL produces a result similar to what Patrick found in his analysis. However, the large number of data points makes a normal scatterplot difficult to read, so I chose a different kind of plot.

In this plot, the shading indicates the "concentration" of data points. A darker color means more games. The plot has a blob-like shape, which suggests that the correlation between the variables is not very strong. But there is a clear orientation to the plot, and the green line indicates the main relationship between rating and CPL. I was of course tempted to use the slope of this line to try to predict playing strength, and at the end of this post, you can see how that turned out.
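For readers who want to try the "green line" idea themselves, here is a minimal sketch. It assumes the line is an ordinary least-squares fit of rating on CPL, which the plot does not confirm, and the numbers in it are made up for illustration rather than taken from the arena games. The function names `fit_line` and `pearson_r` are my own.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


# Made-up values for illustration only (NOT from the lichess dataset).
cpl = [95, 60, 80, 45, 70, 30]
rating = [1250, 1400, 1600, 1750, 1900, 2100]

slope, intercept = fit_line(cpl, rating)
print(f"correlation r = {pearson_r(cpl, rating):.2f}")
print(f"rating is roughly {intercept:.0f} + ({slope:.1f}) * CPL")
```

A value of r far from -1 or 1 corresponds to the blob-like shape: the line captures the trend, but individual games scatter widely around it.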

Another attempt at understanding the data is to add the opponent's rating to the analysis. In the diagram below, the players' ratings are given on each axis, and the average CPL is indicated by colors. Just a reminder, an average CPL of 300 means that a player, on average, blunders the equivalent of a piece on every move.

As the diagram shows, the red end of the color spectrum is concentrated around the lower rating levels, and the darker shades of blue are mostly found at the higher rating levels. However, there are many red spots scattered around the entire plot, which shows that even strong players can make horrible blunders.

Another statistic that could be a predictor is the blunder rate. In this case, I have defined a blunder as a move with a centipawn loss of 150 (1.5 pawns) or more. I have counted the number of blunders and the number of moves, and the blunder rate is simply the number of blunders divided by the number of moves.
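The two statistics can be sketched in a few lines. This assumes you already have one player's per-move centipawn losses as a list of numbers; how you extract them from the engine evaluations is up to you, and the function name `game_stats` is just for illustration.

```python
BLUNDER_THRESHOLD = 150  # centipawns (1.5 pawns), the definition used in this post


def game_stats(move_losses, threshold=BLUNDER_THRESHOLD):
    """Return (average CPL, number of blunders, blunder rate) for one player.

    move_losses: centipawn loss of each move the player made in the game.
    """
    if not move_losses:
        raise ValueError("need at least one move")
    n_moves = len(move_losses)
    # A move counts as a blunder when its loss reaches the threshold.
    n_blunders = sum(1 for loss in move_losses if loss >= threshold)
    avg_cpl = sum(move_losses) / n_moves
    return avg_cpl, n_blunders, n_blunders / n_moves


# Hypothetical five-move fragment: two of the five moves are blunders.
avg_cpl, n_blunders, blunder_rate = game_stats([0, 20, 150, 300, 30])
print(avg_cpl, n_blunders, blunder_rate)
```

A blunder rate of 0.5 from this function would mean that every other move loses 1.5 pawns or more.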

As you can see from the plot, the scale goes up to 0.5, which means that every other move is a blunder. Here, we see a slightly different picture. Strong players are almost exclusively in the blue zone, which indicates blunder rates of 10-20%. Players below 1500 are mostly in the yellow and red parts.

This reminds me of a quote from Garry Kasparov:
Masters blunder three times per game, 
amateurs blunder three times per move
In the final part of my project, I did a multiple regression analysis to see how well the playing strength can be predicted with more variables. I won't go into details here, but the final formula is as follows:
Rating = 1655 - 0.20*CPL - 0.45*RatingDiff + 8.55*nmoves - 22*nblunders

RatingDiff is the difference in rating between the players, nmoves is the number of moves, and nblunders is the number of blunders. This means that 1655 is a baseline, and for each move that is played, your estimated strength increases by roughly 8.5 points, and for each blunder it drops by 22 points.
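Written as code, the formula is a one-liner. The coefficients are the ones given above; the function and parameter names are my own, and the sign convention of the rating difference (who is subtracted from whom) has to match whatever was used when the model was fitted, so treat that argument as an assumption.

```python
def estimate_rating(cpl, rating_diff, n_moves, n_blunders):
    """Estimate playing strength from one game using the regression formula.

    cpl:         average centipawn loss in the game
    rating_diff: rating difference between the players (sign convention
                 is an assumption; it must match the fitted model)
    n_moves:     number of moves played
    n_blunders:  number of moves with a centipawn loss of 150 or more
    """
    return (1655
            - 0.20 * cpl
            - 0.45 * rating_diff
            + 8.55 * n_moves
            - 22 * n_blunders)


# Hypothetical game: average CPL 50, equally rated opponent, 40 moves, 2 blunders.
print(round(estimate_rating(cpl=50, rating_diff=0, n_moves=40, n_blunders=2)))
```

For this made-up game the terms are 1655 - 10 - 0 + 342 - 44, which gives an estimate of 1943.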

I tested this model on a number of my own games, and found that it is fairly good (from a statistical point of view).

The diagram below shows how the rating varies in my own games (observed), in the regression estimate and in the prediction based on the green line in the first diagram (see above). The boxes indicate where the majority of games are located.

 

We can see that the regression estimate gives a somewhat higher result than my actual ratings, but approximately the same variation. However, the estimates that are based on CPL alone give quite extreme values, which suggests that they have very poor accuracy.

So the model has an acceptable accuracy, but there is a downside: The unexplained variation is so large that the estimate from one game has an uncertainty of +/- 400 rating points. This makes the estimate quite useless for individual games. A larger sample will improve the precision, but in order to reduce the uncertainty to +/- 50 rating points, you need about 40 games. From a statistical point of view, this is not problematic, but from a practical point of view, this would be rather pointless. Over 40 games, your rating would adjust properly, and you'll have a good estimate of playing strength right there.
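The effect of sample size can be sketched with the textbook rule that the uncertainty of an average shrinks with the square root of the number of games. This is a simplifying assumption (independent games, equal uncertainty per game), not necessarily how the figures above were calculated, and the function names are only for illustration.

```python
import math


def half_width_after(n_games, single_game_half_width=400.0):
    """Uncertainty (+/- rating points) of the average estimate over n_games.

    Assumes independent games and the 1/sqrt(n) rule for averages.
    """
    return single_game_half_width / math.sqrt(n_games)


def games_needed(target_half_width, single_game_half_width=400.0):
    """Smallest number of games that brings the uncertainty down to the target."""
    return math.ceil((single_game_half_width / target_half_width) ** 2)


print(round(half_width_after(40)))  # uncertainty after 40 games
print(games_needed(50))             # games needed for +/- 50 points
```

Under this simple rule, 40 games gives roughly +/- 63 points and about 64 games are needed to reach +/- 50. That is the same order of magnitude as the figure above; the exact number depends on how the interval is computed. Either way, the conclusion is the same: by then your rating has had time to adjust.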

So to round off this long and complicated post, I have come to the conclusion that estimating playing strength from game statistics is possible, but not very useful.




