From: Larry Kaufman COMCAST NET>
Date: 26 oct 2002
Subject: Re: Elo system

----- Original Message -----
From: "Sam Sloan" ISHIPRESS COM>
To: TECHUNIX TECHNION AC IL>
Sent: Saturday, October 26, 2002 1:03 PM
Subject: Re: Elo system

> At 04:31 PM 10/26/2002 +0200, Albrecht Heeffer wrote:
> >Hello all,
> >
> >Chessbase.com features an article by statistician Jeff Sonas with
> >a proposal to improve the Elo rating system. He advocates a simpler linear
> >formula to calculate the expectancies and argues that this provides
> >a better prediction than the current Elo system. I find his article
> >quite convincing. The shogi world could be first to adopt the Sonas
> >system.
> >http://www.chessbase.com/newsdetail.asp?newsid=562
> >
> >Greetings,
> >
> >Albrecht Heeffer
>
> This is a very interesting article and well worth reading. However, it does
> not apply to shogi at all.
>

While I am not in full agreement with Sonas, my objections have nothing to
do with whether the game is chess or shogi. The only statistical difference
between the games is the (virtual) lack of draws in shogi; it is not quite
clear how this would impact the shape of the expectancy curve.

> The author correctly notes that there are differences between Elo's formula
> and reality. For example, according to the Elo Formula, if two players are
> 200 points apart, the higher-rated player will score 75%.
>
> However, it has been discovered that in reality the higher-rated player
> will only score 73%. This has been known for a decade and the USCF rating
> system was long ago adjusted accordingly.
>

Not so. The USCF table actually uses 76% for a 200 point difference. This is
based on theory, not on actual results. The actual result quoted (73%) is
influenced by the fact that lower-rated players are more likely to be
improving and therefore underrated; also by a statistical phenomenon that
Sonas refers to which is too abstract for explanation here.
But trying to match the observed numbers by modifying the table is not the
right answer, in my opinion. It might just lead to a never-ending spiral of
spreading ratings.

> I also agree that the K-Factor does not matter much. Regardless of whether
> K is set at 5 or 50, the ratings will still come out about the same in the
> long run.
>

This should be true in theory, though for statistical reasons that we are
just beginning to explore it seems that higher K factors do produce slightly
wider rating spreads in practice. For example, USCF ratings are spread
further apart than FIDE ratings (USCF uses larger K factors), though the
reasons for this are not completely clear.

> The reason this does not apply to shogi is that differences in strength are
> sharper. When I was playing as a 2-dan player regularly, whenever I was
> paired against a 1-dan player I won 100% of the time. I think that this
> experience is fairly typical. That would translate to about 400 points in
> the Elo System.
>

The Internet rating systems assume a 75% (200 point) difference for a Dan
rank, which is typical in Japanese clubs. Our Pan-Atlantic system uses a
somewhat smaller range for a rank (an average of 150 points between 1 Dan
and 5 Dan), which was a compromise between European and American practices
(we in the U.S. had followed the Japanese 200 point practice). As a
consequence, while the 4 Dan rank seems to be about the same in the West as
in Japan, the lower Dan ranks are harder to earn in the West than in Japan.
Many Western kyu players have been surprised to be told that they are Dan
level when in Japan.

> Amateur Shogi Players range from a bottom of about 7-kyu to a top of about
> 7-dan, although some players are higher than that top or lower than that
> bottom. That is 14 ranks and only counts amateurs, not professionals.
>
> In chess, the lowest rating among adults (not counting scholastic players)
> is about 1000 and the highest rated player in the world is Garry Kasparov,
> who is rated about 2800. In chess, one grade is 200 points. Class B is
> 1600, Class A is 1800 and Expert is 2000. So there are nine grades in chess
> from the weakest player in the world at 1000 to the strongest player in the
> world at 2800.
>
> Finally, in chess it is possible to do a statistical study of 100,000
> games. One World Open Chess Championship in Philadelphia, held every July
> 4, will produce results for more than 5000 rated games. In shogi, very few
> results are recorded. Larry Kaufman is working on shogi ratings but I doubt
> that he has the results of more than a few hundred games in his database.
>

It's true that our base of game results is not so large (though more than a
few hundred). However, this only really matters for rating handicap games.
If we are talking about even games, we simply define a 200 point interval as
giving a 76% result (using the same formula as the USCF ratings). There is
no great need to see if this is true in practice or not; the ratings are
calculated in a way whereby this must be at least approximately true (except
for the factors mentioned above). It's possible that the shape of the curve
differs a bit from the one we used (the Logistic curve), but I'm confident
that any deviation from this shape, for either chess or shogi, is not large
enough to affect ratings significantly. We did need the data in order to
determine the handicap values, and while of course more data would be nice,
I am confident that we are at least reasonably close to the proper values.

> Sam Sloan

Larry Kaufman
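
For anyone who wants to check the percentages quoted in this thread, here is a rough Python sketch of a logistic expectancy curve and a K-factor update. It assumes the common base-10 form with a 400-point scale, which does give about 76% at a 200 point difference; the function names, the 400-point scale parameter, and the sample K value are illustrative assumptions, not taken from USCF, FIDE, or Pan-Atlantic code.

```python
def expected_score(diff, scale=400.0):
    """Expected score for a player rated `diff` points above the opponent.

    Assumes a base-10 logistic curve; `scale` is an assumed parameter.
    """
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


def updated_rating(rating, expected, actual, k):
    """One-game rating update: move the rating by K times the surprise."""
    return rating + k * (actual - expected)


if __name__ == "__main__":
    # Rating gaps mentioned in the thread: 150, 200 and 400 points.
    for diff in (150, 200, 400):
        print(diff, round(expected_score(diff), 3))
    # prints:
    # 150 0.703
    # 200 0.76
    # 400 0.909

    # A hypothetical player rated 1500 beats an opponent rated 1300, K = 32.
    print(round(updated_rating(1500, expected_score(200), 1.0, 32), 2))
```

Under this curve a 150 point gap corresponds to roughly a 70% score and a 400 point gap to roughly 91%. A larger K only changes how far a single result moves the rating, not the expectancy itself.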