Quantification of luck Einstein forum
8 replies. Last post: 2020-02-19
Force majeure at 2018-05-06
Have you ever estimated, analytically or by your own intuition, the impact of luck in EWN? I imagine that, especially with a big game database and/or a good AI program, it could be feasible.
Florian Jamain at 2018-05-06
jytou at 2018-05-07
I have been doing a lot of tests to make my bot recently. From my own experience:
- two players that are very different in strength (a very good one and a very bad one) can be distinguished reliably in fewer than 100 games; sometimes as few as 20 games are enough to be really sure which is the better player,
- players that still differ in strength, but not as much, may need more games to separate them, as a “series of bad luck” can heavily shift the balance toward the weaker player. In my tests, I assume that 1000 games is a good discriminating number for those players,
- players that are even more similar in strength (although not exactly equal) may need even more games (it can go up to 10k games) before reaching a stable and reliable result that no longer changes, no matter how many further games you play.
So yes, luck plays a very big role, especially when the players are of rather similar strengths, which is not a big surprise for a dice game. ;)
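As a rough cross-check of the game counts above, here is a minimal sketch that estimates how many games are needed before a true win rate is distinguishable from 50%. It uses the normal approximation to the binomial distribution; the function name `games_needed`, the 1.96 z-score (about 95% confidence) and the example win rates are assumptions for illustration, not values taken from the tests described in this post.

```python
import math


def games_needed(p_win: float, z: float = 1.96) -> int:
    """Approximate number of games after which a true win rate of p_win
    differs from 50% by z standard errors (normal approximation).

    Assumption: games are independent and p_win is constant.
    """
    if not 0.5 < p_win < 1.0:
        raise ValueError("p_win must be strictly between 0.5 and 1")
    edge = p_win - 0.5                 # distance from a coin flip
    variance = p_win * (1.0 - p_win)   # variance of a single game result
    return math.ceil(z * z * variance / (edge * edge))


# Hypothetical win rates for a large, medium and small strength gap.
for p in (0.70, 0.55, 0.51):
    print(f"true win rate {p:.0%}: about {games_needed(p)} games")
```

Under these assumptions, a 70% win rate needs on the order of 20 games, 55% needs a few hundred, and 51% needs close to ten thousand, which is in line with the orders of magnitude reported above.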
Chicagos at 2018-11-29
Yes, my bot wins “only” 92% of games against a random bot, so luck is a huge factor, even compared to other games of luck like backgammon.
Force majeure at 2020-01-28
Out of curiosity, I’ve compared EwN to 3 games with a similar number of players on the ranking list and taking part in championships. Chess, Hex, and TwiXt are almost without a luck factor (I assume that the discussion concerning the first move may be discarded for the purpose of this analysis).
I assume that games of similar popularity imply a similar distribution of players' skills, so I collected the ratings of the 1st, 10th, 20th, 50th, 100th and 200th player on the rating list for each game. Then I calculated the implied winning probabilities in matches between 1st vs 10th, 10th vs 20th, etc. For Chess, Hex and TwiXt the values are similar, so I’ve taken their mean as “AverageSkillGame”.
In the last step I assumed that the winning percentage in EwN may be simplified as:
WP(EwN) = LuckFactor * 50% + (1 - LuckFactor) * WP(AverageSkillGame)
Solving this equation shows that, depending on the players' positions on the rating list, LuckFactor is around 45-70%.
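Rearranging the equation for LuckFactor gives LuckFactor = (WP(AverageSkillGame) - WP(EwN)) / (WP(AverageSkillGame) - 50%). A minimal sketch of that rearrangement follows; the function name `luck_factor` and the example probabilities are made up for illustration and are not the ratings-based values used in the analysis.

```python
def luck_factor(wp_ewn: float, wp_skill: float) -> float:
    """Solve WP(EwN) = L * 0.5 + (1 - L) * WP(skill) for L.

    wp_ewn   -- winning probability of the stronger player in EwN
    wp_skill -- winning probability for the same rank gap in a pure-skill game
    """
    if wp_skill == 0.5:
        # With no skill edge at all, the equation holds for any L.
        raise ValueError("luck factor is undefined when wp_skill is 50%")
    return (wp_skill - wp_ewn) / (wp_skill - 0.5)


# Hypothetical inputs: 80% in the skill game, 62% in EwN.
print(f"LuckFactor = {luck_factor(0.62, 0.80):.0%}")
```

A result of 0 means EwN behaves like the pure-skill game for that pair of players, and 1 means the outcome is a coin flip.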
Of course I’m well aware of the limitations of the above 10-minute analysis; however, I think it may give some approximation of luck’s impact on this funny but at the same time interesting game :D
Should you have any questions or ideas, just let me know.
Force majeure at 2020-01-28
I have also taken the einstein.ch.26.1.1 results (because of its vast number of participants) and calculated the correlation between rating as of the tournament start date and collected points. The correlation amounts to 26%.
Simulating the results of this league with a Monte Carlo method, using rating-implied probabilities for each individual game, gives an average correlation of around 60-70%.
To obtain a correlation of around 26%, the Luck Factor would have to be around 65-70%, which is close to the upper bound of my previous estimate. For top-tier players, the estimate based on the method described in the post above was approximately 45%.
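A minimal sketch of this kind of Monte Carlo experiment is shown below. Several parts are assumptions rather than the setup actually used: a single round-robin league, evenly spaced made-up ratings, and the standard Elo expected-score formula standing in for LittleGolem’s rating formula. It only illustrates how the rating/points correlation drops as the luck factor grows; it does not reproduce the figures quoted above.

```python
import random


def elo_expected(r_a: float, r_b: float) -> float:
    """Standard Elo expected score for A (assumption: not LittleGolem's formula)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def blended(p_skill: float, luck: float) -> float:
    """WP = luck * 50% + (1 - luck) * skill-only winning probability."""
    return luck * 0.5 + (1.0 - luck) * p_skill


def pearson(xs, ys) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def simulate_league(ratings, luck, rng) -> float:
    """Play one round robin and return corr(rating, points)."""
    points = [0] * len(ratings)
    for i in range(len(ratings)):
        for j in range(i + 1, len(ratings)):
            p_i = blended(elo_expected(ratings[i], ratings[j]), luck)
            if rng.random() < p_i:
                points[i] += 1
            else:
                points[j] += 1
    return pearson(ratings, points)


def mean_correlation(ratings, luck, n_sims=200, seed=0) -> float:
    """Average correlation over n_sims simulated leagues."""
    rng = random.Random(seed)
    return sum(simulate_league(ratings, luck, rng) for _ in range(n_sims)) / n_sims


# Hypothetical league: 30 players rated 1300, 1320, ..., 1880.
ratings = [1300 + 20 * k for k in range(30)]
for luck in (0.0, 0.5, 0.7, 0.9):
    print(f"luck factor {luck:.0%}: mean correlation {mean_correlation(ratings, luck):.2f}")
```

To calibrate against a real league, one would replace the made-up ratings and pairing scheme with the actual ones and search for the luck factor whose mean correlation matches the observed value.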
Chicagos at 2020-02-18
Force majeure, your analysis is very interesting. How exactly did you estimate the winning probabilities between players? Did you take the ELO numbers of the two players to calculate the winning probability, or real games (or a combination)? I am very interested in the method you invented :)
Force majeure at 2020-02-19
Chicagos, it’s very simple. To obtain the rating-implied winning probability I take the players' ratings and LittleGolem’s rating formula. Instead of assuming a result of 0 or 1 to calculate the rating change, I assume no rating change and solve the equation for the theoretical winning probability in <0;1>. For example, if a player stronger by 100 ELO wins the game, he receives 11.5 ELO; otherwise he loses 20.5. If in the long run he wins 64% of the games, his rating will remain at the same level (0.64 * 11.5 is roughly equal to 0.36 * 20.5), therefore 0.64 is his rating-implied winning probability.
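The zero-rating-change condition can be written as p * gain - (1 - p) * loss = 0, which gives p = loss / (gain + loss). A minimal sketch using the numbers from the example above; the function name `implied_win_probability` is just an illustrative label, and the gain and loss values would in practice come from LittleGolem’s rating formula, which is not reproduced here.

```python
def implied_win_probability(gain_if_win: float, loss_if_lose: float) -> float:
    """Winning probability p at which the expected rating change is zero:
    p * gain_if_win - (1 - p) * loss_if_lose = 0.
    """
    if gain_if_win <= 0 or loss_if_lose <= 0:
        raise ValueError("gain and loss must both be positive")
    return loss_if_lose / (gain_if_win + loss_if_lose)


# Values from the example: +11.5 ELO for a win, -20.5 ELO for a loss.
p = implied_win_probability(11.5, 20.5)
print(f"rating-implied winning probability: {p:.1%}")  # about 64%
```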