4 in a row recent championship statistics 4ir forum
31 replies. Last post: 20090201
wccanard at 20090129
I downloaded all the games from the top division of the last ten 4ir championships (i.e. ch11 to ch20). Here is a brief statistical analysis of the results.
Some games finished by timeout. These are not ideal because perhaps the player who timed out left the site; if this is the case then the result of the game might not accurately reflect who was winning. In fact 17 games finished by timeout in ch19, whereas none at all did in ch16. I decided to throw away the games which finished by timeout (many of them only had a few moves played; in fact many had at most 1 move played).
OK, so after throwing these games away, I had 335 games left.
Of these 335, the shortest games were 11 moves long (both finishing with red resigning). I looked at these games and both seemed “genuine” (i.e. red was in a mess when they resigned, rather than leaving the site and resigning all their games). I am still a bit worried about the data being contaminated by people leaving the site and resigning all their games first though, regardless of whether they were winning or not [in these cases it might be the case that “the wrong player won”]
But let’s press on. Of these 335 games, 124 ended with four-in-a-row, 183 ended with someone resigning, 13 ended up as a draw with the entire board filled, and 15 ended up with a draw agreed. In particular, 28 were draws. Of the remaining 307, 219 were wins for P2 and 88 were wins for P1.
209 of the 335 games started with one of the moves 1,2,3,4, and the other 126 started with one of 5,6,7,8. Because of the symmetry of the game, I decided to “renormalise” the 126 games that started with 5,6,7,8 by flipping them around so that they also started 1,2,3,4.
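The flip described above can be sketched in a few lines. This is only an illustration: it assumes moves are stored as column numbers 1 to 8 (the notation used on the site), and the names `mirror` and `normalise` are hypothetical, not taken from the actual script.

```python
def mirror(moves, width=8):
    """Reflect a game left-to-right: column c becomes width + 1 - c."""
    return [width + 1 - c for c in moves]


def normalise(moves, width=8):
    """Flip the game, if needed, so that its first move lies in columns 1..width/2."""
    if moves and moves[0] > width // 2:
        return mirror(moves, width)
    return list(moves)


print(normalise([5, 4, 5, 5]))  # [4, 5, 4, 4]
print(normalise([3, 4, 4]))     # [3, 4, 4]
```

Because the board is symmetric, a flipped game is the same game for statistical purposes, so nothing is lost by counting it together with its mirror image.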
After this renormalisation, every game now of course started with one of the moves 1,2,3,4. The stats are:
3 games started with 1
32 started with 2
58 with 3
242 with 4
So for some reason, the experts seem to think that it’s better to go near the middle on move 1. Let me write this as
[3,32,58,242].
How many times did player 1 win, after each of these opening moves?
[1,6,22,59]
Player 2’s stats for winning are
[2, 26, 35, 156]
and the draw stats are
[0, 0, 1, 27]
Finally, 140 of the 335 games started with the opening 4,4,4,4,4,4.
Any questions? [i.e. “I can give you some more figures if you ask.”] 
Ray Garrison ★ at 20090129
interesting study. of the 140 games that started 444444, what were the results ?

wccanard at 20090129
I’m going to call it d1 d2 d3 d4 d5 d6, that opening, because I just appended some code which enabled the moves to be input and output in “chessboard notation”, which is something I find far more intuitive [it’s ‘inbuilt’! Note that it’s not the system here!]
OK so after d1 d2 d3 d4 d5 d6, the stats are: 140 games in database, P1 wins 29 times, P2 wins 97 times, and 14 ties. 
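For readers who want to translate between the two notations, here is a minimal sketch of the conversion from column numbers to "chessboard notation". The function name `to_chess_notation` is made up for this example; the only assumption is an 8x8 board with columns numbered 1 to 8 from the left.

```python
import string


def to_chess_notation(moves, width=8, height=8):
    """Turn 1-based column numbers into squares such as 'd1', 'd2', ..."""
    heights = [0] * width  # number of discs already in each column
    squares = []
    for col in moves:
        if not 1 <= col <= width or heights[col - 1] >= height:
            raise ValueError(f"illegal move in column {col}")
        heights[col - 1] += 1
        squares.append(string.ascii_lowercase[col - 1] + str(heights[col - 1]))
    return squares


print(to_chess_notation([4, 4, 4, 4, 4, 4]))
# ['d1', 'd2', 'd3', 'd4', 'd5', 'd6']
```

A square name records both the column and the row the disc landed on, which is what makes the notation convenient for comparing positions later.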
Ray Garrison ★ at 20090129
great, thanks. As I already knew, the odds favour P2 in almost all variations. From your analysis, the best percentage move for P1 is c1 (38%) rather than d1 (30%), although this info is probably skewed because there are far more practical examples of d1 than c1 for the first move. I suspect c1 has performed better because it is more “offbeat” and can catch P2 unprepared. So my question is, analyzing the 58 games starting with c1, what were P2's responses and results?
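The 38% and 30% figures can be roughly reproduced from the vectors posted earlier if a draw is counted as half a win for P1. That scoring rule is an assumption about how the percentages were obtained, not something stated in the post.

```python
# Figures from the first post, after renormalisation (first moves a1, b1, c1, d1).
first_moves = ["a1", "b1", "c1", "d1"]
games = [3, 32, 58, 242]
p1_wins = [1, 6, 22, 59]
draws = [0, 0, 1, 27]


def p1_score(wins, ties, total):
    """P1's share of the points, counting a draw as half a win (assumed rule)."""
    return (wins + 0.5 * ties) / total


scores = {
    move: p1_score(w, t, n)
    for move, w, t, n in zip(first_moves, p1_wins, draws, games)
}
for move, score in scores.items():
    print(f"{move}: {score:.1%}")
```

Under this rule c1 comes out at about 38.8% and d1 at about 30.0%, close to the figures quoted.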

wccanard at 20090129
Let me give a game a value of 1 if player 2 wins, a value of 0.5 if it’s a draw, and a value of 0 if player 1 wins. After 1.c1 there were only three responses played in the database:
1...c2 was played 12 times, with a score of 0.0833333
1...d1 was played 44 times, with a score of 0.7840909090
1...g1 was played twice, with a score of 0.
[I could give you finer information but I’m being lazy; for example you can only see from this that P2 scored 1/12 after 1...c2, but you can’t see whether it was a win or two ties.]
wccanard at 20090129
So in fact we can conclude from this that probably, to the experts, 1.c1 says "I’m hoping you’ll blunder with 1...c2 rather than play the killer 1...d1, after which I’m in an even worse position than after 1.d1 d2"

my_immortal at 20090129
Very interesting data, which supports the fact which also Ray mentions, that 4iar is a second player win on this type of board. (8x8)
So in all fairness, championships should be decided by each player playing 2 games against each opponent, one of each color. This has maybe been debated before, but if not, I would like to propose, that all 4iar championships should be carried out in that manner. 
wccanard at 20090129
My feeling about discussing this here [playing 2 games each] is:
(a) it’s very offtopic. Pretty much every good player knows that P2 has a big advantage in 8x8 connect 4; the data above isn’t really telling anything new about this.
(b) the issue is not restricted to just connect 4 [dots and boxes, and chess, both have advantages for P1, and hex with a swap rule is provably a P2 win!]
(c) It will never happen on this site anyway, because the problems with the suggestion outweigh the benefits. It would be such a hassle changing things, it would make championships even longer and even more spaced out, and the webmaster is typically quite inert.
So perhaps the main forum is a good place to talk about this. 
Ray Garrison ★ at 20090129
I like immortal’s idea of a double round robin championship, but wccanard is right on all three accounts. So, getting back to the statistics, I would love to know more, for example:
A) what are p1 responses in your database after 1.c1 d1, and what are the “scores” (0-1 as defined above).
B) what is the best percentage move for p1 after the opening sequence 1.d1 d2?
C) what is the best percentage move for p1 after the opening sequence d1 d2 d3 d4 d5 d6?
D) would it be too difficult for you to post a data table that summarizes all of the opening data of the 335 games? (say, the first ten moves of each game and the results) I guess if you could do this, then I could answer A, B, and C just by looking at this proverbial data table!
wccanard, thanks for doing this statistical work...you don’t have to answer any of my questions actually, but I get the feeling that you enjoy it! 
wccanard at 20090130
I do enjoy it but let me also stress that it’s now very easy to answer your questions. The time investment was just taking my dots and boxes python database and changing the classes so that they were connect 4 games instead (which in practice meant teaching python the rules of connect 4). I spent about 2 hours on it last week and 4 hours on it yesterday, plus another hour writing some code which parsed data here. But now this is all done, I can answer each of A,B,C above by typing 2 lines of python code: one to extract the relevant games from the database and the other to do the statistical analysis (which is a function I’d already written).
OK so
A) There are 44 games starting 1.c1 d1, and the stats after that are
['c2', 18, 0.75]
['d2', 25, 0.8]
['f1', 1, 1.0]
All probabilities are for P2. So 2.f1 was played once and it lost, and c2,d2 were played 18,25 times respectively, with, on the face of it, c2 faring a bit better for P1.
B) That turns out to be a good question. There are 242 games starting 1.d1 d2, and the stats for P1 move 2 are
['c1', 33, 0.60606060606060608]
['d3', 164, 0.72865853658536583]
['e1', 39, 0.64102564102564108]
['f1', 3, 0.66666666666666663]
['g1', 2, 1.0]
['h1', 1, 1.0]
so there are 6 responses in the championships, but g1,h1 are both rare and didn’t work ever. It’s perhaps of interest to note that of the four remaining ones, the one that’s by far the most popular (2.d3) is also the one that’s the least successful!
C) After 1.d1 d2 2.d3 d4 3.d5 d6 we have
['a1', 5, 1.0]
['b1', 3, 0.66666666666666663]
['c1', 17, 0.79411764705882348]
['d7', 21, 0.8571428571428571]
['e1', 39, 0.64102564102564108]
['f1', 41, 0.75609756097560976]
['g1', 1, 1.0]
['h1', 13, 0.65384615384615385]
so the answer to your question is “e1”; again not quite the most popular move, but nearly.
wccanard at 20090130
Now as for (D), my problem is that I can’t summarise data that “I don't understand”. I don’t know what the experts want from the data because I am the exact opposite of an expert; I only started learning this game in 2009 (it was my new year’s resolution!). But I would be happy to give you the 5 meg pickled python variable containing all 335 games, and the 7k script that contains some useful functions. Then you could have answered (C) yourself by typing (H is the database):
>>> G=[h for h in H if h.moves[:6]==[4,4,4,4,4,4]]
>>> stats(G,7)
[['a1', 5, 1.0], ['b1', 3, 0.66666666666666663], ['c1', 17, 0.79411764705882348], ['d7', 21, 0.8571428571428571], ['e1', 39, 0.64102564102564108], ['f1', 41, 0.75609756097560976], ['g1', 1, 1.0], ['h1', 13, 0.65384615384615385]]
The problem is that, after 10 moves, the 335 games have evolved into 133 distinct positions. Actually, I guess that isn’t so many, is it. But still, if I posted them all here, what would you do? Analyse by hand? How would you want them ordered? Maybe this would work.
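The real `stats` function is not shown in the thread, so the following is only a guess at what such a function might look like. It assumes each game is a plain `(moves, score)` pair with moves in chessboard notation and the score taken from P2's point of view (1, 0.5 or 0); the actual database stores game objects with a `.moves` attribute instead. The toy games at the bottom are invented purely to exercise the code.

```python
from collections import defaultdict


def stats(games, n):
    """Group games by their n-th move (1-based); report count and mean P2 score."""
    buckets = defaultdict(list)
    for moves, score in games:
        if len(moves) >= n:  # skip games that ended before move n
            buckets[moves[n - 1]].append(score)
    return [
        [move, len(scores), sum(scores) / len(scores)]
        for move, scores in sorted(buckets.items())
    ]


# Invented toy data, not taken from the championship database.
toy_games = [
    (["c1", "d1", "c2"], 1.0),
    (["c1", "d1", "d2"], 1.0),
    (["c1", "d1", "d2"], 0.5),
    (["c1", "c2"], 0.0),
]
print(stats(toy_games, 3))  # [['c2', 1, 1.0], ['d2', 2, 0.75]]
```

Filtering first (as in the list comprehension over `H` above) and then calling one summary function keeps each question down to two lines.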

trincot at 20090130
What is the longest variant that has been played in more than one game?
Nice stats! 
wccanard at 20090130
Let me try and post a solution to D. And then, after that, an explanation.
a1 d1 d2 d3 d4 d5 d6 d7 e1 d8 1 0.0
a1 d1 d2 d3 d4 d5 e1 e2 e3 e4 1 1.0
a1 d1 f1 d2 f2 f3 f4 f5 f6 f7 1 1.0
b1 b2 b3 b4 e1 d1 e2 e3 e4 e5 1 1.0
b1 c1 b2 b3 b4 b5 f1 g1 f2 f3 1 1.0
b1 c1 c2 c3 c4 c5 a1 a2 c6 c7 1 1.0
b1 c1 c2 c3 c4 c5 a1 a2 h1 f1 1 1.0
b1 c1 c2 c3 c4 c5 c6 c7 e1 e2 1 1.0
b1 c1 c2 c3 c4 c5 c6 c7 f1 g1 2 1.0
b1 c1 c2 c3 c4 c5 f1 g1 e1 e2 1 1.0
b1 c1 c2 c3 c4 c5 f1 g1 g2 g3 1 1.0
b1 c1 c2 c3 c4 c5 g1 f1 e1 e2 1 1.0
b1 c1 e1 e2 e3 e4 e5 c2 b2 c3 1 1.0
b1 c1 f1 c2 c3 c4 b2 b3 b4 c5 1 1.0
b1 c1 f1 e1 c2 e2 e3 e4 c3 e5 1 0.0
b1 c1 f1 f2 b2 b3 b4 c2 c3 g1 1 1.0
b1 c1 f1 f2 c2 c3 c4 c5 e1 g1 1 0.0
b1 c1 f1 f2 c2 c3 c4 f3 f4 g1 2 1.0
b1 c1 f1 f2 c2 c3 c4 g1 c5 f3 1 0.0
b1 c1 f1 f2 c2 c3 c4 g1 g2 b2 1 1.0
b1 c1 f1 f2 c2 c3 c4 g1 g2 g3 1 1.0
b1 c1 f1 f2 c2 c3 e1 d1 c4 e2 1 1.0
b1 c1 f1 f2 c2 c3 e1 d1 e2 c4 1 1.0
b1 c1 f1 f2 c2 c3 e1 g1 e2 c4 1 0.0
b1 c1 f1 f2 c2 c3 e1 g1 g2 f3 2 1.0
b1 c1 f1 f2 f3 f4 c2 c3 c4 c5 2 0.5
b1 c1 f1 g1 c2 c3 c4 c5 c6 b2 1 1.0
b1 c1 f1 g1 c2 c3 c4 c5 c6 c7 2 1.0
b1 c1 f1 g1 c2 c3 c4 c5 d1 d2 1 1.0
b1 c1 g1 f1 f2 f3 f4 f5 f6 f7 1 1.0
b1 c1 g1 f1 g2 g3 f2 f3 d1 c2 1 1.0
b1 d1 d2 d3 d4 d5 b2 a1 b3 b4 1 0.0
c1 c2 c3 c4 c5 c6 d1 e1 e2 e3 3 0.0
c1 c2 c3 c4 d1 e1 e2 e3 e4 c5 2 0.0
c1 c2 c3 c4 d1 e1 e2 e3 e4 e5 6 0.0
c1 c2 d1 e1 c3 c4 a1 b1 b2 e2 1 1.0
c1 d1 c2 c3 c4 c5 d2 d3 d4 c6 1 1.0
c1 d1 c2 c3 c4 c5 d2 d3 d4 d5 4 1.0
c1 d1 c2 c3 c4 c5 g1 d2 d3 d4 1 1.0
c1 d1 c2 c3 d2 c4 d3 c5 c6 d4 1 1.0
c1 d1 c2 c3 d2 d3 d4 c4 f1 c5 2 0.0
c1 d1 c2 c3 d2 d3 d4 d5 d6 d7 4 1.0
c1 d1 c2 c3 d2 d3 d4 d5 d6 f1 1 0.0
c1 d1 c2 c3 d2 d3 d4 d5 f1 f2 4 0.625
c1 d1 d2 d3 d4 d5 b1 b2 d6 d7 3 0.666666666667
c1 d1 d2 d3 d4 d5 b1 b2 e1 d6 2 1.0
c1 d1 d2 d3 d4 d5 b1 b2 f1 f2 3 1.0
c1 d1 d2 d3 d4 d5 b1 b2 g1 b3 1 0.0
c1 d1 d2 d3 d4 d5 b1 f1 b2 a1 1 0.0
c1 d1 d2 d3 d4 d5 d6 d7 b1 b2 3 0.666666666667
c1 d1 d2 d3 d4 d5 d6 d7 b1 c2 1 0.0
c1 d1 d2 d3 d4 d5 d6 d7 d8 g1 1 1.0
c1 d1 d2 d3 d4 d5 d6 d7 f1 f2 5 1.0
c1 d1 d2 d3 d4 d5 d6 d7 f1 g1 1 0.0
c1 d1 d2 d3 d4 d5 d6 d7 g1 g2 2 1.0
c1 d1 d2 d3 d4 d5 d6 d7 g1 h1 1 1.0
c1 d1 d2 d3 d4 d5 d6 d7 h1 g1 1 1.0
c1 d1 d2 d3 d4 d5 f1 f2 b1 b2 3 1.0
c1 d1 d2 d3 d4 d5 f1 f2 d6 d7 5 1.0
c1 d1 d2 d3 d4 d5 h1 g1 f1 f2 1 1.0
c1 d1 d2 d3 d4 d5 h1 g1 h2 h3 1 1.0
c1 d1 d2 d3 g1 d4 g2 g3 g4 d5 1 1.0
c1 d1 f1 f2 c2 c3 c4 c5 d2 d3 1 1.0
c1 g1 g2 f1 e1 f2 f3 f4 g3 g4 1 0.0
c1 g1 g2 g3 g4 c2 d1 e1 e2 e3 1 0.0
d1 d2 c1 b1 b2 b3 b4 b5 b6 b7 2 0.75
d1 d2 c1 b1 b2 b3 b4 b5 d3 d4 3 0.333333333333
d1 d2 c1 b1 b2 b3 b4 d3 d4 a1 3 1.0
d1 d2 c1 b1 b2 b3 b4 d3 d4 c2 1 0.0
d1 d2 c1 b1 b2 d3 d4 c2 c3 a1 1 0.0
d1 d2 c1 b1 d3 d4 b2 b3 c2 c3 1 0.0
d1 d2 c1 b1 f1 e1 e2 d3 d4 e3 1 1.0
d1 d2 c1 b1 f1 e1 e2 e3 e4 d3 1 1.0
d1 d2 c1 e1 a1 b1 b2 b3 b4 b5 2 0.5
d1 d2 c1 e1 d3 d4 d5 d6 d7 e2 1 1.0
d1 d2 c1 e1 d3 d4 d5 e2 c2 c3 1 1.0
d1 d2 c1 e1 d3 d4 e2 e3 e4 c2 3 0.5
d1 d2 c1 e1 d3 e2 a1 b1 b2 e3 1 1.0
d1 d2 c1 e1 d3 e2 c2 c3 d4 d5 1 1.0
d1 d2 c1 e1 e2 d3 d4 a1 a2 a3 1 1.0
d1 d2 c1 e1 e2 d3 d4 a1 d5 d6 1 1.0
d1 d2 c1 e1 e2 d3 d4 c2 c3 a1 1 0.0
d1 d2 c1 e1 e2 d3 d4 e3 e4 a1 2 0.75
d1 d2 c1 e1 e2 d3 d4 e3 e4 d5 1 1.0
d1 d2 c1 e1 e2 d3 d4 e3 e4 e5 3 0.5
d1 d2 c1 e1 e2 e3 d3 e4 d4 e5 1 1.0
d1 d2 c1 e1 e2 e3 e4 d3 d4 a1 2 0.75
d1 d2 c1 e1 e2 e3 e4 d3 d4 e5 3 0.5
d1 d2 c1 e1 e2 e3 e4 e5 d3 c2 2 0.0
d1 d2 c1 e1 e2 e3 e4 e5 g1 g2 1 0.5
d1 d2 d3 d4 c1 b1 b2 b3 b4 d5 1 1.0
d1 d2 d3 d4 c1 b1 c2 c3 c4 b2 1 1.0
d1 d2 d3 d4 c1 b1 c2 c3 c4 f1 1 1.0
d1 d2 d3 d4 c1 b1 f1 e1 e2 e3 3 0.666666666667
d1 d2 d3 d4 c1 e1 e2 e3 e4 c2 3 0.5
d1 d2 d3 d4 d5 d6 a1 b1 a2 a3 2 1.0
d1 d2 d3 d4 d5 d6 a1 b1 b2 b3 2 1.0
d1 d2 d3 d4 d5 d6 a1 b1 c1 f1 1 1.0
d1 d2 d3 d4 d5 d6 b1 c1 c2 c3 2 1.0
d1 d2 d3 d4 d5 d6 b1 e1 a1 c1 1 0.0
d1 d2 d3 d4 d5 d6 c1 b1 b2 b3 7 1.0
d1 d2 d3 d4 d5 d6 c1 b1 c2 c3 3 0.5
d1 d2 d3 d4 d5 d6 c1 b1 f1 e1 5 0.6
d1 d2 d3 d4 d5 d6 c1 e1 c2 c3 1 1.0
d1 d2 d3 d4 d5 d6 c1 e1 e2 e3 1 1.0
d1 d2 d3 d4 d5 d6 d7 d8 e1 f1 1 1.0
d1 d2 d3 d4 d5 d6 d7 e1 b1 a1 10 0.9
d1 d2 d3 d4 d5 d6 d7 e1 b1 e2 1 1.0
d1 d2 d3 d4 d5 d6 d7 e1 e2 e3 9 0.777777777778
d1 d2 d3 d4 d5 d6 e1 f1 b1 c1 21 0.547619047619
d1 d2 d3 d4 d5 d6 e1 f1 f2 f3 18 0.75
d1 d2 d3 d4 d5 d6 f1 e1 e2 e3 39 0.794871794872
d1 d2 d3 d4 d5 d6 f1 g1 c1 e1 1 0.0
d1 d2 d3 d4 d5 d6 f1 g1 g2 g3 1 0.0
d1 d2 d3 d4 d5 d6 g1 f1 g2 g3 1 1.0
d1 d2 d3 d4 d5 d6 h1 e1 b1 a1 2 1.0
d1 d2 d3 d4 d5 d6 h1 e1 e2 e3 8 0.6875
d1 d2 d3 d4 d5 d6 h1 g1 g2 g3 3 0.333333333333
d1 d2 d3 d4 d5 e1 b1 e2 b2 b3 1 0.0
d1 d2 d3 d4 d5 f1 f2 f3 f4 h1 1 0.0
d1 d2 d3 d4 e1 f1 b1 c1 c2 c3 5 0.8
d1 d2 d3 d4 e1 f1 b1 c1 f2 f3 1 1.0
d1 d2 d3 d4 e1 f1 e2 e3 e4 f2 1 1.0
d1 d2 d3 d4 e1 f1 f2 f3 d5 f4 2 0.5
d1 d2 d3 d4 e1 f1 f2 f3 f4 d5 3 0.666666666667
d1 d2 d3 d4 e1 f1 f2 f3 f4 f5 8 0.1875
d1 d2 d3 d4 g1 f1 f2 f3 f4 d5 2 1.0
d1 d2 e1 c1 d3 c2 e2 e3 d4 d5 1 1.0
d1 d2 e1 f1 d3 d4 b1 c1 b2 b3 1 1.0
d1 d2 e1 f1 d3 d4 d5 f2 b1 c1 1 1.0
d1 d2 e1 f1 d3 d4 f2 f3 d5 f4 2 0.5
d1 d2 e1 f1 d3 d4 f2 f3 f4 d5 3 0.666666666667
d1 d2 e1 f1 f2 f3 f4 d3 d4 d5 1 1.0
d1 d2 e1 f1 f2 f3 f4 d3 d4 f5 1 0.0
d1 d2 e1 f1 f2 f3 f4 d3 d4 g1 22 0.681818181818
d1 d2 e1 f1 f2 f3 f4 f5 d3 d4 8 0.1875
d1 d2 e1 f1 f2 f3 f4 f5 f6 f7 1 1.0
d1 d2 e1 f1 f2 f3 f4 f5 h1 h2 3 0.5
d1 d2 f1 e1 d3 d4 f2 f3 f4 d5 2 0.5
d1 d2 f1 e1 f2 f3 d3 d4 f4 d5 2 0.5
d1 d2 f1 e1 f2 f3 f4 f5 f6 f7 1 1.0
d1 d2 g1 e1 e2 e3 e4 d3 d4 d5 1 1.0
d1 d2 g1 f1 f2 f3 f4 f5 h1 h2 1 1.0
d1 d2 h1 e1 e2 e3 e4 e5 e6 e7 1 1.0

wccanard at 20090130
OK, so what is the above? Well, there’s a line for every 10-move opening played in the database [after throwing out games that ended with timeout and reflecting any games that started in columns e,f,g,h]. The lines are sorted in some sort of “alphabetical order” in a way that I’m sure will be clear to you. Important note: two games that are in the same position but which got there in different ways [move transpositions] are regarded as different, for the purposes of this part of the exercise, because I realised that if I lumped them together at this point then you’d find it much harder to search the table by hand (you wouldn’t be able to find your favourite 10-move opening because it turned out that it was played once in a weird way which happened to be alphabetically before the normal way you play it).
OK, now, for each line, obviously the first 10 entries are the first 10 moves [using chess notation, not the daft notation here]. And now listen carefully: I now consider all games that were in this position after 10 moves, rather than only the games in which those 10 moves were played in that order. And the last two entries in each row are: how many games in the database were in that position after 10 moves, and the average score for P2.
Let me go through one example to indicate explicitly how I’m dealing with move transposition. There is one game in the database that started
d1 d2 e1 f1 d3 d4 f2 f3 f4 d5
and two that started
d1 d2 d3 d4 e1 f1 f2 f3 f4 d5
Note that the positions are the same after 10 moves. Those two “distinct” openings correspond to two lines in the table. But for each of the lines in the table, the stats for all three of the games in the database are shown.
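One way to implement this kind of transposition handling is sketched below. It is not the code used for the table: it assumes games are `(squares, score)` pairs in chessboard notation, and it relies on the fact that a square name already says where the disc landed, so a position is simply the set of P1's squares plus the set of P2's squares. The results attached to the three example games are invented (the post gives only their average, 0.667).

```python
from collections import defaultdict


def position_after(squares, n=10):
    """Position after n moves, independent of move order (P1 moves first)."""
    first_n = squares[:n]
    return (frozenset(first_n[0::2]), frozenset(first_n[1::2]))


def opening_table(games, n=10):
    """One row per distinct n-move opening; stats pooled over the whole position."""
    by_position = defaultdict(list)
    for squares, score in games:
        if len(squares) >= n:
            by_position[position_after(squares, n)].append(score)
    rows = {}
    for squares, _ in games:
        if len(squares) >= n:
            scores = by_position[position_after(squares, n)]
            rows[tuple(squares[:n])] = (len(scores), sum(scores) / len(scores))
    return sorted(rows.items())


a = "d1 d2 e1 f1 d3 d4 f2 f3 f4 d5".split()
b = "d1 d2 d3 d4 e1 f1 f2 f3 f4 d5".split()
example = [(a, 1.0), (b, 1.0), (b, 0.0)]  # results invented for illustration
for opening, (count, score) in opening_table(example):
    print(" ".join(opening), count, round(score, 3))
```

Both rows report 3 games, matching the two table lines for this position.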
wccanard at 20090130
@trincot: I have to parse your question carefully because I’ve just noticed that some games are proper subsets of other games ;)

wccanard at 20090130
Hah! There are two 64move games that finished with exactly the same position :)

wccanard at 20090130
Trincot: the answer to your question is 36. The games
http://www.littlegolem.net/jsp/game/game.jsp?gid=421813&nmove=36
and
http://www.littlegolem.net/jsp/game/game.jsp?gid=421834&nmove=36
are not only in the same position after 36 moves, but the moves were all made in the same order. 
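A brute-force way to answer this kind of question is to compare every pair of games and keep the longest common opening. The sketch below assumes each game is just a list of moves; the function names are hypothetical and the toy games are invented. With a few hundred games the quadratic number of pairs is not a problem.

```python
from itertools import combinations


def common_prefix_len(a, b):
    """Number of leading moves that two games share, in the same order."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_shared_line(games):
    """Longest move sequence that begins at least two of the games."""
    best = []
    for a, b in combinations(games, 2):
        n = common_prefix_len(a, b)
        if n > len(best):
            best = a[:n]
    return best


toy = [[4, 4, 4, 3], [4, 4, 4, 5, 1], [4, 4, 2]]  # invented example games
print(longest_shared_line(toy))  # [4, 4, 4]
```

Note that when one game is an exact continuation of another, the shared line is the whole of the shorter game, which is why such "subset" games need some care when reading the answer.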
wccanard at 20090130
And here are two games that finished in the same position:
http://www.littlegolem.net/jsp/game/game.jsp?gid=421813
[I just mentioned that one!] and
http://www.littlegolem.net/jsp/game/game.jsp?gid=555263 
wccanard at 20090130
This game
http://www.littlegolem.net/jsp/game/game.jsp?gid=736853
is a proper subset of this game
http://www.littlegolem.net/jsp/game/game.jsp?gid=736838
yper: I don’t know what you were referring to when you said “game numbers please”? Have I answered your question? 
MichaeI X at 20090130
Thank you, so far.
Do I understand “proper subsets” correctly:
the “subset” game was just resigned earlier than the “other” one ?
Which leads me immediately to the question:
— are there games resigned by the wrong player?
(I understand your database is built from the top division games of the last championships)
And if there are none, which I assume:
I admire your skills to identify same positions along different paths.
Let’s also call all the subset games identical to the longer ones.
— How many duplicate games are there among those 335 samples ? 
wccanard at 20090130
Sort of important question: do the experts here have any opinion as to the “quality” of the data points I’ve chosen? I went for the last 10 championships because I was slightly scared that the earlier championships could be being played between e.g. people who knew far less opening theory. But I don’t know anything about how far opening theory has moved on in the last few years or how many people know it and knew it. Should I have been using more data, or less? It would be nice to access jijbent data but this would involve (a) my doing some more coding and (b) a lot of faffing around trying to get data downloaded from the jijbent site [the wonderful thing about the way the data is stored here is that there are certain tournaments that only have good players in e.g. league 1 of a championship, and all of the data is easily available via one link. That’s not the case at jijbent.]. I’m not sure I want to embark on that at all.

wccanard at 20090130
Michael X: the subset game: yes. It just made trincot’s question a bit harder to answer in some sense.
“are there games resigned by the wrong player? ”
If I interpret this as “are there two games, one a proper subset of the other, in which different players won” then the answer is “the only time in the database that one game is a proper subset of another is the case I mentioned above”. But in fact even if I interpret the question as liberally as I possibly can, the answer is still “no”. There are five instances of games which finished at a position P which was reached in another game, and the second game continued beyond P, but in all five instances the outcomes of the two games were the same. Similarly there are two instances of two different games both finishing in the same position, but again both times the results were the same each time.
“ I admire your skills to identify same positions along different paths. ”
Here’s how to do it: in my four-in-a-row class (a python class), I carry around not just the moves, but also every single position that the game was ever in. It’s a lot of baggage and if there were 1,000,000 games in the database then it would be intolerably slow.
“How many duplicate games are there among those 335 samples ?”
I think I’ve just answered that, haven’t I. In fact let me be a bit more precise. Numbers here are the numbers in my database, not numbers of games here, so they mean nothing to anyone but me.
Games 3 and 94 finished after 64 moves with a tie (but the moves were played in different orders; those are the games I mentioned above that finished in the same position). Game 23 finished after 48 moves, in the same position as game 3 after 48 moves. Game 80 finished after 36 moves, in the same position as game 94 after 36 moves. All four of those games were ties.
Game 193 finished two moves earlier than game 178 in the same position [that was the “proper subset” game I mentioned above; not only were the final positions essentially the same, but the moves were played in the same order].
Finally, games 295 and 332 were 58 moves long and finished in the same position with a P2 win, and game 315 was in the same position after 58 moves, and finished two moves later, also with a P2 win. 
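The search for games that end in a position reached by another game can be sketched as follows. This is an illustration of the "carry every position around" idea rather than the actual class: it assumes games are lists of squares in chessboard notation, and the names `positions` and `final_position_hits` are invented. The toy games are made up too.

```python
def positions(squares):
    """Every position a game passes through, one entry per move played."""
    p1, p2, history = set(), set(), []
    for i, square in enumerate(squares):
        (p1 if i % 2 == 0 else p2).add(square)  # P1 plays moves 1, 3, 5, ...
        history.append((frozenset(p1), frozenset(p2)))
    return history


def final_position_hits(games):
    """Triples (i, j, k): game i ends in the position game j had after k moves."""
    index = {}
    for j, game in enumerate(games):
        for k, pos in enumerate(positions(game), start=1):
            index.setdefault(pos, []).append((j, k))
    hits = []
    for i, game in enumerate(games):
        if game:
            final = positions(game)[-1]
            hits.extend((i, j, k) for j, k in index[final] if j != i)
    return hits


toy = [
    ["c1", "c2", "d1", "d2"],
    ["d1", "d2", "c1", "c2", "e1"],
    ["a1"],
]
print(final_position_hits(toy))  # [(0, 1, 4)]
```

Storing every intermediate position costs memory proportional to the total number of moves in the database, which is fine for a few hundred games but is the "baggage" that would hurt at a much larger scale.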
wccanard at 20090130
Those last games I’m talking about (332,315) are
http://www.littlegolem.net/jsp/game/game.jsp?gid=932808
http://www.littlegolem.net/jsp/game/game.jsp?gid=932826
http://www.littlegolem.net/jsp/game/game.jsp?gid=879032
Probably Bernard Herwig couldn’t believe his luck ;) 
wccanard at 20090130
@yper: I just reloaded the entire database from this site, this time keeping track of the LG game numbers. So there will be no more of this “game 332 in my database” rubbish; I can easily find out the LG numbers.

wccanard at 20090130
When I said “game number x”, for x in the following vector:
[3, 94, 23, 80, 193, 178, 295, 332, 315]
I was talking about the golem game with number in the following vector:
[421813, 555263, 421834, 555248, 736853, 736838, 879032, 932826, 932808]
Sorry to obfuscate things. If only one could edit forum postings here! 
trincot at 20090201
It would also be an interesting exercise to take the games where P1 did not win, and identify the first wrong move made by P2. Of course this is out of the scope of automated processing, although if two games start the same way and one ends with P1 as winner, while the 2nd doesn’t, and if P2 was the one who made the deviating move, then that move is a candidate “wrong move”...
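The heuristic in this post can be automated as a rough first pass. The sketch below assumes games are `(moves, score)` pairs with the score from P2's point of view, and it looks at games that P2 failed to win, comparing each one with games P2 did win. Everything here (function name, data layout, toy games) is hypothetical, and the output is only a list of candidates: a flagged move may well be fine, with the real mistake coming later.

```python
def candidate_wrong_moves(games):
    """Return (shared_prefix, move) pairs where P2 left a line that P2 won
    elsewhere and then did not win."""
    won = [moves for moves, score in games if score == 1]
    not_won = [moves for moves, score in games if score < 1]
    found = set()
    for bad in not_won:
        for good in won:
            limit = min(len(bad), len(good))
            k = 0
            while k < limit and bad[k] == good[k]:
                k += 1
            # Index 1, 3, 5, ... (0-based) is a P2 move.
            if k < limit and k % 2 == 1:
                found.add((tuple(bad[:k]), bad[k]))
    return found


toy = [
    (["d1", "d2", "d3", "d4"], 1),  # P2 won
    (["d1", "e1", "d2"], 0),        # P2 lost after deviating on move 2
    (["c1", "d1"], 1),              # P2 won
]
print(candidate_wrong_moves(toy))  # {(('d1',), 'e1')}
```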

wccanard at 20090201
That’s an interesting point! [typo: you mean “P2 did not win” in the first sentence, of course]. In fact I have done a bit of this sort of analysis already. I have tried really hard to learn some standard openings. I have seen that in practice this isn’t doing me much good against the strong players, because I play 9 moves and now all I have is “P2 usually wins in this line”, but of course I can’t win yet, because I don’t understand the principles of the game yet. But I know several examples of “wrong moves” now, and have them written down. Probably the top players all know these traps already though.