Twixtbot vs. The World TWIXT PP
31 replies. Last post: 2020-01-09
MisterCat at 2019-12-24
Idea. Twixtbot has been dominating the human contingent of players here. It analyzes millions of games using a distinctly non-human strategy, i.e., backwards thinking. By playing games out and tabulating the final results, it uses those results to make move-by-move decisions. Humans do not think that way, and cannot.
But can forward thinking, when applied to the max, pose a challenge to the bot? Forward thinking is what humans use when they play, and it can be viewed anywhere on Twixt Commentator: the lines are sometimes a few moves deep from the present position, and sometimes 10 or 20 moves deep. This type of analysis looks at the present position and strategically plans ahead, based on placement of pegs on the best points and creation of upcoming link chains.
But I can not hope to compete. Can the COMBINED EFFORTS OF ALL TWIXT PLAYERS HERE create a challenge and defeat the Twixtbot in a fair game? I say ‘fair’, even though some people might consider collusion to be unfair. Firstly, I really don’t think Twixtbot will mind. Secondly, this ‘game’ should be thought of as ‘an experiment’, and not a contest. The final result will be recorded in the record of my dummy ID created for the purpose: MCx. The game is not rated, and I will never use MCx to play any ‘real’ games – only as an analysis tool. Btw – anybody can do this.
The game is already created. Visit Twixt Commentator and look for game # 2139555
I will try to create the hyperlink for convenience below:
I hope that worked.
I am hoping to get this game started by New Year’s Eve. I will provide the service of monitoring comments at Twixt Commentator and, of course, making the moves when they come in. I will be sure to not lose by time forfeit. I promise to be diligent.
The idea is that THIS GAME IS OPEN TO ALL. Feel free to make comments and post analysis. Personally, I don’t really want to be responsible for making any final decisions on moves; I am hoping that strong players will come to a consensus in their analysis. I will ONLY make an adjudication and decide on a move if we are about to lose by time forfeit. Really – we don’t want that, since MY decision might be a good one, but is more likely to be faulty.
I am hoping that all of the strongest players here – you know who you are, the 2000 and above group (now, or formerly belonging), will regularly join in. This gives The World the best chance to win, I think, but ALL ARE WELCOME to participate!
So what do you think, folks? Everyone is invited to the party. All I have to do now to start the game is issue a fresh challenge to the Twixtbot; the game will commence, and I will regularly check for posts on Twixt Commentator to decide on each and every move, starting from move #1. I am even inviting Bony Jordan to comment – so long as it is the HUMAN player; I don’t think that Twixtbot should comment on playing against itself. We have already seen how those games go.
MisterCat at 2019-12-24
I tried, but failed, to hyperlink directly to the Twixt Commentator page for the game. Oh well. Perhaps somebody can do that for me. Meanwhile, the link above brings you to the game on LG, and then you can click on Twixt Commentator to get there.
David J Bush at 2019-12-24
Good idea. Certainly no engine should be used against the bot here, but I see nothing wrong with examining past games, especially early in the game. I hope you agree.
Alan Hensel at 2019-12-24
Games with Top-10 players always felt to me like “whoever makes the first mistake loses”, so with people openly checking each other’s moves, hopefully we can avoid making any particularly egregious errors. Because TwixtBot isn’t gonna be making any of those.
David J Bush at 2019-12-25
Regarding how much time to take on the moves, your “stub” account has what, 20 vacation days? The opening moves are crucial and there will almost certainly be opportunities to build our time back up later in the game. Perhaps the real bottleneck that will limit our analysis will be our own patience. How about 5 days for each of the first 5 moves we make? That’s 120-36=84 hours deducted with each such move. Our third move will eat 1 vacation day, our fourth and fifth will each eat 4 more. Likely we will finish our moves sooner, but my point is, as long as we speed up the pace afterwards, this will give us time to look at stuff, and we won’t forfeit. You could always establish a new stub account for a new game.
ypercube at 2019-12-25
I’m not even in the top-20 but I’m in!
ypercube at 2019-12-25
MisterCat at 2019-12-25
Once I start the real game, I’ll post a new link and game number for it. We need to post analysis on the fake game above, since you can’t post comments at the Commentator while a game is in progress. I am still shooting for a start before New Year’s, so let’s continue to bat around ideas on that first move. Several people have expressed concerns about timing out. I promise to monitor the times closely, and check on this every day. If it ever looks like we must move quickly, I’ll press our contributors for some quick advice. Remember – we start with 240 hours (10 full days), and with 36 hours added after every move, I believe that we’ll have enough time to, at the very least, play a carefully considered set of opening moves.
David J Bush at 2019-12-27
1. MC, will you use your account to play the game, or the stub? If the latter, are you amenable to the notion of possibly using up vacation days on that account early in the game to give us more time to look at opening ideas?
2. Personally I would like any move made by our team to pass a peer review process. Any specific objection would ideally be answered to everyone’s satisfaction. But I recognize that may not be practical for the first few moves. I am willing to go with whatever first move Alan accepts, or to have the bot move first. This is just my opinion of course.
3. Maybe another statistic that could help us is, out of the 400 or so TB games against humans where TB moved first, what moves did it make?
technolion at 2019-12-27
For this to work out, we would need a system that would take votes on moves from human players. The votes should be weighted by the human’s LG twixt rating. The move with the most vote-points would then be automatically played after a fixed time period.
Alan, you seem to be a great programmer – do you have some spare time “between” the years? ;-)
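A rating-weighted vote like the one proposed here could be tallied along these lines. This is only a rough sketch: the function name, the ratings, and the ballots are all made up for illustration, and nothing like this was actually built for the game.

```python
from collections import defaultdict


def pick_move(votes):
    """Tally rating-weighted votes and return (winning_move, totals).

    `votes` is a list of (rating, move) pairs. Each vote counts for as many
    points as the voter's rating. On a tie, the move that was voted for
    first wins (an arbitrary choice made for this sketch).
    """
    totals = defaultdict(int)
    for rating, move in votes:
        totals[move] += rating
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


# Hypothetical ballots: one strong player prefers d6, two others prefer j6.
example_votes = [(2100, "d6"), (1900, "j6"), (1300, "j6")]
winner, totals = pick_move(example_votes)
print(winner, totals)
```

With these made-up ballots the two lower-rated voters together outweigh the single highest-rated voter, which shows how a pure point count can override the strongest player's opinion.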
David J Bush at 2019-12-27
Okay, IF we are going to have a process like that, which would take time to implement, it would make sense to adopt a conservative time management policy, which means don’t lose vacation days early in the game, if we can avoid it.
MisterCat at 2019-12-27
1. I will start the game using my MCx stub. I don’t want the record of this game in Mister Cat’s profile, unrated as it is. I view this game as an experiment, though I am hoping for the humans to score a victory. There is nothing ‘special’ about what I am doing here – anybody CAN do the same. I am doing it FIRST, and am willing to put in the time and effort to manage things.
2. This is the ONLY game that MCx will be playing, for now, so vacation days can be used as needed. I agree that the opening stage of the game is of crucial importance, but am really hoping that the team can manage to agree on moves before there is any time emergency. As I said, if I SEE an emergency approaching, I will get on everybody’s case. I am not planning on enforcing any sort of ‘official’ time limit; my feeling is to just see how it goes. I am also hoping that the posts at commentator will allow the team to agree on moves – the idea is to post a suggestion with convincing reasoning and analysis, so as to be persuasive.
3. Alan is the stat man here, so I’d ask him about this.
I certainly thought about a system like you describe, but I can’t implement it, and I don’t think anybody else is inclined to; not at this stage. Think of this game as an informal experiment; let’s just see what happens.
I really, REALLY hope to NEVER have to be the one to make a final decision about a move, since my decision is quite likely to be flawed. Naturally, I will be paying the most attention to the strong players who participate (like, for instance, YOU), but this is supposed to be FUN; everyone is invited to participate. Perhaps a move suggested by a 1300 player is not a good one, but it’s only a suggestion; it does not mean that it will be played. And what’s that phrase: ‘Out of the mouths of babes’... etc. ? Who knows what suggestion will prove to lead to our victory, or our downfall!
I am using New Year’s Eve as my deadline to start, and we’ll see what posts and suggestions appear at the Commentator before then. Note that once I start the REAL game, I’ll note that game number HERE, and also provide a direct link.
Alan Hensel at 2019-12-27
If I had time to develop more Twixt software, I’d also be playing Twixt...
There have been “vs. the World” games in Chess, but the take-away that I took away is that voting in the context of a deep game gets you mediocre moves. What you need for a deep game is deep discussions. That is what makes MisterCat’s plan a good one.
If there comes a turn where the discussion isn’t converging on a move, or isn’t converging fast enough, the highest rated player should make the decision.
David J Bush at 2019-12-27
That’s hopefully a link to a low tech spreadsheet of first moves by TB against humans, going back 14 months. The four circles indicate human victories. Either the human played there and was not swapped, or TB played there and was swapped. The most recent human victory was by Alan, first move K7. Maybe we could move there on move 1.
MisterCat at 2019-12-29
I don’t know if we can hope to repeat Alan’s win from last February; we are playing a much improved Bot at this point. I expect the same can be said for the few other wins – Twixtbot learns from its mistakes, improves its neural net – and also, there is a certain degree of randomness.
At any rate, it will be confusing to have to look back and forth between The Commentator and The Forum for game analysis, so I propose that specific analysis of moves be kept over at The Commentator. No problem using this forum post for general commentary. Once I start the new, REAL game, I will post the game number here, with links to the game and also to Commentator. We must post running comments and suggestions at the fake game (above), since it is completed. Commentator will not let you post any comments for games that are in progress, which is why I am doing it this way.
TwixtBot at 2019-12-30
The good news (?) is that Twixt Bot sort of peaked out in June, and hasn’t improved since then.
David J Bush at 2019-12-31
Does this peak have a theoretical Elo rating? Would it make sense to regard such a rating as a measure of the depth of Twixt PP?
TwixtBot at 2019-12-31
“Does this peak have a theoretical Elo rating?” - yes, but I don’t know what it is. See below.
“Would it make sense to regard such a rating as a measure of the depth of Twixt PP” - no.
So the way these Alpha-Go type bots work is they have a neural net which is trained up over a ridiculous amount of computations. You can give the neural net a position, and it gives you an approximate evaluation and a quick estimate of what the best moves are. On top of this you build a search tree that is very similar to what the previous generation of MCTS Go bots built.
To make Twixt Bot (or any bot, really) smarter, you can either give it a better neural net, or more bites at the neural net. Specifically, I give Twixt Bot 50,000 evaluations of the neural net to choose its move. On my computer, that takes about 5 minutes. You could hook up 100 computers each designed to run the neural net, and speed that up by, if not a factor of 100, enough to pick an equivalently good move in a few seconds. Alternatively I could give it 500k evaluations, and then it’d take close to an hour per move, and play an even stronger game than it does currently.
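The numbers in the paragraph above can be checked with a back-of-the-envelope calculation. This is only a sketch, assuming move time scales linearly with the number of evaluations and that extra machines give an ideal linear speedup (the post itself says the real speedup would be somewhat less). The function name and the `machines` parameter are made up for illustration; only the 50,000 evaluations and the 5 minutes come from the post.

```python
BASE_EVALS = 50_000      # evaluations per move, as stated in the post
BASE_SECONDS = 5 * 60    # "about 5 minutes" on the author's computer


def seconds_per_move(evals, machines=1):
    """Estimated time per move, assuming linear scaling in evaluations
    and an ideal (best-case) speedup from adding machines."""
    return evals * BASE_SECONDS / BASE_EVALS / machines


print(seconds_per_move(50_000))                 # current setting
print(seconds_per_move(500_000) / 60)           # 10x budget, in minutes
print(seconds_per_move(50_000, machines=100))   # 100 machines, best case
```

Under these assumptions the 500k budget works out to 50 minutes per move, consistent with "close to an hour", and 100 machines bring the current budget down to about 3 seconds at best.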
When you start training your neural net, you decide how big the net will be. The bigger the net is, the smarter it will be, but the longer it takes to run a single evaluation, and the longer it takes to “converge” to a point where it doesn’t improve any more. So when I say I peaked out in June, this is what I’m talking about: the net size I chose got as good as that size could get.
When Google says they fit a really good Go or Chess net in 3 days, I mean, it’s true, but they also drop what I estimate to be like $1 million worth of compute farm time to do it.
Anyway, if we wanted to compute an Elo rating for current Twixt Bot, we would do something like play a “100 eval” or a “1 eval” version of TB, something weak enough that it loses enough games to humans to have an anchor. Then we could battle “100 eval” vs. “150 eval” to estimate the Elo of 150; then battle “150 eval” vs. “250 eval” to estimate the Elo of 250 eval, and so on up the line until we get to 50,000.
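The ladder described above could be worked out roughly like this. It is a sketch using the standard logistic Elo formula, where a score fraction p against an opponent implies a rating gap of 400 * log10(p / (1 - p)). The anchor rating and every win rate below are invented placeholders, not measured results, and the function names are made up for illustration.

```python
import math


def elo_gap(score):
    """Rating difference implied by a score fraction (0 < score < 1)
    under the standard logistic Elo model."""
    return 400 * math.log10(score / (1 - score))


def climb_ladder(anchor_elo, rungs):
    """Chain match results upward from an anchored version.

    `rungs` is a list of (name, score) pairs, where `score` is the fraction
    of points that version took against the version one rung below it.
    """
    ratings = {}
    current = anchor_elo
    for name, score in rungs:
        current += elo_gap(score)
        ratings[name] = current
    return ratings


# Hypothetical: "100 eval" is anchored at 1500 from games against humans,
# and each stronger version scores 60% against the one below it.
ladder = climb_ladder(1500, [("150 eval", 0.60), ("250 eval", 0.60)])
for name, elo in ladder.items():
    print(name, round(elo))
```

One caveat with chaining like this is that the estimation error from each match accumulates, so the rating at the top of a long ladder is much less certain than the anchor.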
David J Bush at 2019-12-31
Thanks very much! I guess you mean more bytes not bites, meaning more memory for the neural network. Sorry for being pedantic. So, a bigger and/or faster net would likely peak at a higher rating, which says more about the hardware and software than it does about the game itself. Did I get that right?
MisterCat at 2019-12-31
THE ABOVE POST IS IN ERROR.
Sorry about that, but I started the game using the wrong account. I promptly resigned the game, so NOTHING TO SEE THERE.
I will start a new game, using the correct ‘MCx’ account, later on today. Check back later.
MisterCat at 2019-12-31
OK, all done, as promised. I checked, and double checked, so as to not screw up.
The real game is # 2140447
A link to the game in progress is here: The World vs. Twixtbot
A direct link to the analysis page is here: game analysis at Commentator
The World team is to move first, and technically we have 24 days plus vacation. I sure hope it does not take that long!
MisterCat at 2019-12-31
always an error; it is why I lose:
we have 240 hours; that’s 10 (ten) days, plus vacation. of course, every time we move, 36 hours is added to our clock.
TwixtBot at 2020-01-01
@DavidBush – no I mean bites with an ‘i’. TB playing here gets to ask its neural net about exactly 50,000 positions, no more, no less.
MisterCat at 2020-01-01
From comments made by spd_iv at The Commentator analysis, he proposes
1. starting a 2nd game to run concurrently with this one, but with opposite sides; my feeling is that it opens up the possibility of using Twixtbot’s live analysis against it, which strikes me as unfair to the poor bot. I already checked, and know he does not mind a gang of humans teaming up against him, but would rather not be facing himself! Also, just managing this experiment is proving to be stressful and time consuming for me; please, one game for now. What I have done here can be done again by anybody, but let’s just see how it goes.
2. spd_iv proposed a move hierarchy, where higher rated players' comments are valued more; I addressed this somewhat above when technolion proposed something similar, but more complicated. I am trying this out as a ‘fun experiment’, and really want to keep things informal – not ‘official’. Instead of being outvoted or outpointed on move selections, it is really my hope that the discussions and analysis of moves at Commentator will CONVINCE people to agree on the soundness of certain moves, and faults with others. Again, let us just see how things go, and after the game is finished, perhaps we’ll do it again using different arrangements (and a different master of ceremonies?).
That Commentator thread is going to get REAL, REAL long, so let us try to keep general comments about the game here, and SPECIFIC analysis of moves over there. Remember – this experiment is open to all: anybody can suggest a move or comment. Maybe the team will like the suggestion, maybe not, but I say the more the merrier; let’s have FUN here. Also, for those checking out the comments, there is probably much to learn about advanced level play from the posts being made!
David J Bush at 2020-01-02
For those of you who find the prospect of scrolling scrolling scrolling, rawhide, only to discover we haven’t made the first move yet, rather discouraging, here’s an update:
We decided humans will play the first peg on the board.
We will probably play 1.d6 or 1.j6.
We haven’t decided which one yet.
If you just want to see the game in progress, scroll up slightly from here and click on the twixtbot vs world link.
If you have any questions, please post them here. I will try to answer quickly. If you are asking about a specific variation, you would get a better answer on the analysis page if you don’t mind all the scrolling, because you can click on the moves and see positions interactively.
ypercube at 2020-01-06
Moves so far:
First (odd) White is TheWorld
Second (even) Black is TwixtBot
lguser at 2020-01-07
maybe instead of chatting on twixtcommentator, you could open a discord server for this
MisterCat at 2020-01-07
I don’t know what a discord server is.
At any rate, The Commentator seems to be working out OK. It’s got the built in board, so we can look at moves and analyze. Alan CREATED The Commentator for this reason – albeit NOT to analyze ongoing games, but that’s a quibble. Also, with his handy HACK (see the next thread on this topic), you can easily scroll down to see the latest comments.
Thanks for the suggestion, whatever it may be.
Florian Jamain at 2020-01-09
This is not hard, I created one quickly https://discord.gg/HGSAu9w
ypercube at 2020-01-09
No, we don’t need discord. Commentator is fine.