Monday, 27 August 2012

August 22: The Worst (Research) Day of My Life

It took me a while to decide to make this posting since I have to confront past events that I have tried hard to block out. The passage of time may have softened the blow, but the memories are still painful.

In my previous posting, I talked about the twentieth anniversary of the Tinsley-Chinook match for the World Checkers Championship. Please read that posting first as it sets the stage for this one.

The best-of-40-games match started with four draws. Tinsley won game 5 and missed a win in game 7. Chinook played beautifully in game 8 and won – only the fourth loss by Tinsley since 1950! Chinook scored again in game 14. After 16 games, we were in the lead by a score of two wins to one. The checkers world and the media were agog at the possibility of the computer winning the match. That was Friday August 21.

My apologies for the length of the following excerpt from One Jump Ahead (Springer-Verlag, 2007). I found it difficult to shorten the text significantly without impacting the “drama”.

While walking to breakfast with Paul the next day, I remarked to him that almost everything had gone our way thus far in the match. (Note how I conveniently forgot about games one, five, and seven.) Was it even conceivable that we might win? When was the bubble going to burst? Unfortunately, these innocent remarks turned out to be prophetic.

We returned from breakfast and had the hotel staff unlock the door to the playing hall. When the doors swung open, we were blasted by a stifling heat. Nothing looked untoward, but the heat was so overwhelming that we began to sweat. We immediately contacted the hotel, who sent someone to find out what the problem was. Apparently it was hard to get good weekend help in London. The security guard who locked the room at night failed to follow instructions. Turn the lights off and the air conditioning on. Simple! Unfortunately, our instruction-challenged guard had turned the air conditioning off and left the lights on. Given that the playing hall was a closed room with no windows and poor ventilation, the result was predictable. It was even worse than that because the room housed numerous computers and a projector—all heat generators (especially our SGI 4D/480 [computer]). An embarrassed hotel staff hastily did all they could to cool down the room, but there was nothing that could quickly resolve the problem. Even after opening the door and bringing in some fans, the room remained unpleasantly hot. The start of game seventeen was delayed for thirty minutes. This was an awkward situation for the hotel since it was a weekend, and a large number of spectators were waiting to enter the room.

Game seventeen began, but I quickly became quite uncomfortable sitting onstage. The pitcher of ice water that Tinsley and I shared was emptied and then re-filled. Clearly, the heat bothered Tinsley too. It was completely out of character for him to breach the game etiquette by remarking to me how hot it was. I got up and went over to David Levy to discuss the problem. Marion and I were sitting on an elevated stage beside the hot computers. It was very unpleasant, and something had to be done. David appreciated the seriousness of the situation, disappeared for a few minutes, and then reappeared with a senior member of the hotel staff. The game was interrupted for over an hour as frantic attempts were again made to cool down the room. Two fans were brought in and installed beside the game board—one pointed at Tinsley and the other at me. When play resumed we made a few perfunctory moves and then agreed to a quick draw; anything to get out of the room.

Programmer’s log, Chinook project, day 1,178 Saturday, August 22, 1992

After a long interruption we start game eighteen. It isn’t hot in the room anymore, but it still is uncomfortable. The heat doesn’t bother Chinook, as we build up a nice advantage (+31). Normally I would start fantasizing about a win, but maybe the two wins against Tinsley have made me complacent. All I’m thinking about is the day’s bizarre events.

Hmm. Marion is teasing us again. Chinook says it’s up 54 points. I shake off my lethargy and start getting interested in the game. Hey! Chinook is getting the first king and has an obvious advantage. Now we’re up 69 points. Could a third win be around the corner? A few more moves, and the advantage is still there. ... Chinook starts computing. I have to wait a few seconds for anything to appear, since we don’t print anything until the program reaches thirteen plies [a ply equals one move by one player]. The wait is worthwhile; Chinook has good news for me: ... A +86 score—it must be another Chinook win. Look! Chinook’s analysis says that Tinsley is so desperate that he has to sacrifice a checker. Ho hum. These victories are getting pretty routine now.

But... at the start of the search, Chinook prints out the following message:

MT database: f4-e5

This position is in Chinook’s database of Tinsley games! In other words, Tinsley has encountered this position sometime in the past. Since his preceding moves weren’t in our Tinsley database, this game represents a different move sequence than he has played against before. Maybe this transposition back into one of his games will confuse him. Of course, he has lost so few games that it’s unlikely he would knowingly walk into a losing line of play. Chinook’s score suggests otherwise. Maybe our databases are about to turn this supposed draw into a win.

DEPTH 15 [+93] ...
DEPTH 17 [+86] ...

Chinook doesn’t think it has enough time to complete nineteen plies, so it decides it’s time to move. Before doing so, it tries my new PV (principal variation) extension trick. The program plays down the first four moves in the above line (the so-called principal variation) and checks to see that everything is what it should be. It does this by searching that position an extra two plies. This is supposed to be insurance that nothing untoward is happening.

Extend PV 2 ply, starting 4 moves down the line
PVextension fail at 4

Oops. After playing down the first four moves..., the additional search shows that the score is going to drop by a significant amount, thirty points being the minimum threshold. The program backs up a ply and does another search to nineteen plies deep to see whether the score drop still holds:

PVextension fail at 3

Darn. We back up a ply and try again...

PVextension fail at 2

And again...

PVextension fail at 1

We now know that a nineteen-ply search results in a score that is less than or equal to 56 points (86-30=56). I knew the high score was too good to be true. Since the score is changing in a major way, Chinook will now allocate as much time as it can to complete the nineteen-ply search and find out how serious the problem is. It takes less than two minutes to get the verdict: [+33] …
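For readers who like to see the idea in code, the PV extension check described above can be sketched roughly as follows. This is an illustration only, not Chinook's actual code. The callback `search_after`, the function names, and the placeholder move names are all hypothetical, and the rule of backing up one ply at a time for as long as the drop persists is an assumption read off the log messages above. The 30-point threshold, the two extra plies, and the four-move starting point come from the text.

```python
from typing import Callable, List, Sequence

DROP_THRESHOLD = 30  # minimum score drop that counts as a failure (from the text)
EXTRA_PLIES = 2      # extra depth searched at the end of the PV prefix (from the text)


def pv_extension_check(
    root_score: int,
    pv: Sequence[str],
    search_after: Callable[[Sequence[str], int], int],
    start_moves: int = 4,
) -> List[int]:
    """Return the PV prefix lengths at which the extended search failed.

    `search_after(prefix, extra)` is a hypothetical callback: it plays the
    moves in `prefix` from the root, searches `extra` plies deeper than the
    original search reached, and returns a score from the root player's view.
    """
    failures: List[int] = []
    # Start four moves down the line, then back up one ply at a time
    # (assumed behaviour) for as long as the score drop persists.
    for n in range(min(start_moves, len(pv)), 0, -1):
        score = search_after(pv[:n], EXTRA_PLIES)
        if root_score - score < DROP_THRESHOLD:
            break  # the score holds here, so stop backing up
        failures.append(n)
    return failures


def upper_bound_after_failure(root_score: int) -> int:
    """A failed check implies the deeper score is at most root_score - threshold."""
    return root_score - DROP_THRESHOLD


# Toy stand-in for the real search: every deeper look returns +33.
pv_line = ["m1", "m2", "m3", "m4"]  # placeholder move names
failed_at = pv_extension_check(86, pv_line, lambda prefix, extra: 33)
bound = upper_bound_after_failure(86)
```

With the toy search above, the check fails at 4, 3, 2, and 1 moves down the line, matching the four log messages, and the bound works out to 86 - 30 = 56.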

From the dizzying heights of anticipated success, my hopes are rudely brought back down to Earth. In all likelihood it’s a draw after all. Does Marion know that he’s tormenting me so?

Time up but search unstable! Extend search for 380 more seconds

Chinook decides it wants to finish the nineteen-ply search before moving. The program has lots of time left on its clock, so why not? It decides to spend at most another 380 seconds trying to finish the search. I glance at the clock and see that Chinook has already spent thirteen minutes on this move but has twenty-two minutes remaining. Actually, I wish the program would just move so that we can get this game over quickly, go for a nice lunch, and then come back to a (hopefully) comfortable playing hall.

Time up but search unstable! Extend search for 95 more seconds

That’s odd. The program has now spent nineteen minutes on this move, unusually long for a nineteen-ply search. Well, who cares? Another ninety-five seconds won’t make any difference.

Time up but search unstable! Cannot have another extension

Chinook is programmed only to extend the time twice (a modification that came about as a result of the Lafferty loss on time). The ninety-five seconds has expired, so now the program should move... any second now. It usually takes a few seconds to coordinate the parallel program, so once that’s done... uh, hello? Chinook?
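The "extend at most twice" rule can be sketched in a few lines. This is a hypothetical illustration, not Chinook's code: the class and method names are invented, and because the text does not say how the 380-second and 95-second figures were chosen, the caller simply supplies them here.

```python
from dataclasses import dataclass

MAX_EXTENSIONS = 2  # from the text: the time is extended at most twice per move


@dataclass
class MoveTimer:
    """Tracks how many time extensions have been granted for the current move."""

    extensions_used: int = 0

    def on_time_up(self, search_unstable: bool, requested_seconds: int) -> int:
        """Return the extra seconds granted; 0 means the program must move now."""
        if search_unstable and self.extensions_used < MAX_EXTENSIONS:
            self.extensions_used += 1
            return requested_seconds
        return 0


timer = MoveTimer()
first = timer.on_time_up(search_unstable=True, requested_seconds=380)
second = timer.on_time_up(search_unstable=True, requested_seconds=95)
third = timer.on_time_up(search_unstable=True, requested_seconds=60)
```

Under these assumptions the first two requests are granted and the third is refused, which corresponds to the "Cannot have another extension" message. What the sketch cannot capture is the real failure: the refusal was printed, but the searching processes never stopped.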

I can’t begin to describe the feeling of nausea that instantly sweeps over my body. Perhaps for the first time in my life, I truly know what it means to feel heart-stopping panic. Chinook is supposed to move and it won’t. The program has fourteen minutes left on the clock to make its last four moves. If we don’t make those moves, then we forfeit the game.

I bolt from the chair and head into the audience to find Paul. One look at my face tells him part of the story. I breathlessly explain the situation to him. We check the machine and find that Chinook is still busy computing away—doing what, I don’t know. Unless the program moves... I glance at the clock—less than ten minutes and counting. The spectators sense that something has gone wrong, and I can hear a murmur rising from the audience. Tinsley seems oblivious to what’s happening. He continues to stare at the position, looking as if he’s lost in another world.

“How’s everything going?”

I whirl around to see who might be cheerily asking such an annoying question at this inopportune moment. It’s Rob Lake, fresh off the airplane from Canada. He certainly picked a fine time to arrive. He senses the seriousness of the situation and stands back, watching helplessly.

When the “time up” message occurs, Chinook is supposed to notify all the computers to stop searching and make a move. The message appeared, but none of the computers have stopped searching. In desperation I hypothesize that somehow the “time up” notification has been lost. I’m not quite sure how this could happen, but it is plausible. I’m grasping at straws. With three minutes left on the clock, I hunt for the command that allows me to send a “time up” message to Chinook (an interrupt in computer jargon). Finding it, I type in the command. I’m not sure how ethical it is for me to do this, but I’ll worry about that later. All that matters is to make the time control. There’s only a minute left on the clock. The program instantly replies:

I move: 17. ... d2-c3 Value = 33

Three moves to go and only a minute left on the clock. This calls for fast fingers at the keyboard. Marion immediately replies with the forced capture d4×b2. My fingers tremble as I type the move in. I hit return expecting Chinook to immediately reply with a1×c3. Nothing. Nothing?? No response. I don’t know what else I can do. I rush offstage to find David Levy and ask for a time-out. He doesn’t know why I made my request, but he sees my panicked state and immediately agrees. He comes onstage, stops the clocks, and then asks me what’s going on. It takes me a moment to collect my composure before I tell him what has happened.

In computer chess tournaments, computer versus computer, the participants are allowed fifteen-minute time-outs if they can demonstrate that a problem is due to circumstances beyond their control. Clearly, if there’s a bug in the program, you can’t get a time-out—you must do whatever the program says to do. Sometimes, however, problems occur that shouldn’t be held against the computer. For example, if the computer is using a phone line and the line gets disconnected, should the program be penalized? This has nothing to do with the quality of your program; these are events beyond your control. In the computer chess world, we allow the side with difficulties to stop the game, fix the problem, and then continue. Usually the problem can be easily resolved; for example, if a phone connection drops (just re-dial). Sometimes, however, the problem can’t be solved, and a forfeit results. For example, Deep Thought forfeited a game in the 1994 North American Computer Chess Championship when there was a power outage in the building that their computer was housed in. After waiting in vain for several hours for the power to come on, they had to concede the game.

David made the correct decision according to computer-chess precedent, but perhaps not the right decision given the inexperienced checkers audience. He told us to take our time and find out what’s wrong. If the fault is with Chinook, then we forfeit. If the fault isn’t with Chinook, then we can restart the program to continue at move 18 of the game with one minute on our clock.

Rob, Paul, and I gather around the computer screen to see what’s wrong. We try communicating with Chinook, but get no response. We have no idea what it’s doing or why it isn’t responding. Finally, in desperation, we decide to kill the program with a “core dump.” Terminating the program this way causes the state of the program to be saved on disk so that we can postmortem it. This will allow us to autopsy the Chinook corpse to identify the cause of its paralysis.

Surprise! Chinook refuses to die. Paul, Rob, and I know that when you kill a computer program it should die and go away (not unlike the real-world analogy). Chinook won’t die—it’s very much alive and computing—whatever it’s computing. This is a bigger surprise because this just isn’t supposed to happen. We can’t seem to do anything else on the computer, so we do the only thing that gives us back control. The SGI 4D/480 is powered off, left to rest for a few minutes, and then powered back on.

What went wrong? I immediately jump on the room temperature as the cause. The computer has been “cooked” all night long. Is it possible that the extreme heat caused the computer to malfunction? Possible, but hard to prove. Paul comes up with a suggestion as to how a software bug in Chinook might cause this problem to occur. He is speculating because we don’t know for sure whether his suggestion could even happen. Meanwhile, the computer reboots. We log in, start up Chinook, and it runs as if nothing were wrong.

Norm can’t help out on the technical side, so he goes wandering through the crowd. He keeps hearing the same thing. The people in the audience don’t see why we should be allowed to stop our clock in the middle of a game. That would never happen in a human event. There seems to be unanimous consensus that Chinook should forfeit. Norm brings this distressing news back to me.

So, gentlemen, it’s decision time. Is this a problem with Chinook? If so, then we should forfeit. Or is it a problem beyond our control? If so, then we should ask to continue the game.

Paul and Rob don’t say anything. The uneasy silence is broken by Norm. In the interests of good sportsmanship, he says, the honorable thing to do is resign the game. The problem may be our fault. Even if it isn’t, no one, perhaps including Tinsley, will understand why we should be allowed to take a time-out and fix things. People will accuse us of every dirty trick in the book—doing anything to win. Norm is brief, but very much to the point. Again, an uneasy silence prevails.

“I think we should resign.”

That’s the hardest sentence I’ve ever said in my life. I look up and see the others slowly nod in agreement. I have to collect myself for a moment; the emotional impact is too intense. Norm tells me that Tinsley, Levy, and Keene are eating lunch in the hotel restaurant. I ask everyone on the team to accompany me to the restaurant. We find the trio in a back corner of an otherwise deserted restaurant.

I walk up to the table, extend my hand to Marion, and say, “The Chinook team resigns.” Marion looks up glumly and apologizes for winning a game this way. David looks at me and immediately says, “You did the honorable thing.” Raymond Keene congratulates Marion on evening the match score at two wins apiece. Tinsley writes,

“I must confess to having some very mixed feelings about it all, even though I was being given a new start—so to speak.”

Marion Tinsley (London, 1992).

[Checkers grandmaster] Richard Pask was in attendance that day (taking a nine-hour round-trip train ride). He later reflected on the fateful decision to resign.

“The decision taken to forfeit game 18, although clearly very painful for you, was undoubtedly the correct one, for a whole host of reasons. For one thing, it retained the high standards of integrity and decency associated with the entire Chinook team, and won my public applause, on behalf of all players, at the time. Norman Treloar was correct in perceiving anger amongst the players present, but would perhaps have been surprised at its intensity. While Tom Landry’s statement that a failure to award the game to Marion would be “the worst decision in the history of the world,” was regarded as somewhat over the top, it nevertheless reflected the strength of feeling in the playing room.”

There’s less than an hour to the next game, and we have no idea what to do with Chinook. The program is up and running again and seems fine. We decide to let it play a game against itself and go for lunch. I’m in shock as all four of us walk to a fast-food restaurant. Paul talks out loud, speculating on possible software explanations for the symptoms of the problem. I don’t hear much of what he’s saying; I’m lost in my own world. Everything has been going so well—almost too well. Seventeen games under our belt and Chinook is in the remarkable position of being in the lead. The fall from the dizzying heights of success is brutally painful. It happened so suddenly. On a single move I experienced the exhilaration of thinking we were going to win, the disappointment of seeing yet another draw looming, and the devastation of a forfeit loss.

Over lunch the discussion quickly turns to the temperature problem in the playing hall. Did the carelessness of a hotel employee cost Chinook this critical game? The room temperature might have been over 30 degrees C (86 degrees F) for most of the night. In theory this shouldn’t have been a problem, but the coincidence seems remarkable. In three months of testing, over eighty tournament games… and twenty test games…, nothing like this had happened before. If the program had forfeited on any other day we would have assumed it to be a software problem. But today of all days, there’s this exceptional condition that casts doubt. Is it software? Or is it the machine? Did we screw up? Or did a hotel employee? None of us knows.

After lunch we head back to the playing hall to start game nineteen. I suddenly realize that because of the abrupt termination of game eighteen, the final result hasn’t been included in Chinook’s log file. With a heavy heart I edit the file and append a line to the end:

RESULT: Chinook Loses (forfeit)

I spent the rest of the weekend trying to find the problem with Chinook. In the end, we never found it. All I know is that something went terribly wrong on that fateful day, and it did not happen beforehand or afterwards. Only on August 22.

With the best-of-40-games match tied at two wins apiece, Tinsley paid us the highest compliment. He said he was going to play cautiously and not take any chances; he would wait until we made a mistake. His strategy paid off. Chinook erred in game 25 and Tinsley meticulously took advantage of the opportunity. He then played drawish checkers for the remaining games, nursing his one-game lead. In the final game of the match, I desperately needed a win, so I programmed Chinook to treat a draw as a loss. Chinook found the draw, rejected it, made a feeble attempt to complicate the game, and then lost. Final score: Tinsley four wins, Chinook two wins, and 33 draws.

But that’s not the end of the story. August 22nd was not finished with me.

Accepting the “Runner Up” Trophy (London 1992). From left to right: Paul Lu, Rob Lake, Jonathan Schaeffer, and Norman Treloar.
