Othello/Alpha-Beta Research

Introduction

The following is my Spring 2001 Advanced Artificial Intelligence research project on the zero-sum two player game, Othello. Even though it may appear as though Othello is a fairly simple game (well, it is when compared to GO), there still are many important aspects of the game to consider. The most important of these are the evaluation function and searching algorithms. Why are these important? First of all, the game would be nothing without an evaluation function. And there are many interesting aspects of the evaluation which can greatly affect both efficiency as well as game play. Second, a good searching algorithm can fulfill the ideal properties of a good heuristic: providing a good answer in a reasonable amount of time.

But most important of all - I find these two aspects of the game to be very interesting. There has been alot of research on alphabeta searching techniques over the past thirty years, and many advances with it more recently. There are many different algorithms that have been developed as a result of this research - I feel I can learn from these sources and their respective algorithms. Therefore, I have decided to center my research around the evaluation function and advanced alphabeta techniques.

Rules of Othello

Really, there are no "rules" to Othello. However, gameplay goes as follows. the board begins with four disks in the center: two black and two white. Black traditionally goes first. A player may place a disk of their color anywhere on the board that meets the following two conditions: 1) there must be a disk of the opponent's color directly adjacent to the square in which the disk is placed; and 2) on the other side of the opponents's disk, there may be more disks of the opponent's as well, but they must end in another disk of the player who is making the move. That player then flips all of the opponent's disks from the point where the move was made until the next disk of the player's color. This flipping is done in all eight directions (up, down, left, right and diagonals). The player with the most disks at the end of the game wins.

Evaluation Function

An evaluation function places a value on a given state of the board. From this value, it is possible to determine whether the next move will be a good or bad one, simply by comparing the evaluations. If the current board has an evaluation of 53 and then 60 after the next move, then the move was a good one. So from the evaluation function, one can look at many possible moves and choose the best one. Choosing an evaluation function, therefore, is really one of the most important parts of making the game. A good one will make the program very hard to beat while a bad one makes winning trivial - rendering the program virtually useless (the purpose of a good game is to make the player have fun yet still be challenged right?).

Since the winner to any Othello game has to end the game with more pieces on the board than their opponent, it is very important to count the number of pieces on the board as part of the evaluation function. Note that I said "part." As any experienced Othello player will tell you, simply counting the number of pieces that a player has on the board will give no real indication of how well that player is doing at that point in the game. See, Othello, like many other games, has key places on the board that will give a player an advantage over their opponent. Typically, these places include the four corners of the board as well as the four edges. Why are these considered key places on the board? Because the more stable pieces a player has, the more advantage they will have. For example, say that a player has claimed all four edges of the board (including corners). That player now has twenty-eight stable pieces to connect other pieces with during gameplay. Not only this, but that player already has almost half of the board claimed - having five more pieces flipped in their favor at the end of the game will be a win! The flip side to disk stability is disk mobility. It is in a player's best interest to maximize the number of possible moves they will have at their next turn (mobility) as well as minimize the number of possible moves their opponenet will have.

In addition to having good moves, like any game, Othello has bad moves. The only major bad moves in the game are ones that will turn a corner or an edge over to the opponent. So these too must be accounted for in the evaluation function. However, in a negative manner. That is, if the current player has one of these positions (for example, the square directly diagonal from a corner), then a value can be subtracted from their evaluation for that board state (assuming the corner has not yet been taken).

Edge Tables

Edge tables help out the evaluation function by evaluating the edges of the board. They can be created in a number of different ways. But whichever way they are created, they only need to be modeled on one edge, and then mirrored to all four edges when evaluating a board. Some people have tried creating a table of all possible edges states in the game and assigning an evaluation value to each one. Then they simply feed that table the current edges state and it returns an evaluation value. This is a fairly good method however it seems pretty inefficient - all possible states on one edge is a table with 3^8=6561 entries. Another method is simply placing a weight on each square to determine an evaluation value for the edge - then looking at which values are present in the current edge. The problem with this method is that there are alot of different situations that could occur on an edge (6561, remember?). So a fair middleground method (which I haven't seen implemented yet) is to look at the edges the way that Go does. That is, examine all possible states at the corners (giving 3^4=81 states) and then assign static weights to the middles of the edges (four entries on each one). This also takes into account the square which is diagonal from the corner. This square is extremely bad to have if the corner has not yet been conquered - it leaves a player very open to losing that corner. Placing this is square in the first-mentioned edge table would yeild an even higher number of table entries (3^10=59049 entries). So the middleground solution seems like a viable one.

Regardless of which method is chosen to implement the edge table, and even if its the less efficient one - the cost is worth it. Remember above where I noted that the corners and the edges are the most powerful parts on the board? Well, edges tables not only account for the edges, but they will double-count the corners. Since a corner will be counted on each edge, it will naturally receive a higher weight than other positions - which is as it should be since corners are the most powerful positions.

Move Ordering

Move ordering is a function which will help to make the program itself more efficien. Move ordering is simply ordering the moves based upon how good they are - either putting best ones first or worst ones first. This doesn't seem as though it would do much. But below, I discuss some advanced alphabeta pruning algorithms in which move ordering will greatly reduce exectuion time of a program. A very common method for implementing move ordering is simply to have an ordering table. This table would be a two-dimensional array of size 8x8 which has static weights assigned to certain parts of the board - these wiehgts will enable the program to order say corner moves before edge moves. An example of how important this is can be seen even in the normal wide-window alphabeta algorithm where moves that have an equal evaluation value are not tried (they are pruned). This can be eliminated by placing an extra weight to those moves depending on where they are in the board, and then ordering them as such.

*Mini-Max Searching

The mini-max searching algorithm mimics the way that human players think about the Othello game. Typically, one thinks to themself: "If I move here, he can move there, then I can move here. But if he moves there then I'm in trouble..." So its only natural to go from one game stage to the next while trying to maximize the evaluation of both players. However, since a max for an opponent is equivalent to minimizing the opportunities of the current player - it is common to alternate modes between maximizing and minimizing.

At a given game state, if the program wants to find the best move that it can make (given the knowledge that the opponent will try to do the same), it maximizes that state, minimizes the next set of states, etc. This means looking at all possible board states that are one (program's turn) move away from the current state, then looking at all possible board states that are one (opponent's turn) move away from the next state, etc. This continues until a given depth limit is reached (typically set by the difficulty setting of the game as well as the game state: opening game, midgame or endgame). Once this occurs, the program then evaluates all board states at the current level and chooses the maximum or minimum of them, according to what the previous level is. If it is a min level, then after the minimum board has been chosen, all boards at the min level are evaluated, and the maximum of them is chosen, etc.


minimax tree
Example: Here, the depthlimit is three. The nodes which are chosen at each level have been circled in red. At the bottom level, max will take the maximum of the available moves at each node (indicated by a number). As you can see, we find the best values (and thus which move to make) at the first sub-sub-tree. Therefore, we essentially waste our time looking at the other two sub-tres. This problem will be avoided in just a minute.

Here is a link to an Othello applet which simply institutes mini-max searching. I did not write it - see below for more information on that.

NegaMax Searching

NegaMax searching is simply a variation of minimax searching. One of the pains of minimax searching is that the program must remember what level it is on - either a max or a min level. Then it must traverse the tree, choosing nodes accordingly. This involves having two separate functions in the program which look very similar except that maxes/mins are flipped. Then these two functions simply proceed to call eachother recursively. Since the opposite of a max to the player is a min, the program is always trying to maximize for either the player or the opponent. So, it makes sense to simplify matters by only having one function called negamax which always maximizes yet takes the negative of the evaluation at each level.

Example: Look at the above minimax tree. At the max level, the program shall choose the same pieces. But when they are passed to min, they are negated. So now the values are: -6, -19 and -25 (for the first subtree). Always maximizing means that -6 is chosen - which is the same position (move) as would have been acheived by choosing 6 in minimax. That -6 is then negated at the min level, yielding a positive 6. Then the next min subtree chooses -4 as the max of -4 and -37. This continues until the max (root) node is taking the max of 6, 4, 6 and 12. Remember, evaluations are negated on each level. In each case, 12 is chosen as the best move in the tree.

*The AlphaBeta Algorithm

You may notice in the tree above that, if we look at the first level, we know that it is unnecessary to evaluate any node other than 12, since max would not choose that node at this level anyway. This is the motivation behind the alphabeta algorithm - pruning the tree by removing nodes we know would not be chosen. The alphabeta algorithm is a method for speeding up the minimax searching routine by pruning off cases that will not be used anyway. This method takes advantage of the knowledge that every other level in the tree will maximize and every other level will minimize. It works as follows: start off with alpha=-infinity and beta=infitity; traverse the tree until the depth limit is reached; assign a value for alpha or beta based upon what level preceeded the depthlimit level. Whenever max is called, it checks to see if the evaluation of the move it has been given is greater than or equal to beta. If it is, then max returns - that value would not have been chosen by min anyway and neither would the subtree that max would have created, so it is a waste of time searching through them. The same logic occurs with min except that it checks if the move it has been given is less than or equal to alpha
. alphabetatree
Example: Here, the depthlimit is again three. Cutoffs are indicated by squiggley lines, chosen nodes are indicated by circles. The backed-up values are show by alpha and beta at each level of the tree(except for level 2, they wouldn't fit). This way, the change in alpha and beta can be seen. Also, although evaluations are shown at each level for each node, they are not evaluated at every level. Follow the algorithm below to see which ones are actually evaluated (the bottom level nodes).

Here is a link to an Othello applet which utilizes Alpha-Beta for searching. The code for this program was shamelessly borrowed from Faster Lu. He used minimax searching - I removed that, used the GUI and put Alpha-Beta in for searching. Sound easy? It wasn't.

int max(move,alpha,beta){
	apply(move);				//update board with move given
	if(depth==depthlimit) return evaluation(move);
	s[] = nextMoves(depth);			//find all available moves
	for(i=0;i!=length(s);i++){
		alpha = max(alpha,min(s[i],alpha,beta));
		if(alpha >= beta) return alpha;
	}
	return alpha;
}
int min(move,alpha,beta){
	apply(move);				//update board with move given
	if(depth==depthlimit) return evaluation(move);
	s[] = nextMoves(depth);			//find all available moves
	for(i=0;i!=length(s);i++){
		beta = min(beta,max(s[i],alpha,beta));
		if(beta <= alpha) return beta;
	}
	return beta;
}

Null-Window AlphaBeta

Alphabeta can be called with a variable number of window size between alpha and beta. The smaller the window size, the larger the number of cutoffs there will be - anything that falls outside of the window is a cutoff. Null window alphabeta means that alpha = beta-1, the smallest window size possible - with the most cutoffs. It has been proven that a return value R of alphabeta with a window of will be one of three cases: This implies that a single null-window call will not give the exact minimax value in one call. That is why programs such as C*, SSS*, and DUAL* all have repetitive calls to a null-window alphabeta that eventually find the true minimax value. The potential problem here is searching through a tree which has already been searched - high inefficiency. The transpositon table explanation below will explain how to alleviate this problem.

Transpositon Tables

Transposition tables help to speed up the efficiency of the program. When using NegaScout or any other sort of null window search, it is possible (and happens often) that nodes will be revisited. This seems extremely unefficient - it is if you just do NegaScout without a transpositon table. What the transposition table does is it stores the best move reachable from a given game state as well as how far the tree was traversed below that state to achiece that best move. So when NegaScout researches through a tree and comes to a duplicate state, rather than looking below it, it simply has to look at the transpositon table to see what the best possible move is. Then it can return from that level of the tree.

In addition to holding the best move value and the depth of the tree that was searched to find that value, the transposition table may also hold the fail high, fail low and minimax values. That is, if using one of the null-window method described below, it is worthwhile to know whether the search failed high, failed low or found the minimax value. This information can be used at the next iteration of the algorithm(see below).

Transposition tables are often implemented using a hash table. This information I found easily. What I couldn't find much information on is how the values are hashed to the table. That is, I figured that there were a couple of different ways to hash each board sate that is reached. One way is to hash the positon number and piece count. But this method means that the entire board state (a 8x8 array) needs to be stored in the hash table and then checked against - since simply hashing the position number and piece count does not guarantee a unique hash key. Another method is to sum up all values on the board and hash that value - then only the best move and the level reached to get it are stored in the hash entry. This also guarantees a unique hash key since it accurately represents each unique board state.

//AlphaBeta with transition table (TT)
//move.f- = fail high value in TT (g is a lower bound on the true minimax value)
//move.f+ = fail low value in TT (g is an upper bound on the true minimax value)
int alphabeta'(move,alpha,beta){
	int g,c;
	if(retrieve(move)){		//look for move in TT and see if it was searched to sufficent depth
		if((move.f+ <= alpha)||(move.f+ == move.f-)) return move.f+;
		if(move.f- >= beta) return move.f-;
	if(move == leaf) move.f- = move.f+ = g = eval(move);			//maximum search depth reached
	else{
		g = -infinity;
		s[] = nextMoves(move);
		for(i=0;((g < beta)&&(i != depthlimit));i++){	//search until cutoff or all nodes are reached
			g = max(g,-alphabeta(s[i],-beta,-alpha));
			alpha = max(alpha,g);
		}
		//save data in TT
		if(g <= alpha) move.f+ = g;
		if(alpha < g < beta) move.f+ = move.f- = g;
		if(g >= beta) move.f- = g;
	}
	store(move);								//store in TT
	return g;
}

DUAL*/SSS*

Assuming the above alphabeta' algorithm that utilizes a transposition table and assuming a descendingly sorted open list (move ordering, as in the A* algorithm) it is possible to hone in on the true minimax value by doing one of vaious types of null-window alphabeta calls. One such example is SSS* which begins with R = infinity and then repeatedly making R smaller based on it failing low at the alphabeta call.

Many practitioners have shyed away from using SSS* because: some find it hard to adapt or understand since it implicity uses a sorted open list (as with the A* algorithm); is slow since it must maintain this sorted open list; and it is thought that the memory requirements increase exponentially with the search depth.

int SSS*(move){
	int R = infinity;
	int g;
	while(g != R){
		g = R;
		R = alphabeta'(move,g-1,g);
	}
	return R;
}
DUAL* is very similar to SSS* except that instead of searching in one direction (starting at infinity and working downward), it searches in two directions - hence the name DUAL*. This is accomplished by starting at -infinity instead of infinity, using an asceningly sorted list, and swapping max and min operations in alphabeta'.
int DUAL*(move){
	int R = -infinity;
	int g;
	while(g != R){
		g = R;
		R = alphabeta'(move,g,g+1);
	}
	return R;
}

*NegaScout

The NegaScout algorithm works alot like Alpha-Beta except that it will research trees it has already seen with a wider window. It works by doing one complete "wide window" search on the the first sub-sub tree. It has to do this to get alpha and beta values. After these values are found, beta is set up to be only one greater than alpha (a null window search). Then the next sub-trees are searched with these alpha and beta values. This method never expands a node that alpha-beta would otherwise cutoff. However, as noted above, it also requires researching the tree from time to time.

As noted above, NegaScout is a null-window search. As such, the three possible cases of return values for a null window search apply (see above under null-window search for more details). If NegaScout fails high, it will revisit the entire sub-tree with a wider search window. This window is still, however, smaller than an Alpha-Beta window. Alpha-Beta searches from -infinity to infinity (and then from alpha to beta), a "wide" NegaScout search is from the current beta value to the failed high value.

The two reasons why we would not research the tree are as follows: 1) if the backup values is correct (if b==beta); and 2) if the search is currently at the two lowest level of the tree (if maxdepth >= d-2). NegaScout has a "fail-safe" routine which allows it to generate correct values at the two lowest levels of the tree. In all other cases, we research the tree with a wider window.

A couple more important notes: NegaScout is less effective than Alpha-Beta on search trees of depth less than three. This is why the effects of NegaScout can't truly be seen in the example below (the search tree IS of depth three). Also, NegaScout benefits alot from a transposition table - especially one which orders its moves in best-first manner. This will generate many more cutoffs and less researches.

Here is a link to an Othello applet which utilizes the NegaScout search algorithm. However, it does not do so very well because it lacks a transition table. On the average, it performs about as well as regular Alpha-Beta - sometimes better, sometimes worse.

negascout tree
int NegaScout(move,alpha,beta,d){
	apply(move);						//make move given
	s[] = nextMoves(move);					//find available moves
	if(length(s)==0) return evaluation(p);			//leaf node
	a = alpha;
	b = beta;
	for(i=0;i <= w;i++){
		t = -NegaScout(s[i],-b,-a,d+1);
		if((t > a)&&(b!=beta)&&(d>=maxdepth-2))
			a = -NegaScout(s[i],-beta,-t,d);		//re-search
		a = max(a,t);
		if(a >= beta)					//cut-off	
			return a;				//set new null window
		b = a+1;
	}
	return a;
}

MTD(f)

The MTD (Memory-enhanced test) algorithm is a modified version of SSS* and DUAL*. It again uses the above alphabeta' algorithm. It works by taking a look at the return value after the first call to alphabeta'. This value will determine exactly which way the search for the true minimax value will go (at each iteration). If the returned value is a fail high (f-), then the search will act as though DUAL* were used. However, if the value is a fail low (f+), then the search will act as though SSS* were used. The claim by the MTD(f) literature is that SSS* starts off optimistic (thinks it has a higher value than the true minimax value), DUAL* starts off pessimistic (think is has a worse value than the true minimax value), while MTD(f) starts off realistic. This is a pretty fair claim since the algorithm outperforms both SSS* and DUAL*. In addition, MTD(f) has a better leaf node count, total node count and execution time than NegaScout. All three of these outperformances (over NegaScout) are by a margin greater than that which NegaScout outperforms the wide window AlphaBeta. The MTD(f) literature states that the SSS* and DUAL* algorithms make dozens and even hundreds of calls to the alphabeta' algorithm while MTD(f) only makes about 3 to 6.

To test the performance of the MTD(f) algorithm, its creators - Aske Plaat, Jonathan Schaeffer, Wim Pijls and Arie de Bruin - conducted what they claim to be one of the first real comparisons between AlphaBeta, SSS*, DUAL*, NegaScout and MTD(f). These three algorithms were all previously thought to be very different in the way they searched. However, they were all compared by using the exact same alphabeta' call (for SSS*, DUAL* and MTD(f)). The baseline for the comparisons was NegaScout - following the idea that most current game programs use NegaScout anyway, rather than a wide-window AlphaBeta. The most important comparison done was with leaf node count. The base line of NegaScout had 100 leaf nodes at all depths. AlphaBeta had a large number of leaf nodes - fluctuating between 102 and 110 - but performed best around depth 5. The DUAL* leaf node count fluctuated between 97 and 102 and performed best at depth 8. The SSS* leaf node count fluctuated between 91 and 104, performing best at depth 6. MTD(f) outperformed the rest - fluctuating between 87 and 100 and performing best at depth 6 (although still only getting 90 leaf nodes at depth 8). The correlation between leaf node count and execution time is that the fewer leaf nodes found, the more cutoffs there are. So the MTD(f) algorithm is outperforming the rest by a significant amount. Other comparisons done included total node count (between the same above algorithms) and execution time (between MTD(f) execution time, MTD(f) leaf node count and NegaScout leaf node count). Both comparisons still show MTD(f) performing ahead of the other four algorithms.

MTD(f) is currently used in MIT's parallel chess program, Cilkchess. This replaces the previous implementation of the program, StarSocrates, which searched via NegaScout.

int mtd(move,f,d){
	g = f;
	upper = infinity;
	lower = -infinity;
	while(upper < lower){
		if(g == lower) beta = g+1;
		else	beta = g;
		g = alphabeta'(move,beta-1,beta,d);
		if(g < beta) upper = g;
		else	lower = g;
	}
	return g;
}

ProbCut/Multi-ProbCut

Michael Buro developed this algorithm to work with his Othello program, Logistello - thought by many to be the absolute strongest Othello program ever written. The ProbCut algorithm, in short, works by looking at probability (as the name sugests). That is, it will go down a game tree to a given depth while expanding all nodes. Then it use the current vertex v' to estimate the true minimax value v and calculate the possibility that v is outside of the current alphabeta window. If it lies outside of the window, then the upperbound on the window is returned. If it is not outside of the window, then the search is continued up to the depth limit to return the true value v. This method requires no memory to execute (so transition tables are not needed). However, it requires that the program have a stable evaluation function. This guarantees a small variance between the estimated true value v and the actual true value v. Also, it can reach depths of 13 or 14 in the opening to midgame phases.

(more on Multi-ProbCut to come)

*Rule-Based Move Pruning

In class, we discussed using rules to form the basis for how one should make a move in the game. When a player first comes to Othello and they are learning how to play well, they begin to pick up certain things. What if we were to program these rules into the game, using them to narrow down our choice of which move to make? That is, when we make a call to the move generating function, simply have that function prune away some of its moves before returning them - this way, the rest of the program knows nothing of the existence of other possible functions.

The main rules that I tried programming centered around the first things a new Othello player would learn. First of all, one should always fight for the corner - gaining them means gaining a stable piece, and thus having more potential control over the entire board. Second, always try to grab the edges of the board. These too are a very stable area of the board and can cause one to flip many of their opponent's pieces at once - gaining more power on the board. Finally, don't do anything which will cause you to lose either of these two areas. This often means not playing near the corner when it could cost you the corner (and likewise for the edges).

The advantage to this method is that it yields an even greater speedup in the program (especially when used in conjunction with a search technique such as Alpha-Beta or NegaScout). This means that we can search even further down the tree in the same amount of time - thus yielding a better move for us. The disadvantage to this technique is that the rules I currently have programmed in are too simple to cover all of the bases for the game. Subsequently, sometimes the rules will causes a move to be pruned off that otherwise would have been chosen as the best move to make. Most times, however, they help to speed the search up enough to ensure that the user does not become impatient while they are gettin gbeaten by the game!

Rule-Based Alpha-Beta Tree
Example: Here, the depthlimit is three (the same tree as the examples above). All cuttoffs have been shown with red squiggly lines. Any move which is not shown to alpha beta is circled in green and cutoff with a red squiggly line. The backed-up values of alpha and beta are shown to make it easier to see how the search was done. Again, even though alpha-beta does not evaluate at every node, the evaluation values are shown.

Here is a link to an Othello applet which employs Alpha-Beta searching as well as rules to prune off unlikely moves.

My Othello Applets

"My" Othello Applets are really from an applet that I shamelessly borrowed from fasterlu. He implemented the game quite well. However, it only utilized the minimax search algorithm. And since the whole point of my research was to learn about alphabeta/advanced alphabeta techniques, I used his code for the front end of the game as well as for the evaluation function. I merely substituted my searching algorithm for his (as well as changing the colors).

*Thoughts

Throughout my research, I learned a lot about the various AlphaBeta techniques. AlphaBeta itself is pretty confusing when you try to sit down and program what you want it to do. The advanced techniques are even more confusing. This, however, should not dissuade anyone from trying to implement them - once the program works, it is well worth the time spent swearing at the computer, accusing it of working improperly.

The applet that I actually implemented and consider to work the best is the applet which uses rules to prune off unlikely moves as well as alpha-beta to search. Its a little ways away from where I initialy planned to be at the end of the semester. I planned on having one of the advanced alpha-beta methods up and running well. In fact, I do have one running - NegaScout. But without a transposition table, it really doesn't run "well." Nonetheless, I'm pretty happy with the rule-based applet. I also have some future plans for it this summer.

I plan on making the rules a little less basic - this will help to solve the problem of the applet working poorly in rare cases (but it should never work poorly). Again, the rules are way too basic right now for the gameplay (difficulty) to outperform normal alpha-beta in every instance. In addition, I would like to work some more on developing NegaScout. I had alot of trouble with the transposition table, and when the opportunity to try something else (rule-based reasoning) arose, I wanted to give that a shot. I'd rather have something decent up and running than fail on getting something excellent up and running. Also, if I have time, I would like to put my own evaluation function into the game. My research revolved around alph-beta, so it only made sense to me to not have to worry about other parts of the game working while I was trying to develop alpha-beta. However, the evaluation function is the only other part of the game that falls under the category of artificial intelligence - everything else just alters the board for searching or for moves made.

Links to other Othello Sources

*An Improvement of the Scout Tree Search Algorithm is a paper explaining how NegaScout works. It includes a couple good example search trees as well as a text experiment comparing NegaScout to AlphBeta.

Inside Reversi/Othello is an excellent website that has tons of background information on Othello. It lets you know how to get started writing the code as well as what the basic AI functions involved with the game are. Also has some useful links and an excellent Othello applet.

The inner workings of strong Othello Programs is another general info Othello site. However, this one does still have alot of good information. Again, some of it is basic AI technique, but that information can be useful when one doesn't understand the concepts.

Best First Search and Depth First Search in Practice discusses the memory concerns present in the AlphaBeta and SSS* algorithms.

MTD(f) A Minimax Algorithm Faster than NegaScout has alot of information about the MTD(f) algorithm. This site is written by one of the authors of the algorithm, but it gives a clearer (although less detailed) explanation of some of the ideas than the site below.

An Algorithm Faster Than NegaScout and SSS* in Practice is a site all about the MTD(f) algorithm. Its actually a published paper I believe. Either way, there is tons of information here - about MTD(f) as well as SSS*, DUAL*, alphabeta and NegaScout. Has running time comparisons between the various algorithms to prove that MTD(f) is the fastest.

A New Paradigm for Minimax Search tests three tournament programs using AlphaBeta, NegaScout, SSS* and MTD(f) to show that MTD(f) is the fastest

New Advances in Alpha Beta tree Searching gives an overview of some advanced techniques, namely SSS*. It also discusses the practical uses of advanced AlphaBeta techniques as well as the implementation concerns and usage of transposition tables.

Michael Buro created one of the greatest Othello programs ever, Logistello. This site has links to alot of his published works. Some are in German, but most of them aren't. All in all, tons of interesting information, including details on how he created Logistello and how it runs.