Move Generation using Bit boards (Connect-4)

Author

166

June 28, 2012 06:35 PM

Hi all,
I am facing some speed (performance) issues while generating moves for Connect 4.

Previously I wrote simple nested for-loops to generate the moves. Now I have tried to convert this to bitboards:
I found all the empty squares and ANDed them with the column bits (e.g. column1 = (1L<<1 | 1L<<10 ...)).
This gave me the empty bits in a particular column.
Then I found the MSB by right-shifting this until the number was 0 (a trick to find the MSB when it is a power of 2).
It gave me the correct answer, but surprisingly it was slower than the nested for-loops (the nested for-loops took 238 ms, whereas the bitboards took 1349 ms).

So then I tried another method, the folding trick as mentioned here.
[source lang="csharp"]x |= (x >> 1);
x |= (x >> 2);
x |= (x >> 4);
x |= (x >> 8);
x |= (x >> 16);
x |= (x >> 32);

for (int n = 53; n >= 0; n--)
    if (((1L << n) & (x & ~(x >> 1))) != 0)
        return n;[/source]
This too gave me slow results.
What am I doing wrong? I am sure bitboards should be faster than nested loops.
How can I achieve this without nested loops, with something like a De Bruijn sequence for the MSB (of a 64-bit number)?

-Thank you.

alvaro

21,607

June 29, 2012 12:34 AM

I would use consecutive bits to represent columns. Following the same convention as Fhourstones:

.  .  .  .  .  .  .
5 12 19 26 33 40 47
4 11 18 25 32 39 46
3 10 17 24 31 38 45
2  9 16 23 30 37 44
1  8 15 22 29 36 43
0  7 14 21 28 35 42

You can then generate moves as

u64 generate_moves() {
  u64 occupied = pieces[0] | pieces[1];
  // Bit n+1 sits directly above bit n, so shifting up one square is a left shift.
  return BOARD_MASK & (occupied << 1) & ~occupied;
}

When you need to loop over the moves, you do something like this:

  for (u64 moves = generate_moves(); moves; moves &= moves-1) {
    u64 move = moves & -moves;
    // `move' now has a bitboard with a single 1 in the position where you can move.
    // You can use the De Bruijn sequence trick if you want to convert it to an index.
  }
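
The index conversion mentioned in the comment can be sketched as follows. This is a minimal sketch, assuming the 7-bits-per-column layout shown above; `DEBRUIJN64`, `BitIndexTable`, `bit_index` and `move_columns` are names chosen for illustration, not part of the posted code. The lookup table is filled in at startup from the De Bruijn constant rather than typed in by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;

// A 64-bit De Bruijn sequence: every 6-bit window of it is distinct.
const u64 DEBRUIJN64 = 0x03f79d71b4cb0a89ULL;

// Lookup table built once from the constant (window value -> bit index).
struct BitIndexTable {
    int index[64];
    BitIndexTable() {
        for (int i = 0; i < 64; ++i)
            index[((u64(1) << i) * DEBRUIJN64) >> 58] = i;
    }
};

// Index (0..63) of the only set bit in `single_bit`.
int bit_index(u64 single_bit) {
    static const BitIndexTable table;
    assert(single_bit != 0 && (single_bit & (single_bit - 1)) == 0);
    return table.index[(single_bit * DEBRUIJN64) >> 58];
}

// Columns (0..6) of all moves in a bitboard that uses 7 bits per column.
std::vector<int> move_columns(u64 moves) {
    std::vector<int> columns;
    for (; moves; moves &= moves - 1) {      // clear the lowest set bit each pass
        u64 move = moves & (~moves + 1);     // isolate the lowest set bit (moves & -moves)
        columns.push_back(bit_index(move) / 7);
    }
    return columns;
}
```

With this layout the row of a move would be `bit_index(move) % 7`.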

ashish123

Author

166

June 30, 2012 05:58 AM

The Fhourstones representation is nice; it uses fewer bits than mine (which has borders).
I will use it in a second version, to compare with my own implementation.

I found the bug that was causing the delay: the generator considered a move on index 0, which lies on the border (so approximately 7 times more work, ouch!).

I used the Nalimov representation from the chess-programming wiki.
Generally what I do is take a 2D array, store all moves in the form moves[depth, move], and then access them by depth. I also keep another array that counts the number of moves at each depth, which is used for traversal.

The array representation helped me sort the moves based on the killer-move heuristic. But I also noticed that using arrays for storing moves seems to be slow (I may be wrong on this one; kindly correct me if I am). I am unable to see a way to sort killer moves first using only bitboards.

Here is what I did with my genMoves method.
[source lang="csharp"]public static void genMoves()
{
    // Empty squares, restricted to the board mask.
    long empty = ((~(xBits | yBits)) & bitBoard);

    int moveIndex = 0;

    // For each column, take the highest empty bit as the move.
    // A result of 0 (a border square) is treated as "no move in this column".
    moveIndex = findIndex((ulong)(empty & column1));
    if (moveIndex != 0)
        moves[depth, nPly[depth]++] = moveIndex;

    moveIndex = findIndex((ulong)(empty & column2));
    if (moveIndex != 0)
        moves[depth, nPly[depth]++] = moveIndex;

    moveIndex = findIndex((ulong)(empty & column3));
    if (moveIndex != 0)
        moves[depth, nPly[depth]++] = moveIndex;

    moveIndex = findIndex((ulong)(empty & column4));
    if (moveIndex != 0)
        moves[depth, nPly[depth]++] = moveIndex;

    moveIndex = findIndex((ulong)(empty & column5));
    if (moveIndex != 0)
        moves[depth, nPly[depth]++] = moveIndex;

    moveIndex = findIndex((ulong)(empty & column6));
    if (moveIndex != 0)
        moves[depth, nPly[depth]++] = moveIndex;

    moveIndex = findIndex((ulong)(empty & column7));
    if (moveIndex != 0)
        moves[depth, nPly[depth]++] = moveIndex;
}

// Index of the most significant set bit of bb: narrow down to one byte,
// then finish with a byte-sized lookup in ms1bTable.
public static int findIndex(ulong bb)
{
    int result = 0;
    if (bb > 0xFFFFFFFF)
    {
        bb >>= 32;
        result = 32;
    }
    if (bb > 0xFFFF)
    {
        bb >>= 16;
        result += 16;
    }
    if (bb > 0xFF)
    {
        bb >>= 8;
        result += 8;
    }
    return result + ms1bTable[(int)bb];
}
[/source]
There's a negligible improvement of just one second over the for-loops.
With arrays it is simpler to order the moves, but with bits it is quicker.
Can you show me a way to order the moves using the bit approach?

alvaro

21,607

June 30, 2012 11:08 AM

I can't write code for you following your square-to-bit convention, because I don't know what it is. However, my code shows you how you don't have to consider each column individually to find all the valid moves: Just compute `empties & shift_north(occupied)'. Then extract all the bits that are set, using a loop like the one I showed you.

I don't know of any way to sort moves other than putting them in an array first, but that shouldn't be slow at all.
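
A minimal sketch of that idea: extract the set bits into a small fixed-size array, then bring a killer move to the front. `MoveList`, `extract_moves` and `killer_first` are hypothetical names, and representing the killer as a single-bit bitboard is an assumption made for the example. Since Connect 4 has at most seven moves per position, the array never holds more than seven entries.

```cpp
#include <cstdint>

using u64 = std::uint64_t;

const int MAX_MOVES = 7;  // at most one move per column

struct MoveList {
    u64 moves[MAX_MOVES];  // each entry is a bitboard with a single bit set
    int count = 0;
};

// Copy every set bit of `move_bits` into the list, lowest bit first.
MoveList extract_moves(u64 move_bits) {
    MoveList list;
    for (; move_bits && list.count < MAX_MOVES; move_bits &= move_bits - 1)
        list.moves[list.count++] = move_bits & (~move_bits + 1);  // lowest set bit
    return list;
}

// If `killer` is in the list, move it to the front and keep the
// relative order of the other moves.
void killer_first(MoveList& list, u64 killer) {
    for (int i = 1; i < list.count; ++i) {
        if (list.moves[i] == killer) {
            for (int j = i; j > 0; --j)
                list.moves[j] = list.moves[j - 1];
            list.moves[0] = killer;
            return;
        }
    }
}
```

The same list can later be reordered by any other heuristic, since it is an ordinary array.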

ashish123

Author

166

July 01, 2012 04:59 PM

Just compute `empties & shift_north(occupied)'. Then extract all the bits that are set, using a loop like the one I showed you.

On an empty board, occupied will be 0, so according to the pseudo-code the only available move is 0.
Am I missing something? I adopted the Fhourstones structure for a while.

alvaro

21,607

July 01, 2012 09:06 PM

[quote name='alvaro' timestamp='1341054536' post='4954278']
Just compute `empties & shift_north(occupied)'. Then extract all the bits that are set, using a loop like the one I showed you.

On an empty board, occupied will be 0, so according to the pseudo-code the only available move is 0.
Am I missing something? I adopted the Fhourstones structure for a while.
[/quote]

Ooops! You are right. It's easily fixed, though: empties & (shift_north(occupied) | FIRST_ROW)
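
Spelled out for the Fhourstones-style layout from earlier in the thread, the fix might look like the sketch below. It assumes that bit n+1 sits directly above bit n, so that shift_north is a left shift by one. The way `FIRST_ROW` and `BOARD_MASK` are built here, and the two-argument form of `generate_moves`, are choices made for the example.

```cpp
#include <cstdint>

using u64 = std::uint64_t;

const int WIDTH = 7;        // columns
const int COLUMN_BITS = 7;  // 6 playable squares plus one spare bit per column

// Bit 0 of every column: 0, 7, 14, 21, 28, 35, 42.
u64 first_row() {
    u64 mask = 0;
    for (int col = 0; col < WIDTH; ++col)
        mask |= u64(1) << (col * COLUMN_BITS);
    return mask;
}

const u64 FIRST_ROW = first_row();
const u64 BOARD_MASK = FIRST_ROW * 0x3F;  // the six playable squares of each column

// A move is an empty square that is on the first row or directly above a piece.
u64 generate_moves(u64 pieces0, u64 pieces1) {
    u64 occupied = pieces0 | pieces1;
    u64 empties = BOARD_MASK & ~occupied;
    return empties & ((occupied << 1) | FIRST_ROW);
}
```

On an empty board this returns `FIRST_ROW`, i.e. one move per column, and a full column contributes nothing because its spare top bit is outside `BOARD_MASK`.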

ashish123

Author

166

September 07, 2012 04:13 PM

Hi again,
I was occupied with a few things, so I had to put this coding aside.
@alvaro: Apologies for the late reply; your trick did its job and is working nicely.

Thinking about this further, I think I can reduce the number of moves when a winning threat is present.
Say I have three in a row and the computer has 6 different moves; it is not practical to search every one of them, because my next move will be to complete the four.
First, can you please clarify whether this approach is correct? I have added some code to my make-move method to do this.
It is currently buggy (which I am not focusing on for now; I am concentrating on the concept). My aim is to discard as many search nodes as possible before adding more knowledge to the eval, since that would slow the search down.
Currently it is getting about 400 knps.
Please help me with this.



    long occupied = (xBits | yBits);
    long empty = ~occupied;
    long bitMoves = 0L;
    bitMoves = bitBoard & empty & ((occupied >> 9) | lastrow);

    //Find the forced moves
    long xThreats = 0L;
    long yThreats = 0L;

    // y: horizontal (step 1)
    yThreats |= ((yBits << 1) & (yBits << 2) & (yBits << 3) & empty & bitBoard);//XXX_
    yThreats |= ((yBits >> 2) & (yBits << 1) & (yBits >> 1) & empty & bitBoard);//X_XX
    yThreats |= ((yBits << 2) & (yBits << 1) & (yBits >> 1) & empty & bitBoard);//XX_X
    yThreats |= ((yBits >> 1) & (yBits >> 2) & (yBits >> 3) & empty & bitBoard);//_XXX

    // y: diagonal (step 10)
    yThreats |= ((yBits << 10) & (yBits << 20) & (yBits << 30) & empty & bitBoard);//XXX_
    yThreats |= ((yBits >> 20) & (yBits << 10) & (yBits >> 10) & empty & bitBoard);//X_XX
    yThreats |= ((yBits << 20) & (yBits << 10) & (yBits >> 10) & empty & bitBoard);//XX_X
    yThreats |= ((yBits >> 10) & (yBits >> 20) & (yBits >> 30) & empty & bitBoard);//_XXX

    // y: diagonal (step 8)
    yThreats |= ((yBits << 8) & (yBits << 16) & (yBits << 24) & empty & bitBoard);//XXX_
    yThreats |= ((yBits >> 16) & (yBits << 8) & (yBits >> 8) & empty & bitBoard);//X_XX
    yThreats |= ((yBits << 16) & (yBits << 8) & (yBits >> 8) & empty & bitBoard);//XX_X
    yThreats |= ((yBits >> 8) & (yBits >> 16) & (yBits >> 24) & empty & bitBoard);//_XXX

    // x: horizontal (step 1)
    xThreats |= ((xBits << 1) & (xBits << 2) & (xBits << 3) & empty & bitBoard);//XXX_
    xThreats |= ((xBits >> 2) & (xBits << 1) & (xBits >> 1) & empty & bitBoard);//X_XX
    xThreats |= ((xBits << 2) & (xBits << 1) & (xBits >> 1) & empty & bitBoard);//XX_X
    xThreats |= ((xBits >> 1) & (xBits >> 2) & (xBits >> 3) & empty & bitBoard);//_XXX

    // x: diagonal (step 10)
    xThreats |= ((xBits << 10) & (xBits << 20) & (xBits << 30) & empty & bitBoard);//XXX_
    xThreats |= ((xBits >> 20) & (xBits << 10) & (xBits >> 10) & empty & bitBoard);//X_XX
    xThreats |= ((xBits << 20) & (xBits << 10) & (xBits >> 10) & empty & bitBoard);//XX_X
    xThreats |= ((xBits >> 10) & (xBits >> 20) & (xBits >> 30) & empty & bitBoard);//_XXX

    // x: diagonal (step 8)
    xThreats |= ((xBits << 8) & (xBits << 16) & (xBits << 24) & empty & bitBoard);//XXX_
    xThreats |= ((xBits >> 16) & (xBits << 8) & (xBits >> 8) & empty & bitBoard);//X_XX
    xThreats |= ((xBits << 16) & (xBits << 8) & (xBits >> 8) & empty & bitBoard);//XX_X
    xThreats |= ((xBits >> 8) & (xBits >> 16) & (xBits >> 24) & empty & bitBoard);//_XXX

    if ((((yThreats | xThreats) & bitMoves) != 0))
    {
        bitMoves = bitMoves & (yThreats | xThreats);// play on the threatened empty square only.
    }

My board structure is
00 | 01 02 03 04 05 06 07 | 08
09 | 10 11 12 13 14 15 16 | 17
18 | 19 20 21 22 23 24 25 | 26
27 | 28 29 30 31 32 33 34 | 35
36 | 37 38 39 40 41 42 43 | 44
45 | 46 47 48 49 50 51 52 | 53
where | is the border.
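
The threat code above repeats the same four patterns for each direction, so it can be folded into a loop over the step sizes. The following is a sketch under stated assumptions, not a drop-in replacement: it uses unsigned 64-bit values in C++ instead of C# `long`, rebuilds the board mask from the diagram above, and adds a step of 9 (vertical), which the posted code does not check. `BOARD`, `LAST_ROW`, `threats`, `legal_moves` and `restrict_to_threats` are illustrative names.

```cpp
#include <cstdint>

using u64 = std::uint64_t;

const int STRIDE = 9;  // 7 playable columns plus one border column on each side

// Playable squares: rows 0..5, columns 1..7 of the 9-wide layout.
u64 playable_mask() {
    u64 mask = 0;
    for (int row = 0; row < 6; ++row)
        for (int col = 1; col <= 7; ++col)
            mask |= u64(1) << (row * STRIDE + col);
    return mask;
}

const u64 BOARD = playable_mask();
const u64 LAST_ROW = BOARD & (~u64(0) << 45);  // bits 46..52, the bottom row

// Empty squares that would complete four in a row for `own`.
// The border columns are never occupied, so shifted lines cannot wrap around.
u64 threats(u64 own, u64 empty) {
    static const int steps[] = {1, 9, 8, 10};  // horizontal, vertical, two diagonals
    u64 result = 0;
    for (int d : steps) {
        result |= (own << d) & (own << 2 * d) & (own << 3 * d);  // XXX_
        result |= (own << d) & (own << 2 * d) & (own >> d);      // XX_X
        result |= (own << d) & (own >> d) & (own >> 2 * d);      // X_XX
        result |= (own >> d) & (own >> 2 * d) & (own >> 3 * d);  // _XXX
    }
    return result & empty & BOARD;
}

// Empty squares on the bottom row or directly above a piece.
u64 legal_moves(u64 x, u64 y) {
    u64 occupied = x | y;
    return BOARD & ~occupied & ((occupied >> STRIDE) | LAST_ROW);
}

// If any legal move sits on a threatened square, keep only those moves.
u64 restrict_to_threats(u64 moves, u64 x, u64 y) {
    u64 empty = ~(x | y);
    u64 forced = moves & (threats(x, empty) | threats(y, empty));
    return forced ? forced : moves;
}
```

For example, with x on squares 46, 47 and 48 the only threatened square is 49, because square 45 is a border square and is masked out.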

alvaro

21,607

September 07, 2012 04:31 PM

You don't need two columns for padding, but that doesn't really matter.

I spent a lot of time in the early 90s writing a connect 4 program. What I did at the time was make the move generator smart enough to only allow you to win if a win is present, and to only allow you to block an opponent's threat if one is present. My friends and I were playing a lot of connect 4 at the time, and we actually used rules similar to chess, where it is considered illegal to expose your king. So (during the search) my move generator also didn't let a player play right under an opponent's threat, because that results in immediate victory for the opponent.

Oh, I also extended the depth for forced moves (where only one move is legal, with the definition above). This makes the program stronger in tactics, but nowadays I would prefer to test this potential improvement more scientifically, actually playing thousands of games to see if it really helps.
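
The three move-generator rules described above could be sketched as a filter over an already generated move set. This is only an illustration: it assumes the 7-bits-per-column layout in which bit n+1 is directly above bit n, it takes the threat bitboards as inputs without saying how they are computed, and falling back to all moves when none is safe is an assumption added for the example. `smart_moves` is a hypothetical name.

```cpp
#include <cstdint>

using u64 = std::uint64_t;

// moves:       legal moves for the side to play (one bit per column at most)
// my_threats:  empty squares that would complete four for the side to play
// opp_threats: empty squares that would complete four for the opponent
u64 smart_moves(u64 moves, u64 my_threats, u64 opp_threats) {
    u64 wins = moves & my_threats;
    if (wins)
        return wins;                         // a win is present: only allow winning

    u64 blocks = moves & opp_threats;
    if (blocks)
        return blocks;                       // otherwise only allow blocking

    // Squares directly below an opponent's threat hand the opponent the win.
    u64 safe = moves & ~(opp_threats >> 1);
    return safe ? safe : moves;              // assumption: if nothing is safe, allow everything
}
```

When the result has exactly one bit set, the move is forced in the sense used above, which is the case where the search depth was extended.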

A matter of style:

  long bitMoves = 0L;
  bitMoves = bitBoard & empty & ((occupied >> 9) | lastrow);

Why is the code above not simply this?

  long bitMoves = bitBoard & empty & ((occupied >> 9) | lastrow);

Actually, I think my compiler (gcc) would complain about initializing a variable to a value that is never used.

ashish123

Author

166

September 08, 2012 09:12 AM

@alvaro, thanks for the reply.
From it I gather that I was on the right track.
Now, for a trial position after 8 plies, my program solves the game in about 5 minutes.
Later on I shall implement iterative deepening and post an update on my progress.

ashish123

Author

166

September 09, 2012 12:33 PM

Hi again,
I implemented iterative deepening with the history heuristic, but somehow the time taken with iterative deepening is far greater than normal.
I think this happened because of the high search depth (20+), as a lot of sorting is done for each move. I also read a few papers which said that the performance of the history heuristic decreases at higher depths, where the depth is above 7 or 8.
