Jon Thysell

Xbox engineer. Fiction writer. Returned Peace Corps Volunteer. Ukulele nut. Nerd.

Creating an AI to play Hive with Mzinga, Part IV

Mzinga is my attempt at an open-source AI for the board game Hive. So far in this series of posts, I’ve given an overview of the project, a dive into Mzinga’s AI, and then a deeper dive into Mzinga’s board evaluation function. Altogether, I have quite the framework for executing a decent Hive AI, but the question remains: How do I make it better?

How do people get better at games? Study and competition. Trial and error. Strategies that win us games are remembered and repeated, while plays that lose us games are remembered so that we don’t repeat them. But in a game with so many things to consider, where there is no “right” move each turn, you can’t just memorize the answer; you have to actually play.

What are the things to consider? The metrics I talked about in the last post, like how many moves are available and which pieces are surrounded or pinned. Mzinga looks at some ~90 different metrics. Each has a corresponding weight, which measures how much that metric should impact the board’s score.

How an AI plays is largely determined by its metric weights. AIs with different weights will make different decisions about which move to play. Creating the best AI, then, is an exercise in determining the best metric weights. Since we don’t know what the best numbers are, our best option is to:

  1. Create a bunch of AIs with different metric weights
  2. Have them play a ton of games against one another
  3. Pick the one that proves that it’s the best

To pick the best one, I need a way of rating each AI. So I followed in chess’s footsteps and adopted the Elo rating system. It does a few things I like:

  1. Each player has only one number to keep track of, so it’s easy to see who’s the “best”
  2. A player’s rating is just an estimate of how strong the player is:
    1. It goes up and down as they win and lose and their rating is recalculated
    2. Points are taken from the loser and given to the winner, so there’s a finite amount of “rating points” to go around
    3. More games mean a more accurate estimate of their strength, so no one game can make or break a player’s rating
  3. When recalculated, a player’s rating takes into consideration who was playing:
    1. A good player beating a bad player is not news, so the winner only takes a few points from the loser, i.e. players can’t easily inflate their ratings by beating awful players
    2. A bad player beating a good player is an upset, so to reflect that the ratings were wrongly estimated, the winner gets lots of points from the loser
    3. Two players with similar ratings are expected to draw, so if one wins, a medium amount of points is transferred to separate their ratings a little more accurately

Now, the Elo system isn’t perfect. The biggest issue is that it only shows the relative strength of the players involved – you need lots and lots of players participating for the ratings to mean anything. There are lots of criticisms and variants when it comes to rating real people, but it’s fine for what we need it for.
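If you’re curious, the math behind an Elo update fits in a few lines. Here’s a minimal sketch in C#, using the textbook formula – the constants (like the K-factor of 32) are common defaults, not necessarily what the Mzinga Trainer uses:

```csharp
using System;

static class EloRating
{
    // Standard Elo update: points move from the loser to the winner, scaled by
    // how surprising the result was. K caps the swing from any single game.
    public static void Update(ref double winnerRating, ref double loserRating, double k = 32.0)
    {
        // Expected score for the winner, given the rating difference.
        double expected = 1.0 / (1.0 + Math.Pow(10.0, (loserRating - winnerRating) / 400.0));

        // The winner's actual score was 1, so the transfer is K * (1 - expected):
        // small when the favorite wins, large when the underdog pulls an upset.
        double transfer = k * (1.0 - expected);

        winnerRating += transfer;
        loserRating -= transfer;
    }
}
```

With two equally-rated players, a win moves 16 points; a 1200-rated player beating a 1600-rated player would move about 29 – exactly the “upsets cost more” behavior described above.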

So now we have a method for finding the best player in a population of AIs. Create a bunch of AIs, have them fight one another to improve the accuracy of their Elo rating, and pick the one with the highest rating. But like I said earlier, Elo ratings only show relative strength in a given population. What if I don’t have a good population? What if 99% of them are so awful that an AI with some basic strategy completely dominates? That’s no guarantee that the winning AI is actually all that good.

How do we improve a population of players so that we can be sure that we’re getting the best of the best?

It turns out we’re surrounded by an excellent model on how to make a population of something better and better: natural selection and evolution.

In the real world, living things need to fight to survive. The creatures that can’t compete die off, while those with the best traits survive long enough to pass on their traits to their offspring. Offspring inherit a mix of their parents’ traits, but they’re more than the sum of their parts. New DNA comes into the population as new members join or mutations occur.

We can absolutely simulate evolution in our population of AIs. The life-cycle goes something like this:

  1. Create a bunch of different AIs with different weights (traits)
  2. Have them all fight each other to sort out a ranking of who’s the strongest
  3. Kill off the weakest AIs
  4. Have some of the remaining AIs mate and produce new AIs to join the population
  5. Repeat 1-4

Now the first three steps seem pretty straightforward, but AI mating? It’s actually not that hard to implement. For each metric weight, simply give the offspring the average value from the two parents. To make sure that an AI’s children aren’t all identical, add a little variance, or mutation, by tweaking the result a little bit.

For example, if parent A has a particular weight of 5, and parent B a weight of 3, rather than giving every child the weight of 4, give them each something random between 3.9 and 4.1. Just like in real life, we don’t necessarily know which traits were the most important in making sure that the parent survived, and we don’t get to pick which traits get passed on. So we pass on everything, good and bad, and let life (in this case lots of fighting) determine who’s better overall.
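In code, that mating step is only a few lines. Here’s a minimal sketch (the names are illustrative, not Mzinga’s actual API):

```csharp
using System;

static class Breeding
{
    static readonly Random Rand = new Random();

    // Produce a child's weights by averaging the parents' weights,
    // then nudging each result by a small random mutation.
    public static double[] Mate(double[] parentA, double[] parentB, double mutation = 0.1)
    {
        double[] child = new double[parentA.Length];
        for (int i = 0; i < child.Length; i++)
        {
            double average = (parentA[i] + parentB[i]) / 2.0;
            // Nudge by up to +/- mutation, so siblings aren't identical clones.
            child[i] = average + (Rand.NextDouble() * 2.0 - 1.0) * mutation;
        }
        return child;
    }
}
```

With weights of 5 and 3 and a mutation of 0.1, each child lands somewhere between 3.9 and 4.1, just like the example above.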

Now we can start a new generation and have them all start fighting again, so we can suss out everyone’s rankings in the presence of these (presumably better) offspring. Add in some new AIs every now and then to prevent too much inbreeding (where one AI’s traits, good and bad, start to dominate and everyone starts looking more and more like clones of one another) and we now have a true population of AIs, all fighting to be the best.
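To make the whole cycle concrete, here’s what one generation might look like in code. Every name here is hypothetical (Mate is the sketch from above, and the real Mzinga Trainer has far more knobs):

```csharp
using System;
using System.Collections.Generic;

// One generation of the life-cycle. Profile and PlayMatches are hypothetical.
static void RunGeneration(List<Profile> population, int gamesPerProfile)
{
    // Fight: play a batch of games, updating each profile's Elo rating.
    PlayMatches(population, gamesPerProfile);

    // Cull: sort by rating and kill off the weakest half.
    population.Sort((a, b) => b.EloRating.CompareTo(a.EloRating));
    int survivors = population.Count / 2;
    population.RemoveRange(survivors, population.Count - survivors);

    // Mate: pair up random survivors to refill the population.
    var rand = new Random();
    while (population.Count < 2 * survivors)
    {
        Profile parentA = population[rand.Next(survivors)];
        Profile parentB = population[rand.Next(survivors)];
        population.Add(new Profile(Breeding.Mate(parentA.Weights, parentB.Weights)));
    }
}
```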

Now, how exactly am I doing all of this?

With the Mzinga Trainer.

It’s a simple, yet highly configurable little tool for creating, maintaining, and evolving a population of AIs. I started with several seeds of simple AI weights that I handcrafted, plus a slew of randomly generated ones. Then I set up different populations on different computers with different parameters, and have had them fighting, dying, and mating with each other for over a week.

It has made Mzinga into one of my more interesting projects. As I’ve made improvements to the tool, I’ve spun up new populations, mixed existing ones, and run life-cycles with different parameters. Some run under really harsh conditions, where so many get killed that the re-population comes from the offspring of very few survivors. When I noticed that one small population had become so inbred as to appear like clones of one another, I added in some new blood, so to speak. Then I reduced the harshness to give different AIs, and ultimately different play styles and strategies, a chance to survive and procreate without getting mercilessly killed off.

It’s an ongoing process, and like life itself, something that takes a lot of time to happen.

The Mzinga Trainer tool is included with the install of Mzinga. Now, not only can you try your hand against the Mzinga AI I’ve created, but you can even try to make a better AI of your own. So come on and try out Mzinga today!

/jon

Creating an AI to play Hive with Mzinga, Part III

In the first post in this series, I introduced Mzinga, my attempt at an open-source AI for the board game Hive. Then in my last post, I gave a high-level overview of how the Mzinga AI works. Now I’d like to discuss Mzinga’s board evaluation function, which is at the core of the Mzinga AI.

As I said in the last post, the purpose of the board evaluation function is to determine, based on the current state of the board, which player is “ahead” in the game. The function returns a number from negative infinity to positive infinity, where negative infinity means the current player has lost, and positive infinity means the current player has won.

The first thing that needs to happen is collecting some information about the state of the board. So I go through the board and calculate a variety of metrics – that is to say – I count a bunch of situations. For example, for each piece on the board, I count how many moves it can make, how many neighbors it has touching it, and flag whether or not it’s “pinned” (can’t move because other pieces won’t allow it).

I also add those numbers up for each player – how many pieces does the player have in play vs. still in their hand, how many pieces do they have that are pinned, and how many total moves they have available.
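Conceptually, the collected metrics look something like this (illustrative names, in the spirit of the metrics Mzinga actually tracks, not its exact code):

```csharp
// Per-piece metrics, counted by walking the board.
class PieceMetrics
{
    public int ValidMoveCount;  // how many moves this piece can make
    public int NeighborCount;   // how many pieces are touching it
    public bool IsPinned;       // can't move because other pieces won't allow it
}

// Per-player rollups of the piece metrics.
class PlayerMetrics
{
    public int PiecesInPlay;      // pieces on the board
    public int PiecesInHand;      // pieces not yet played
    public int PinnedPieceCount;  // pieces that can't move
    public int ValidMoveCount;    // total moves available this turn
}
```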

I do all of this for both players, and it takes a while, relatively speaking. In chess you have the benefit of a set board size, and despite the enormous number of positions that the board could be in, it’s still in a square grid, which can benefit from some fancy programming (see Bitboards) to make it computer friendly and faster.

Hive, on the other hand, is played on a potentially infinite grid of hexes. A typical chess position might offer a player some 25 valid moves – in Hive I’ve seen upwards of 150.

Anyway, once I have calculated a board’s metrics, the next step is to determine what they mean. So I know Black has more pieces pinned than White, but White’s Queen Bee is starting to get surrounded, who is ahead? Black still has more pieces to play, but White has a Beetle working its way toward a key spot, who is ahead? How do I evaluate all of the different situations against each other?

The answer is to use a weighted score. Basically, I take each of the metrics and multiply the number by its matching weight. The weight can be any number, positive or negative, but every metric has its own matching weight, with bigger weights meaning that the matching metric has more impact on the final score.

So for example, if I think that having more available moves each turn is better for a player, then for the metric Player.ValidMoveCount, I might give a high positive value to Player.ValidMoveWeight. Since the object of the game is to surround the opponent’s Queen Bee, for the Opponent.QueenBee.NeighborCount metric I might also set the corresponding Opponent.QueenBee.NeighborWeight very high. And since I don’t want to leave my Queen Bee unprotected, for my own Player.QueenBee.NeighborCount metric, I would want to give a negative value to Player.QueenBee.NeighborWeight.

My board evaluation function simply adds all of these products (metric x metric weight) together to get the board’s final score. With this kind of generic function, the question now becomes: What are the weights that make up the best AI player? The more accurately the board score actually describes how far ahead or behind a player is, the better the AI has a chance of making moves that keep it ahead until it wins.
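In code, that boils down to a dot product of two arrays. A minimal sketch (illustrative, not Mzinga’s exact implementation):

```csharp
// The board's score is the sum of (metric x matching weight) over every metric.
// Positive weights reward a metric; negative weights punish it.
public static double EvaluateBoard(double[] metrics, double[] weights)
{
    double score = 0.0;
    for (int i = 0; i < metrics.Length; i++)
    {
        score += metrics[i] * weights[i];
    }
    return score;
}
```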

So, what are the best weights? I don’t know. No one does. Even in a game like chess, with centuries of study, where players have a much better grasp of which pieces are the most important, and which positions are better than others, there’s still no perfect, definitive answer. Great, sophisticated starting points, sure, but not perfect.

Now, for a young game like Hive, the starting point may be as simple as “surround the enemy queen but also keep your queen from getting surrounded,” with some simple guesses for those weights, ignoring everything else on the board. And in fact, that’s where the Mzinga AI is at today. Smart enough to actually play the game toward winning, and much better than just randomly picking moves.

So how do we make it better? How do we get better weights?

Here is where the “real” fun begins, and that’s a topic for the next post. So stay tuned for Part IV, where I go more into detail about how I’m trying to improve the metric weights.

Try out Mzinga today!

/jon

Creating an AI to play Hive with Mzinga, Part II

In my last post, I introduced Mzinga, my attempt at an open-source AI for the board game Hive. In this post, I want to go through how the Mzinga AI works and what I’m doing to make it better.

Like many chess AIs, the algorithm I used for the Mzinga AI is called Minimax. At a high level, the decision making goes something like this:

  1. Determine how many moves ahead I want to look (let’s say 2).
  2. List out all of the valid moves that I, the current player could play.
  3. Try a move.
  4. After I’ve tried a move, calculate the board’s “score” with an evaluation function. Higher (positive) numbers mean I’m doing better than my opponent, and positive infinity means that I’ve won the game. Lower (negative) scores mean my opponent is doing better than me, and negative infinity means my opponent has won the game. (We’ll come back to this in Part III.)
  5. If the game isn’t over, list out all of the moves that my opponent could play at this point.
  6. Try a move.
  7. Again, calculate the board’s “score” with the evaluation function.
  8. Repeat steps 6-7 for each of my opponent’s possible moves (from that 2nd turn)
  9. Repeat steps 3-8 for each of my original possible moves (from that 1st turn)

The algorithm is called Minimax, because for each player’s turn, the AI assumes that the player will take the move that gets them the best score. I’m called the “maximizing” player because for my turns, I’m going to take the move that maximizes the score. My opponent is called the “minimizing” player, because I assume they’re going to take the move that minimizes the score.

By looking ahead, I can modify the score of an earlier move – i.e. a move on my first turn might look great for me, but it may turn out that it opens up a really great move for my opponent. Since I assume they’re always going to take the best move, I replace the score of my move with the score of the move they’re most likely going to take. That way I don’t take the move that will screw me later.

After all of the calculations are done, I look at the original set of moves available to me, and I pick the one with the best score and actually play it.
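If you prefer code to prose, the whole dance looks something like this sketch. Board, Move, GetValidMoves, TryMove, UndoMove, and Evaluate are hypothetical stand-ins for whatever your engine provides, not Mzinga’s actual API:

```csharp
using System;

// Bare-bones minimax. "maximizing" is true on my turns, false on my opponent's.
// Board, Move, and the helper methods are hypothetical stand-ins.
static double Minimax(Board board, int depth, bool maximizing)
{
    if (depth == 0 || board.GameIsOver)
    {
        return Evaluate(board); // +infinity if I've won, -infinity if I've lost
    }

    double best = maximizing ? double.NegativeInfinity : double.PositiveInfinity;

    foreach (Move move in board.GetValidMoves())
    {
        board.TryMove(move);
        double score = Minimax(board, depth - 1, !maximizing);
        board.UndoMove(move);

        // Assume each player takes the move that's best for them.
        best = maximizing ? Math.Max(best, score) : Math.Min(best, score);
    }

    return best;
}
```

To actually pick a move, I’d call this once for each of my valid moves and play the one with the highest score.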

Now, a key part of this working is the part where I “calculate” a board’s score. I’ll get into that next post, but there are a few more things to note. First, doing that calculation, whatever it is, usually takes some time. So the more moves you look at (whether because a player has lots of moves to choose from, or because you’re searching more moves ahead), the longer it takes to come to a decision. So a lot of science goes into reducing how long it takes to calculate a board’s score, and how to avoid looking at some moves altogether.

To reduce how long it takes to calculate a board’s score, I use a Transposition Table. Basically, since I’m playing a game where you take turns moving one piece at a time, there’s a good chance that I’m going to end up in the same layout of pieces, or position, at some point in the future. So, as I’m going along calculating board scores, I save them off. That way, when I see a position that I’ve seen before, I can just reuse the value I’ve saved. As the game progresses, that table gets bigger and bigger, and I have more saved-off values to reuse.
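A transposition table can be as simple as a dictionary keyed on a hash of the position. Another sketch (real engines typically use Zobrist hashing and store extra data like search depth, but this shows the idea; PositionHash is hypothetical):

```csharp
using System.Collections.Generic;

// Cache of board scores keyed by a hash of the position. When the same
// position shows up again via a different move order, reuse the saved score.
static readonly Dictionary<ulong, double> TranspositionTable = new Dictionary<ulong, double>();

static double EvaluateWithCache(Board board)
{
    ulong key = board.PositionHash; // hypothetical hash of the current position

    if (!TranspositionTable.TryGetValue(key, out double score))
    {
        score = Evaluate(board); // the expensive calculation
        TranspositionTable[key] = score;
    }

    return score;
}
```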

To reduce how many moves I look at, I use Alpha-Beta Pruning. Essentially, it means that if I’m looking through a player’s moves, there are times when I know that there’s no point in looking anymore. If I do move A and while looking through my opponent’s possible responses I see a move that wins the game for them, I don’t need to look at any more moves that would come after playing move A. I assume that playing move A will lose me the game.
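Bolted onto the minimax sketch above, the pruning is just two extra parameters and an early exit. Again, a sketch:

```csharp
using System;

// Minimax with alpha-beta pruning. alpha is the best score the maximizing
// player can already guarantee; beta is the best the minimizing player can.
// Once they cross, the remaining moves can't affect the outcome.
static double AlphaBeta(Board board, int depth, double alpha, double beta, bool maximizing)
{
    if (depth == 0 || board.GameIsOver)
    {
        return Evaluate(board);
    }

    foreach (Move move in board.GetValidMoves())
    {
        board.TryMove(move);
        double score = AlphaBeta(board, depth - 1, alpha, beta, !maximizing);
        board.UndoMove(move);

        if (maximizing) alpha = Math.Max(alpha, score);
        else beta = Math.Min(beta, score);

        if (alpha >= beta)
        {
            break; // prune: no point looking at the rest of these moves
        }
    }

    return maximizing ? alpha : beta;
}
```

The first call starts with alpha at negative infinity and beta at positive infinity.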

The final thing I do is something called an iterative search. The deeper the depth of your search, the exponentially longer it takes. The game wouldn’t be very fun to play if the AI takes hours to decide what to do. Conversely, if the depth is too low, then the AI will be considerably weaker, since it doesn’t “think ahead”. So instead we do an iterative search.

It turns out that Alpha-Beta Pruning is most effective when you look at the “best” moves for each turn first. Only, if we already knew the best moves, we wouldn’t need to search! To help Alpha-Beta pruning out, what I do is:

  1. Search to a depth of one, i.e. just looking at my move and not my opponent’s response.
  2. Sort my moves from best to worst.
  3. Search again but now to a depth of two, looking at both my moves and my opponent’s responses.
  4. Sort my moves again from best to worst.
  5. Search again but now to a depth of three, looking at my moves, my opponent’s responses, and my next move.
  6. And so on, and so on…

Instead of having a set depth to search, I set a time limit on my search, say five seconds. I start searching deeper and deeper, each time re-sorting the moves better and better, until I run out of time. Combined with the fact that I’m saving off board scores as I go, each search iteration gets faster and faster as I see the same board positions over and over.
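Put together, the iterative search loop might look like this sketch (hypothetical names again; a real implementation would also bail out mid-iteration when the clock runs out):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Iterative search: search depth 1, sort, search depth 2, sort, and so on,
// until the time budget runs out. The best-first sort feeds better move
// ordering (and thus better pruning) into each deeper pass.
static Move FindBestMove(Board board, TimeSpan timeLimit)
{
    var timer = Stopwatch.StartNew();
    List<Move> moves = board.GetValidMoves().ToList();

    for (int depth = 1; timer.Elapsed < timeLimit; depth++)
    {
        var scored = new List<(Move move, double score)>();

        foreach (Move move in moves)
        {
            board.TryMove(move);
            // After my move it's my opponent's turn, so they minimize.
            double score = AlphaBeta(board, depth - 1,
                double.NegativeInfinity, double.PositiveInfinity, maximizing: false);
            board.UndoMove(move);
            scored.Add((move, score));
        }

        // Sort best-first so the next, deeper iteration prunes more effectively.
        moves = scored.OrderByDescending(s => s.score).Select(s => s.move).ToList();
    }

    return moves[0]; // the best move from the deepest completed search
}
```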

Now, a lot of code is spent making these operations go as fast as possible, but at the end of the day, everything hinges on how good your AI is at evaluating a board’s position. The faster and more accurately you can rate who’s “ahead”, the stronger your AI will be.

Stay tuned for Part III, where I go more into detail about how the AI evaluates board positions and how I’m trying to improve it.

Try out Mzinga today!

/jon

Update (15-JUL-2016): Creating an AI to play Hive with Mzinga, Part III is up.

Creating an AI to play Hive with Mzinga, Part I

I first started playing chess seriously in college, and I learned just enough to realize that I wasn’t going to be able to commit to the kind of study that makes a great chess player. I even built a super weak chess AI in Matlab that just made random moves. More recently I’ve gotten back into playing over at Chess.com (hit me up for a game), and when I was looking for a side project last winter, I peered back into the deep and treacherous waters of chess AI programming.

But I’d already built a chess AI (dumb as it was) and wanted to do something new. Then it occurred to me that I could try to make an AI for another game I love to play, Hive.

What is Hive? Hive is an abstract board game (like chess) where the pieces are different kinds of bugs. Each bug moves in a certain way, and the goal is to surround your opponent’s Queen Bee before they surround yours. Like chess in many ways, but simpler and more approachable for many people.

So I started looking around to see if anyone had already created any Hive AIs. While there is some software for playing Hive (play Hive online at Boardspace, with the PC game from BlueLine Games, or via various mobile apps) and that software usually has an AI player to play against, I couldn’t find anyone specifically trying to make a strong Hive AI player.

The chess world on the other hand, is loaded with people trying to make better chess AIs. The amount of available chess AIs (see this list of chess engines for example) and publicly available research material (see the Chess Programming Wiki) is simply astonishing.

One of the coolest aspects of it is the separation of the chess engine (where the AI does its work) and the UI (which just displays the game). There are well established protocols that chess engines can implement (such as the Chess Engine Communication Protocol or the Universal Chess Interface) to specify how you ask a chess engine to evaluate a board and pick the best move. This means that by implementing these protocols, chess programmers don’t have to agree on what programming language to use, or waste time making pretty interfaces. They can focus on the guts of their engine, using whatever tools they want, and they’ll be able to participate in games against both humans and other chess engines.
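To give a flavor of it, here’s an abbreviated exchange in the Universal Chess Interface. The engine name and moves are made up, but the commands are standard UCI:

```
UI:     uci
Engine: id name SomeEngine
Engine: uciok
UI:     isready
Engine: readyok
UI:     position startpos moves e2e4
UI:     go movetime 5000
Engine: bestmove e7e5
```

The UI never needs to know how the engine decided on its move, and the engine never needs to draw a board.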

It’s with that model in mind that I started work on Mzinga.

Mzinga 0.9 Screenshot

Mzinga is an open-source AI to play the board game Hive. It is architected similarly to various chess-playing AIs, consisting of:

  • A set of standardized commands similar to the Universal Chess Interface
  • A command-line reference engine, which accepts the standardized commands
  • A general board UI which can interface with any engine that implements the standardized commands

The goal is not to simply implement Hive in code, but to establish a standard way for programmers to create their own AI players. The hope is that this will encourage explorations around Hive AI.

It’s written in C# (see the Mzinga source on GitHub) and runs on Windows 7 and above. You can play player vs. player, player vs. cpu, or cpu vs. cpu. It’s very beta – the included AI is not very strong, and the UI is a little bare, but it is very playable. Check it out and let me know what you think!

Stay tuned for Part II, where I go more into detail about how the AI works and what I’m doing to improve it.

Try out Mzinga today!

/jon

Update (14-JUL-2016): Creating an AI to play Hive with Mzinga, Part II is up.

Consolidating the Sega Genesis, Sega CD, and Sega 32x power supplies

As I’ve mentioned before, the Sega Genesis was my first and favorite childhood gaming console, and though over the years I’ve come to rely on emulators, re-releases and clone systems to get my Sega nostalgia, I recently decided to set up some original hardware.

Whether you’re new to the system, or a long-time fan, the holy trinity is to get a Sega Genesis along with its two biggest add-ons, the Sega CD and the Sega 32X. Now, one of the major annoyances is that each has its own wall-wart power supply, and they’re a pain to use: they’re hard to get situated on power strips due to their size and they generate a lot of heat, even with nothing turned on.

I thought there had to be a way to have a single consolidated power supply that supports all three systems at once. And after a little research, I found two solutions:

  1. The Sega Trio, which is a commercial power supply made specifically to solve this problem at the reasonable cost of $28.99. The only problem seems to be its very limited availability.
  2. Follow what others have done (see this forum post), and build my own custom harness on top of a generic power supply.

So I decided to build my own, and document the build along the way.

The Investigation

The first step was to collect all of the relevant power requirements for each system (thanks to this page for a lot of it). For my own setup, I have a model 2 Genesis, model 2 Sega CD, and a 32X. Here’s what the original power supplies look like:

System                 Voltage  Amperage  Plug Type  Polarity
Sega Genesis Model 2   10V      0.85A     1.7mm      Pos. center
Sega CD Model 2        9V       1.2A      2.1mm      Neg. center
Sega 32X               10V      0.85A     1.7mm      Pos. center

Now the next step is to find a suitable generic power supply. It is important to note that none of these systems actually needs exactly 9 or 10 volts to operate. In fact, the rating on the label is more like the “average voltage” that adapters of that model produce. So a particular one of these power adapters may actually produce anywhere from 8 to 12V.

This is all fine because the first thing each system does is step down that input power to the steady 5 volts that the chips inside are designed to use.

Since it’s very easy to source 9V power supplies, I’ll go with 9V for my generic.

As for the amperage, I need to add up the ratings of each power supply to get a minimum safe value. Amperage ratings show the maximum amperage that the power supplies can safely provide, not the actual amount the systems will necessarily draw. To be safe, OEM manufacturers typically provide power supplies that can offer slightly more amps than they expect their product to draw at full intensity.

So in the original setup, the three supplies add up to 0.85A + 1.2A + 0.85A = 2.9A, which is probably safely above the maximum current I’d expect the three systems to draw at once. Round that up to 3A, and now I’m looking for a 9V, minimum 3A power supply, preferably one that isn’t a wall-wart.

The next consideration is how to get one power supply to have three plugs. So I’m going to need to make or buy some kind of harness to split or daisy-chain three separate plugs. As for the plugs themselves, I have a couple of things to consider.

Generic power supplies typically come standard with 2.1mm positive-center barrel plugs. The Genesis Model 2 and 32X both use the more common positive polarity center; unfortunately, they use the less common EIAJ-03 barrel plugs with a smaller 1.7mm diameter center hole. The Sega CD Model 2 uses the common 2.1mm center hole, but the less common negative polarity center.

So in both cases I’m going to need some adapters.

At this point I have a spectrum of options. On one end is to buy all raw parts and try to solder up a completely custom harness. More extreme, I could even build my own power supply from scratch. On the other end of the spectrum is to try and get away with buying off-the-shelf cables and adapters.

Building from scratch is fine if you have the time, skills and tools. To save money you’ll need to buy parts in bulk, which is great if you’re going to build a bunch, or might use the parts in other projects.

Anyone can buy off-the-shelf parts and just plug them all together. No special skills required. But you will pay a premium.

For this project, I thought it would be interesting to source off-the-shelf parts, just to prove that I could make it work with no soldering required. Yes it probably wouldn’t look as slick, but anyone could replicate it. Also I was feeling lazy. I can always play cable doctor later and transform the off-the-shelf parts into a slicker, more permanent solution.

Bill of Materials

Item                                   Product                                                                          Quantity  Unit Price
9V 3A DC Power Supply                  Output 9V 3A AC/DC Power Supply Cord Charger IEC320 C8 Socket Adapter           1         $14.37
AC Power Cord                          SF Cable, 6ft 18 AWG 2-Slot Non-Polarized Power Cord (IEC320 C7 to NEMA 1-15P)  1         $5.75
3-Way DC Power Cable Splitter          1:3 DC Power Splitter Cable Cord 1 Female to 3 Male 5.5×2.1mm Port Pigtails 12V 1         $3.99
2.1mm to 1.7mm DC Power Adapter Cable  4.8mm x 1.7mm Male Plug to 5.5mm x 2.1mm Female Socket DC Power Adapter Cable   2         $3.59
2.1mm Power Jack Polarity Changer      MODE 2.1MM JACK-PLUG POLARITY CHANGER BLACK 68-102-0                            1         $3.80

Subtotal: $35.09
Shipping: $4.49
Total:    $39.58

First off, I’ll say that I didn’t spend a ton of time sourcing the parts. Just a couple of days casually browsing Amazon and eBay. If I found something that looked like it’d work, I bought it. At $40, this is definitely not the cheapest way to do this, but it’s not much worse than buying three new power supplies at $9.99 each.

The nicest surprise was finding out that the CCTV security camera market supports an industry of DC power cable splitters in a variety of configurations, which was the part I assumed I’d have to make myself.

The Build

Once I had the parts, it was simply a matter of connecting everything. If it’s not straightforward for you, it goes like this:

  1. Connect AC cord to the power supply.
  2. Connect the 3-way splitter to the power supply.
  3. Connect each of the two 2.1mm to 1.7mm adapter cables to a cable coming out of the 3-way splitter (one each for the Genesis and 32X).
  4. Connect the 2.1mm polarity changer to the remaining cable coming out of the 3-way splitter (for the Sega CD).

The end result should look something like this:

Since the plugs have different sizes, it shouldn’t be too hard to figure out which plug connects to each system. For the sake of simplicity, I put a bit of colored tape on each: red for Genesis, blue for CD, yellow for 32X, corresponding with the primary colors of the game boxes for each system.

The power supply works like a charm. I’m free from the three-wart tyranny!

Have questions or comments? Sound off below.

/jon
