Mzinga is my attempt at an open-source AI for the board game Hive. So far in this series of posts, I’ve given an overview of the project, a dive into Mzinga’s AI, and then a deeper dive into Mzinga’s board evaluation function. All together, I have quite the framework for executing a decent Hive AI, but the question remains: How do I make it better?
How do people get better at games? Study and competition. Trial and error. Strategies that win us games are remembered and repeated while plays that lose us games are remembered so that we don’t repeat them. But in a game with so many things to consider, where there is no “right” move each turn, you can’t just memorize the answer, you have to actually play.
What are the things to consider? The metrics I talked about in the last post, like how many moves are available and what pieces are surrounded or pinned. Mzinga looks at roughly 90 different metrics. For each there is a corresponding weight that measures how much that metric should impact the board’s score.
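To make the idea of weights concrete, here is a minimal sketch in Python. It is not Mzinga's actual code: it assumes the board score is a plain weighted sum, and the metric names and numbers are made up for illustration.

```python
def score_board(metrics, weights):
    """Weighted sum of metric values; a metric with no weight contributes 0."""
    return sum(weights.get(name, 0.0) * value for name, value in metrics.items())


# Hypothetical metrics for one board position (names invented for this sketch).
metrics = {"valid_moves": 12, "pieces_pinned": 2, "queen_neighbors": 3}

# Hypothetical weights: positive means "good for me", negative means "bad for me".
weights = {"valid_moves": 0.5, "pieces_pinned": -1.0, "queen_neighbors": -4.0}

print(score_board(metrics, weights))  # 6.0 - 2.0 - 12.0 = -8.0
```

Change the weights and the same board gets a different score, which is exactly why two AIs looking at the same position can prefer different moves.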
How an AI plays is largely determined by its metric weights. AIs with different weights will make different decisions on which move to play. Creating the best AI, then, is an exercise in determining the best metric weights. Since we don’t know what the best numbers are, our best option is to:
- Create a bunch of AIs with different metric weights
- Have them play a ton of games against one another
- Pick the one that proves that it’s the best
To pick the best one, I need a way of rating each AI. So I followed in chess’s footsteps and adopted the Elo rating system. It does a few things I like:
- Each player has only one number to keep track of, so it’s easy to see who’s the “best”
- A player’s rating is just an estimate of how strong the player is:
  - It goes up and down as they win and lose and their rating is recalculated
  - Points are taken from the loser and given to the winner, so there’s a finite amount of “rating points” to go around
  - More games mean a more accurate estimate of their strength, so no one game can make or break a player’s rating
- When recalculated, a player’s rating takes into consideration who was playing:
  - A good player beating a bad player is not news, so the winner only takes a few points from the loser, i.e. players can’t easily inflate their ratings by beating awful players
  - A bad player beating a good player is an upset, so to reflect that the ratings were wrongly estimated, the winner gets lots of points from the loser
  - Two players with similar ratings are expected to be evenly matched, so if one wins, a medium amount of points is transferred to separate their ratings a little more accurately
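The behavior in the list above falls out of the standard Elo formulas. Here is a minimal sketch in Python, not Mzinga's actual code. The logistic curve with a scale of 400 is the usual Elo definition; the K-factor of 32 (the most points a single game can move) is just a common choice and an assumption here.

```python
def expected_score(rating_a, rating_b):
    """Expected result for player A against player B, between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.5 for a draw, and 0.0 if A lost.
    Whatever A gains, B loses, so the total number of points never changes.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Evenly matched players: the winner takes a medium amount (16 points).
print(update_ratings(1500, 1500, 1.0))

# A strong player beats a weak one: only about 3 points change hands.
print(update_ratings(1800, 1400, 1.0))

# The weak player pulls off an upset: about 29 points change hands.
print(update_ratings(1400, 1800, 1.0))
```

A bigger K makes ratings react faster but also makes them noisier, so it is one of the knobs worth experimenting with.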
Now the Elo system isn’t perfect. Its biggest flaw is that it only shows the relative strength of the players involved: you need lots and lots of players participating for the ratings to mean anything. There are lots of criticisms and variants when talking about real people playing, but it’s fine for what we need it for.
So now we have a method for finding the best player in a population of AIs. Create a bunch of AIs, have them fight one another to improve the accuracy of their Elo rating, and pick the one with the highest rating. But like I said earlier, Elo ratings only show relative strength in a given population. What if I don’t have a good population? What if 99% of them are so awful that an AI with some basic strategy completely dominates? That’s no guarantee that the winning AI is actually all that good.
How do we improve a population of players so that we can be sure that we’re getting the best of the best?
It turns out we’re surrounded by an excellent model on how to make a population of something better and better: natural selection and evolution.
In the real world, living things need to fight to survive. The creatures that can’t compete die off, while those with the best traits survive long enough to pass on their traits to their offspring. Offspring inherit a mix of their parents’ traits, but they’re more than the sum of their parts. New DNA comes into the population as new members join or mutations occur.
We can absolutely simulate evolution in our population of AIs. The life-cycle goes something like this:
1. Create a bunch of different AIs with different weights (traits)
2. Have them all fight each other to sort out a ranking of who’s the strongest
3. Kill off the weakest AIs
4. Have some of the remaining AIs mate and produce new AIs to join the population
5. Repeat 1-4
Now the first three steps seem pretty straightforward, but AI mating? It’s actually not that hard to implement. For each metric weight, simply give the offspring the average of its two parents’ values. To make sure that an AI’s children aren’t all identical, add a little variance, or mutation, by tweaking the result a little bit.
For example, if parent A has a particular weight of 5, and parent B a weight of 3, rather than giving every child the weight of 4, give them each something random between 3.9 and 4.1. Just like in real life, we don’t necessarily know which traits were the most important in making sure that the parent survived, and we don’t get to pick which traits get passed on. So we pass on everything, good and bad, and let life (in this case lots of fighting) determine who’s better overall.
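Here is that mating rule as a minimal Python sketch, not Mzinga's actual code. It assumes each AI is just a list of weights in the same order, and that the mutation is a uniform random nudge of at most 0.1 in either direction, which matches the 3.9 to 4.1 example above.

```python
import random


def mate(parent_a, parent_b, mutation=0.1, rng=random):
    """Average each pair of parent weights, then nudge the result by up to +/- mutation."""
    return [
        (a + b) / 2 + rng.uniform(-mutation, mutation)
        for a, b in zip(parent_a, parent_b)
    ]


# Parent A has a weight of 5 and parent B a weight of 3, so every child
# lands somewhere between 3.9 and 4.1, and siblings rarely match exactly.
for _ in range(3):
    print(mate([5.0], [3.0]))
```

The size of the mutation is a trade-off: too small and the children are near-copies of the parents' average, too large and they stop resembling their parents at all.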
Now we can start a new generation and have them all start fighting again, so we can suss out everyone’s rankings in the presence of these (presumably better) offspring. Add in some new AIs every now and then to prevent too much inbreeding (where one AI’s traits, good and bad, start to dominate and everyone starts looking more and more like clones of one another) and we now have a true population of AIs, all fighting to be the best.
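Putting the whole life-cycle together, here is a rough sketch in Python. It is not the Mzinga Trainer's actual code, and nearly every number in it is an assumption: the starting rating of 1200, the K-factor of 32, culling half the population, and adding one newcomer per generation. Most importantly, `toy_game` is a placeholder so the sketch can run on its own; it does not play Hive.

```python
import itertools
import random

START_RATING = 1200.0  # assumed starting rating for every new AI


def new_ai(rng, num_weights=5):
    """A random AI. Mzinga has roughly 90 weights; 5 keeps the sketch small."""
    return {"weights": [rng.uniform(-10, 10) for _ in range(num_weights)],
            "rating": START_RATING}


def record_win(winner, loser, k=32.0):
    """Standard Elo transfer of points from the loser to the winner."""
    expected = 1.0 / (1.0 + 10 ** ((loser["rating"] - winner["rating"]) / 400.0))
    points = k * (1.0 - expected)
    winner["rating"] += points
    loser["rating"] -= points


def mate(a, b, rng, mutation=0.1):
    """Average each pair of weights, then nudge the result a little."""
    return {"weights": [(x + y) / 2 + rng.uniform(-mutation, mutation)
                        for x, y in zip(a["weights"], b["weights"])],
            "rating": START_RATING}


def run_generation(population, first_player_wins, rng, cull=0.5, newcomers=1):
    size = len(population)
    # Fight: every AI plays every other AI once.
    for a, b in itertools.combinations(population, 2):
        if first_player_wins(a, b):
            record_win(a, b)
        else:
            record_win(b, a)
    # Cull: keep the top-rated AIs (at least two, so there is a pair to mate).
    ranked = sorted(population, key=lambda ai: ai["rating"], reverse=True)
    survivors = ranked[:max(2, int(size * (1 - cull)))]
    # New blood: a few random AIs to slow down inbreeding.
    room = size - len(survivors)
    fresh = [new_ai(rng, len(survivors[0]["weights"]))
             for _ in range(min(newcomers, room))]
    # Mate: refill the population with children of random survivor pairs.
    children = []
    while len(survivors) + len(fresh) + len(children) < size:
        mom, dad = rng.sample(survivors, 2)
        children.append(mate(mom, dad, rng))
    # Survivors keep their ratings; newcomers and children start fresh.
    return survivors + fresh + children


def toy_game(a, b):
    """Placeholder, NOT Hive: the AI with the larger weight total 'wins'."""
    return sum(a["weights"]) > sum(b["weights"])


rng = random.Random(42)
population = [new_ai(rng) for _ in range(8)]
for _ in range(10):
    population = run_generation(population, toy_game, rng)
best = max(population, key=lambda ai: ai["rating"])
print(round(best["rating"]), best["weights"])
```

In a real run, `toy_game` would be replaced by two AIs actually playing a game of Hive, which is where nearly all of the time goes. The `cull` and `newcomers` parameters are the "harshness" and "new blood" dials.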
Now, how exactly am I doing all of this?
With the Mzinga Trainer.
It’s a simple, yet highly configurable little tool for creating, maintaining, and evolving a population of AIs. I started with several seeds of simple AI weights that I handcrafted, plus a slew of randomly generated ones. Then I set up different populations on different computers with different parameters, and have had them fighting, dying, and mating with each other for over a week.
It has made Mzinga into one of my more interesting projects. As I’ve made improvements to the tool, I’ve spun up new populations, mixed existing ones, and run life-cycles with different parameters. Some run under really harsh conditions, where so many get killed that the re-population comes from the offspring of very few survivors. When I started noticing that one small population had become so inbred that its members looked like clones of one another, I added in some new blood, so to speak. Then I reduced the harshness to give different AIs, and ultimately different play styles and strategies, a chance to survive and procreate without getting mercilessly killed off.
It’s an ongoing process, and like life itself, something that takes a lot of time to happen.
The Mzinga Trainer tool is included with the install of Mzinga. Now, not only can you try your hand against the Mzinga AI I’ve created, but you can even try to make a better AI of your own. So come on and try out Mzinga today!