Monday, July 9, 2012

Trivial and Simple Diplomacy Bots


Before I get stuck in to building the next world Diplomacy champion, it seems like a good idea to have a think about some of the more trivial bots that can be written to play diplomacy. The following notes cover the basic bots listed on the DIADE Clients page.

From there I couldn't resist extrapolating to variants on these bots that should be easy to knock out. Probably once I start coding on this project, my first goal will be to write a bot that can copy all the behaviours in this article.


HoldBot

The simplest bot of all, this bot just issues hold orders for all units. Any unit that is dislodged is disbanded. Then in the build phase if any of its home supply centers are empty it rebuilds on them. Thats it! Obviously HoldBot provides probably the most boring game of Diplomacy possible, and should present no challenge to any but the most trivial of bots. It can never win the game!

SupportHoldBot

It seems to me that the next logical step up for this bot is to add the ability to support hold other units. The simplest implementation of this would be to find all the adjacent regions containing a friendly unit, then issue a support hold order for one adjacent unit at random. If no adjacent regions contain friendly units, then just issue a hold order as usual.

RandBot

What would happen if you just chose any order at random? Randbot is here to find out! This crazy guy chooses an order for each unit by randomly selecting from the list of all possible orders. Hijinks ensue, and I assume that the performance is awful :) Then again, at least this bot has a better chance of winning than a HoldBot. I imagine that any game won by RandBot must take a very long time.

ConsistentRandBot (ConsBot)

RandBot might move an army by convoy, but not choose to convoy the army with an appropriate fleet. Or support a move/hold in a region where no unit exists. ConsBot adds additional logic to ensure that all orders given are consistent with each other and the current game state. I don't know the exact algorithm in existing implementations, but I don't think it is trivial to implement this in a way that retains pure randomness.

I guess the easiest way would be to choose the sequence of units to generate orders for at random, and filter all its options to those that have a possibility of being consistent. Some outcomes would dictate the orders required for other units; in this case that unit can be allocated its order and be removed from the sequence. Then we move on to the next unit in the sequence and repeat. One option to consider in implementation is whether we consider RandBot consistent if it is supporting, or convoying through, units controlled by another player.

SmarterBot

Although random orders let RandBot have fun exploring the game's state space, Diplomacy is a game with a simple tactical objective - take control of more supply centers. SmarterBot has been modified to prefer moves in Autumn that end with the unit in an unoccupied supply center that is not under its control. I guess it chooses from all such options randomly if any exist. I imagine that this simple tweak greatly improves the bot's performance.

DumbBot

In spite of its name, DumbBot is a lot more advanced than the other bots described above. It actually has a good shot at winning a Diplomacy game; well against other bots anyway. DumbBot uses a ranking function (see the algorithm here) on regions, in order to modify the probability that it will select any given mode.

The heuristic used looks at the size of the nearest opponent to a supply centre, or its owner, and also takes into account the ranking of neighbouring regions.  Regions without supply centers gain value only from their neighbours. Once the target region is determined, DumbBot moves the unit there. However if the region already contains a friendly unit, or one is already moving to the region, then DumbBot will support/support hold if that guarantees a success. Otherwise it will move elsewhere, probably removing that region from the options list and choosing again.

A random mechanism is still used to determine which region to move to, but higher ranked moves will be more likely selected then those less favoured. This randomness stops the bot from playing in a predictable fashion, and prevents it from getting stuck in a scenario where it just beats its head against the same wall over and over. I guess it simulates how a human player can try different tactics in order to get around their opponent.

Of course it is easy to replace the ranking function in DumbBot with another of our own devising. Many different algorithms could be constructed and tested against each other in the field of battle. The most trivial would probably be something like the inverse of the region's distance from a friendly unit or region.

DumbSupportHoldBot

Going back to SupportHoldBot, we can step up the complexity, by providing a ranking function to prioritise  the adjacent unit(s) that most need supporting. The bot could either choose the unit with the highest ranking, or use the rankings as a weighting like DumbBot, whilst choosing randomly from all available options.

Obviously the number of adjacent enemy units would heavily influence any ranking function. A more complicated algorithm would try and determine the probability that supporting that region would make a different to combat results; it might be better to support the region with two adjacent enemies and abandon a region with three. Finally our bot would benefit from supporting regions with supply centers over regions without. Note that HoldBot units only ever exist on supply centers, but it doesn't hurt to make our algorithm robust enough to be portable to other Bots.

Final Thoughts

We could invest a great deal of time designing and balancing the perfect ranking function for our bots, but that sounds more like problem solving than artificial intelligence programming. Personally I am more interesting in building a bot that is able to evolve its own heuristics to prioritise available options. That to me is what is exciting about intelligent agent development. I'd let my bots try many different strategies, and learn from the outcomes to calculate its own optimal heuristics and weightings.

An alternative approach would be to get access to the vast caches of diplomacy game results from online gameplay. I could sit back and let the bot study these records in order to optimise its behaviour. This option has the advantage of providing a lot of real world data, but the disadvantage is that the bot did not actually play in the game, so the behaviours contained within do no reflect how people might react to the bots actual behaviour in a game.

No comments:

Post a Comment