The Kingen AI has been cheating

Barbu King - also known as Kingen in Dutch - has been available on Whisthub for quite some time already. On April 1st, 2024, it will be exactly two years since it was launched - and that was not a joke! In that time, Kingen has become one of the most loved card games on Whisthub, next to the all-time classics Colour Whist and Manille.

Recently, however, I got an email from someone telling me that the AI was cheating. "That's impossible!" I thought. I wrote the AI myself, and I very much did not program it to cheat. The AI always plays with the same information a human player would have: it does not know which cards the other players are holding. "The user must have misread the situation; the AI hasn't changed in all these years, so it would be strange if nobody had ever reported this!" was my natural reaction.

Nevertheless, my curiosity was piqued, so I tried to reproduce the cheating behavior that the user reported. And, oh boy, was I surprised. It turns out that the Kingen AI was indeed cheating, without me even knowing! Does this mean that computers have finally become conscious, now that AI seems to be everywhere? Well, no, but let's find out what happened!

The cheating happened in a game situation that looks as follows:

Bert chose to play a trumps game and picked the trump suit, and Chris has played a 10 on top of Rachel's 2. We hold the K, Q and 8, so ideally we'd play our 8, because Bert is probably holding the A: he's the one who picked the trump suit. That is, for example, what you would normally do in Colour Whist. However, according to the rules of Kingen, we must overtrump if possible, so we are required to play either the K or the Q.
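The must-overtrump rule for the situation above can be sketched in a few lines. This is a simplified illustration, not Whisthub's actual code: cards are just numeric ranks in the trump suit, with higher beating lower.

```javascript
// Which trump cards may we play when the trump suit was led?
// We must overtrump the highest trump on the table if we can.
function legalTrumpCards(trumpsInHand, highestTrumpPlayed) {
  let overtrumps = trumpsInHand.filter(rank => rank > highestTrumpPlayed);
  return overtrumps.length > 0 ? overtrumps : trumpsInHand;
}

// Holding K (13), Q (12) and 8 against Chris' 10:
legalTrumpCards([13, 12, 8], 10); // → [13, 12], the 8 is not allowed
```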

This is where it goes wrong with the AI. If the AI ends up in this situation, it still plays the 8, even though that's not allowed by the rules! At first I was quite surprised by this. After all, as you can see, the 8 is correctly grayed out, and as a human player you are not able to play it. So where does the difference come from?

First of all, you have to understand that the server makes a distinction between moves made by humans and moves made by the AI. For human moves, it's not sufficient that illegal cards are grayed out in the interface: someone with a technical background could still send an illegal move to the server directly, and there's nothing that can be done to prevent that. Therefore, every human move that arrives at the server is first checked to ensure it is not illegal. The code for this on the server looks a bit like this:

// Find the player and move from the incoming request.
let player = game.findPlayer(request);
let { move } = request.body;

// If the move is invalid, throw an error.
if (!game.isValidMove(move)) {
  throw new Error(`Invalid move ${move}!`);
}

// No error? Cool, carry out the move.
game.play(player, move);
AI moves are different, however. We have full control over how the AI plays, so we can just program it to follow the rules. This eliminates the need to verify the AI's moves, which makes the AI faster.
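Conceptually, the dispatch on the server looks something like this. It's a sketch with hypothetical names rather than the real implementation, but it shows the crucial asymmetry: only human moves pass through the check.

```javascript
// Human moves are validated; AI moves are trusted as-is.
function handleMove(game, player, move) {
  if (player.human && !game.isValidMove(move)) {
    throw new Error(`Invalid move ${move}!`);
  }
  game.play(player, move);
}

// Toy stand-in for the game object, for illustration only.
let played = [];
let game = {
  isValidMove: move => move !== '8',
  play: (player, move) => played.push(move),
};

// A buggy AI move sails straight through, while the same
// move from a human would throw.
handleMove(game, { human: false }, '8'); // played is now ['8']
```

With this setup, an illegal AI move reaches `game.play()` unchecked - which is exactly how the cheating went unnoticed.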

While not really intentional, there's actually another benefit to not verifying the AI's moves: if there's a bug in the AI and it picks an illegal card, the game does not crash, but just silently carries on. If you think about it, that's actually a good thing! Imagine if this hadn't been the case: the game would hang whenever the AI accidentally picked an invalid card, because we can't tell the AI to just "try again". The AI plays deterministically, so it would always pick the same - illegal - card again!

It also turns out that the bug does not always happen. If the hand had been

or in other words, only one card that overtrumps Chris' 10, then the AI nicely plays the Q, as it should. The reason for this is that the AI has a shortcut: if there's only one valid option to play, then we just pick that option. There's no need to run through all of the AI logic when there's only one valid option, and it makes the AI play a lot faster. It probably also explains why the bug remained undetected for so long: oftentimes you will have only one card that overtrumps what has been played, and in that case the AI does follow the rules thanks to the shortcut.
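In isolation, the shortcut looks something like this. It's a hypothetical sketch, with the expensive evaluation passed in as a function:

```javascript
// If there's only one legal card, play it; otherwise run the full
// (and much slower) AI evaluation over the options.
function pickCard(options, evaluate) {
  if (options.length === 1) return options[0];
  return evaluate(options);
}

// With only the Q as a legal card, the evaluator never runs:
pickCard(['Q'], options => { throw new Error('not reached'); }); // → 'Q'
```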

It's only when there are multiple cards that can overtrump that the bug happens. The reason is that the logic for deciding what to play in situations like these is largely based on the Colour Whist AI. In Colour Whist, the rules are a lot less strict: basically the only rule is that you have to follow suit.

This means that the logic to decide what to play in this case did not take into account that there might be restrictions on what can be played, and the code just evaluated all cards in the suit that was led. Something like this:

// Rank the cards and then pick the "best" one to play.
function decide(player, trick) {
  let { suit } = trick;
  let cards = player.hand.suit(suit);
  let sorted = cards.sort(rankCards);
  return sorted[0];
}

This was an error on my side, and I should've caught it when testing the AI. Luckily Whisthub has been updated, and the bug is now gone. The AI is no longer allowed to cheat!
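Conceptually, the fix amounts to running the ranking logic only over the cards that are legal under the Kingen rules, instead of every card in the led suit. Here's a simplified sketch with toy helpers - the real ranking logic is of course more involved, and these names are stand-ins, not the actual Whisthub code:

```javascript
// Toy setup: cards are numeric ranks in the led (trump) suit, and the
// "best" card here is simply the lowest one that is still legal.
const rankCards = (a, b) => a - b;

// Must overtrump the highest trump played if possible.
function getValidCards(hand, highestPlayed) {
  let overtrumps = hand.filter(rank => rank > highestPlayed);
  return overtrumps.length > 0 ? overtrumps : hand;
}

// Fixed decide(): rank only the legal cards, not the whole suit.
function decide(hand, highestPlayed) {
  let sorted = getValidCards(hand, highestPlayed).sort(rankCards);
  return sorted[0];
}

// K (13), Q (12) and 8 against Chris' 10: the AI now plays the Q.
decide([13, 12, 8], 10); // → 12
```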

Now I know what some of you are going to say:

Just get rid of this silly rule that you have to overtrump and just let us play 8! It would have prevented this bug.

I understand, and honestly, I prefer this too, but I've decided to use the exact same rules as those used by IWWAIWWA for the sake of consistency and uniformity. You can find and read them again here. If they ever decide to change this aspect of the rules, I will happily follow suit - no pun intended - but until then, you will have to live with it. After all, the rules are the same for everyone, now including the AI!

So, what did I learn from all of this? Well, the most important takeaway is that, when developing the AI for a new game, I need to make sure that the AI's moves are verified against the game rules when running simulations. I will probably write another blog post in the future about how the AI is developed, but running simulations - meaning the AI playing games against itself - is a fundamental part of it. It's also how I verified that the cheating bug is no longer present: after simulating 200,000 games, every single move made by the AI followed the rules.
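The guard in the simulation loop can be as simple as this sketch (all names are hypothetical): every AI move is checked against the rules, and an illegal move aborts the run immediately instead of silently slipping through.

```javascript
// Simulate a single game, verifying every AI move along the way.
function simulateGame(game) {
  while (!game.isFinished()) {
    let player = game.currentPlayer();
    let move = game.pickAiMove(player);
    if (!game.isValidMove(player, move)) {
      throw new Error(`AI cheated: ${move} is not a legal move!`);
    }
    game.play(player, move);
  }
}
```

Running 200,000 of these back to back without a single error being thrown is what gives the confidence that the AI now plays by the rules.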

Humanity vs AI: the humans still have it!