llm-food-grab-game

🤖 Console Game For LLMs

Because it’s Friday. And because games are fun… I built a console game for my LLMs to play against each other in a kind of turn-based strategy challenge. It’s a bit goofy but at the same time quite instructive (though not in the way I hoped it would be).

The LLMs I’ve tried so far are being consistently beaten by a pretty simple bot. I ran a tournament between some of my favorite local models and two bots (see results below), and LLMs performed “average” at best.

I would love to hear your thoughts and get your help from this community because, frankly, I’m winging this and could use some smarter minds.

🎮 The Game: “Food Grab”

Game repo: https://github.com/facha/llm-food-grab-game.git

Premise: A 10x10 grid. Two players. One piece of food. First to reach the food gets the point. The game runs for a set number of rounds.

[screenshot: game fragment]

Each turn, the model is prompted with the current board state and must reply with the coordinates of its next move. It’s simple, and I expected the models to do quite well. But as it turned out, a basic bot beats them consistently.
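
For what it’s worth, handling such a reply can be as simple as parsing it as a Python literal and checking it against the precomputed valid moves. A sketch of that idea (not necessarily how the repo does it):

import ast

def parse_move(reply, valid_moves):
    """Parse a model reply like '[2,3]'; reject anything that isn't a legal move."""
    try:
        move = ast.literal_eval(reply.strip())  # safe evaluation of the literal
    except (ValueError, SyntaxError):
        return None
    if (isinstance(move, list) and len(move) == 2
            and all(isinstance(c, int) for c in move)
            and tuple(move) in valid_moves):
        return tuple(move)
    return None  # malformed or illegal move; the caller decides what to do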

Players: four local LLMs (phi4, qwen3, gemma3, mistral-small3.1) and two bots (smart_bot and silly_bot).

I assumed the LLMs would outperform the Smart Bot. The winning edge seems easy: during rounds where the opponent is closer to the food cell, position yourself well for the upcoming round.

But no LLM nailed that part. They also struggled to do what the Smart Bot does: simply get closer to the food with every move, no matter what.
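
To make both ideas concrete, here is a sketch in Python: greedy_move is my reading of the Smart Bot’s strategy (step to whichever legal cell minimizes Chebyshev distance to the food, since diagonal moves count as one step), and positional_move is the kind of play I hoped the LLMs would discover, drifting toward the center when the race is already lost because the next food spawns at a random cell. Both functions are illustrations, not the repo’s actual code:

def chebyshev(a, b):
    # Diagonal moves are allowed, so distance on the grid is the Chebyshev metric.
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def greedy_move(valid_moves, food):
    # Smart-Bot-style play: always step to the cell nearest the food.
    return min(valid_moves, key=lambda m: chebyshev(m, food))

def positional_move(me, opponent, food, valid_moves):
    if chebyshev(opponent, food) < chebyshev(me, food):
        # Race is lost: head for the center of the 10x10 board to minimize
        # expected distance to the next random food spawn.
        return min(valid_moves, key=lambda m: chebyshev(m, (4, 4)))
    return greedy_move(valid_moves, food)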

To end on a high note, LLMs did well against Silly Bot. Yay?

🧪 The Tournament

Each player played 200 games (100 as first player, 100 as second player) against every other participant.
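
To make the schedule concrete, here is roughly what that round-robin looks like in Python (play_game is a hypothetical helper that runs one game and returns the winner’s name; the actual script may differ):

from itertools import permutations

players = ["smart_bot", "phi4", "qwen3", "gemma3", "mistral-small3.1", "silly_bot"]
wins = {p: 0 for p in players}

# permutations() yields every ordered pair, so each matchup is played
# 100 times with each side as the first mover (200 games per pair).
for first, second in permutations(players, 2):
    for _ in range(100):
        wins[play_game(first, second)] += 1  # play_game() is a hypothetical helper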

📊 Score Matrix (row player’s wins out of the 200 games against each column opponent)

Player            smart_bot  phi4  qwen3  gemma3  mistral-small3.1  silly_bot
smart_bot         100        122   116    129     129               197
phi4              78         100   99     108     121               191
qwen3             84         101   100    101     114               192
gemma3            71         92    99     100     101               189
mistral-small3.1  71         79    86     99      100               186
silly_bot         3          9     8      11      14                100

🏆 Rankings by TrueSkill Algorithm

Rank  Player                                    μ      σ     Exposed Score (μ−3σ)
1     smart_bot                                 39.21  4.65   25.25
2     phi4:14b-q8_0                             28.31  3.31   18.37
3     gemma3:27b-it-q8_0                        24.53  2.75   16.30
4     qwen3:32b-q8_0                            27.12  3.71   16.01
5     mistral-small3.1:24b-instruct-2503-q8_0   20.67  2.97   11.74
6     silly_bot                                 11.41  5.37   -4.70
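
For the curious, numbers like these are what the trueskill PyPI package produces if you feed it each game as a 1-vs-1 result; the “exposed” score defaults to μ−3σ, a conservative skill estimate. A sketch (not necessarily the exact script; `results` is a hypothetical list of per-game outcomes):

import trueskill  # pip install trueskill

players = ["smart_bot", "phi4", "qwen3", "gemma3", "mistral-small3.1", "silly_bot"]
ratings = {p: trueskill.Rating() for p in players}

# `results` is a hypothetical list of (winner, loser) names, one entry per game.
for winner, loser in results:
    ratings[winner], ratings[loser] = trueskill.rate_1vs1(ratings[winner], ratings[loser])

# Rank by the conservative "exposed" score, mu - 3*sigma by default.
ranked = sorted(ratings.items(), key=lambda kv: trueskill.expose(kv[1]), reverse=True)
for rank, (name, r) in enumerate(ranked, start=1):
    print(f"{rank}  {name:<16}  mu={r.mu:.2f}  sigma={r.sigma:.2f}  exposed={trueskill.expose(r):.2f}")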

🏆 Rankings by Total Wins

Rank  Player                                    Total Wins
1     smart_bot                                 793
2     phi4:14b-q8_0                             697
3     qwen3:32b-q8_0                            692
4     gemma3:27b-it-q8_0                        652
5     mistral-small3.1:24b-instruct-2503-q8_0   621
6     silly_bot                                 145

🤖 About that Smart Bot…

As you’ve probably noticed, smart_bot is the absolute winner. I’ve tried various things, but no model was able to consistently outplay it. In fact, it used to be called just bot. I renamed it to smart_bot in honor of the fact that it keeps beating those multi-gigabyte monsters trained for months on huge GPU clusters.

Three models (gemma3:27b-it-q8_0, phi4:14b-q8_0, qwen3:32b-q8_0) showed similar results, and mistral-small3.1:24b-instruct-2503-q8_0 performed somewhat worse. phi4:14b-q8_0 was a positive surprise: it’s the smallest model of the bunch, and in the past it has occasionally been accused of including benchmark data in its training set, which could make it look better on popular benchmark rankings than it actually is.

🧠 Observations & Questions

Prompting

Here is a sample of the prompt the script is currently using:

Your coordinates: [2,1]
Food coordinates: [0,1]
Your valid moves are: [1,0] [1,1] [1,2] [2,0] [2,1] [2,2] [3,0] [3,1] [3,2]
Make a move towards food. 
Respond ONLY with the new coordinates as a Python list, e.g.: [2,3]
Do NOT include any extra text or explanation.

It is very direct and concise, and it worked best. What I actually don’t like about it is that it tells the model what to do instead of leaving it up to the model to figure out.

I’ve tried other options. For example, here is a more general prompt that explains the game rules and provides the game state, but leaves the decisions up to the model.

You are an AI player in a turn-based grid game. The board is 10x10.
There are two players: 0 and 1. You are Player 1. Your goal is to move to the food cell (f) to score points. 

The valid moves are to adjacent squares (including diagonals) that are inside the board and not occupied by the other player.
On your turn you can move one cell in 8 directions: up, down, left, right, up-right, up-left, down-right, down-left (as coordinates).

Once one of the players reaches the food cell, he scores a point. Then the game continues: food respawns at a new random cell.
The players have to reach the food cell again from their current coordinates.

Here is the current game state:
   0 1 2 3 4 5 6 7 8 9
  +-------------------+
0|   0                |
1|                    |
2|                    |
3|                    |
4|                    |
5|                    |
6|                 f  |
7|                    |
8|                    |
9|                   1|
  +-------------------+
Current Score: [0, 0]
Player 0 coordinates: 1,0
Player 1 coordinates: 9,9
Food coordinates: 8,6

Your valid moves are: 8,8 8,9 9,8 9,9

Respond ONLY with the new coordinates as a Python list, e.g.: [2, 3]
Do NOT include any extra text or explanation.

Another prompt I’ve tried: I asked an online LLM how I could improve my previous prompt, and this is what I got back. It didn’t improve the results.

Game Board (10x10):

   0 1 2 3 4 5 6 7 8 9
  +-------------------+
0|   0                |
1|                    |
2|                    |
3|                    |
4|                    |
5|                    |
6|                 f  |
7|                    |
8|                    |
9|                   1|
  +-------------------+
Current Score: [0, 0]
Player 0 coordinates: 1,0
Player 1 coordinates: 9,9
Food coordinates: 8,6


Valid Moves for Player 1: 8,8 8,9 9,8 9,9

Instructions:
- You are Player 1, playing on a turn-based grid.
- You can move 1 cell in any of the 8 directions.
- Avoid moving to the other player's position.
- Score by reaching the food (f), which will then respawn.
- Reply ONLY with the coordinates of your move as a Python list. Example: [3, 4]
- DO NOT include any explanation or text.

If you have any suggestions, please let me know.


Model Parameters

Here’s what the script is currently using:

payload = {
    "temperature": 1.2,
    "top_p": 0.95,
    "top_k": 20,
    "min_p": 0.0,
    "max_tokens": 10,
    "model": "whatever",
    "messages": [
        {"role": "system", "content": "You are an AI playing a grid game. Only respond with a single move in the form [x,y]. /no_think"},
        {"role": "user", "content": prompt},  # `prompt` is the text shown above
    ],
}
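
For completeness, a minimal sketch of how a payload like this might be sent to an OpenAI-compatible chat endpoint such as the one Ollama exposes (the URL and response handling here are assumptions, not necessarily what the script does):

import requests

def get_move(payload, url="http://localhost:11434/v1/chat/completions"):
    # Standard OpenAI-style chat completion: the move comes back as message content.
    resp = requests.post(url, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"].strip()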

Again, if you think some of these parameters are worth changing, or there are others worth trying, please let me know.

🔗 GitHub: llm-food-grab-game