Google DeepMind trains a video game-playing AI to be your co-op companion
13.03.2024 - 14:05
/ techcrunch.com
AI models that play games go back decades, but they generally specialize in one game and always play to win. Google DeepMind researchers have a different goal with their latest creation: a model that has learned to play multiple 3D games like a human, and that also does its best to understand and act on your verbal instructions.
There are, of course, “AI” or computer-controlled characters that can do this kind of thing, but they’re more like features of a game: NPCs that you control indirectly through formal in-game commands.
DeepMind’s SIMA (Scalable Instructable Multiworld Agent) doesn’t have any kind of access to the game’s internal code or rules; instead, it was trained on many, many hours of video showing gameplay by humans. From this data, and the annotations provided by data labelers, the model learns to associate certain visual patterns with actions, objects, and interactions. The researchers also recorded videos of players instructing one another to do things in game.
For example, it might learn from how the pixels move in a certain pattern on screen that this is an action called “moving forward,” or that when the character approaches a door-like object and uses the doorknob-looking object, that’s “opening” a “door.” These are simple things: tasks or events that take a few seconds but involve more than just pressing a key or identifying something.
The training videos were recorded in multiple games, from Valheim to Goat Simulator 3, whose developers were involved in and consented to this use of their software. One of the main goals, the researchers said in a call with press, was to see whether training an AI to play one set of games makes it capable of playing others it hasn’t seen, a process called generalization.
The answer is yes, with caveats. AI agents trained on multiple games performed better on games they hadn’t been exposed to. Of course, many games involve specific and unique mechanics or terms that will stymie even the best-prepared AI. But there’s nothing stopping the model from learning those except a lack of training data.
This is partly because, although there is lots of in-game lingo, there are only so many “verbs” players have that actually affect the game world. Whether you’re assembling a lean-to, pitching a tent, or summoning a magical shelter, you’re really “building a house,” right? So this map of several dozen primitives the agent currently recognizes is interesting to peruse:
A map of several dozen actions SIMA recognizes and can perform or combine.
The researchers’ ambition, beyond advancing the fundamentals of agent-based AI, is to create a more natural game-playing companion than the stiff, hard-coded ones we have today.
“Rather than having a superhuman agent you play against, you