< img src = "/uploads/blogs/f3/72/ib-1ilpbdoa5_fc931624.jpg" Alt = "Four artificial intelligence models played Super Mario: who came out on top"/> < p > The AI model Gemini-1.5-Pro was the first to lose a life in the Super Mario video game, researchers said. The other three models proved more skilled.

< P > Researchers at Hao AI Lab at the University of California tested how well four artificial intelligence models play the video game Super Mario. To do this, they built dedicated AI-based gaming agents. One model turned out to survive the longest. The researchers reported the results of the competition on the social network X (Twitter).

< p >

< blockquote class = "twitter-tweet" >< p lang = "en" dir = "ltr" > Claude-3.7 was tested on Pok&eacute;mon Red, but what about more real-time games like Super Mario? We threw AI gaming agents into live Super Mario games and found Claude-3.7 outperformed other models with simple heuristics. < Br />< Br /> Claude-3.5 is also strong, but less capable of &hellip; pic.twitter.com/bqzvblwqx3 </p> &mdash; Hao AI Lab (@haoailab) February 28, 2025 </blockquote> < p >

< p >

< P > In the post, published on February 28, the researchers showed how different AI models play Super Mario. They also explained what makes the task hard: challenges arise in real time, and the program has to react as quickly as possible. Four models took part in the study: Claude-3.7, Claude-3.5, Gemini-1.5-Pro and GPT-4o.

< h2 > AI models and Super Mario </h2> < p >How the AI models performed in the Super Mario video game:

< Ul > < li > Claude-3.7: first place; < li > Claude-3.5: second place. The system had problems with “planning complex maneuvers”;

< li > Gemini-1.5-Pro and GPT-4o: played the worst. </ul>

< P > “We threw AI gaming agents into live Super Mario games and found Claude-3.7 outperformed other models with simple heuristics,” the researchers summed up in the post.

< p >The gameplay video shows how the AI programs performed. A single split screen shows the four systems playing simultaneously. In each panel, a character in a red suit runs through the level and must avoid hazards, collect coins and stay alive. Claude-3.7's panel, in the upper left corner, stays active the longest. Gemini-1.5-Pro fared the worst: its panel is the first to go dark.

< P > In an earlier post, the researchers described other games that AI can play, including 2048 and Tetris. In later posts, they reported on a competition in the game Sokoban. The player's task is to push boxes onto designated target squares. The winner was a different system, o3-mini, which reached the fourth level. The others did somewhat worse: Claude-3.7-Thinking stopped at the second level, DeepSeek-R1 at the first, and Gemini-2.0-Flash-Thinking did not clear any.

< p >

< blockquote class = "twitter-tweet" > < P lang = "en" dir = "ltr" > You might have heard top reasoning models now match AIME gold medalists in 2025, but watch them crumble in box-pushing Sokoban (倉庫番). < Br />< Br /> Again, we put top reasoning models into the game, o3-mini (medium) took the crown, reaching level 4 before tangled with &hellip; pic.twitter.com/ajbcatmktq </p> &mdash; Hao AI Lab (@haoailab) March 6, 2025 </blockquote> < p >

< P > In May 2023, Defense Brief reported on another study involving video games. Its subject was a sea lion named Spike. The US Navy trained the animal to play games that required it to track a moving cursor. The sea lion managed the task, and it can probably go on to be trained to detect mines.

Natasha Kumar

