AGI Is Coming… Right After AI Learns to Play Wordle
Published in arXiv Preprints, 2025
This paper investigates multimodal agents, in particular, OpenAI's Computer-User Agent (CUA), trained to control and complete tasks through a standard computer interface, similar to humans. We evaluated the agent's performance on the New York Times Wordle game to elicit model behaviors and identify shortcomings. Our findings revealed a significant discrepancy in the model's ability to recognize colors correctly depending on the context. The model had a 5.36% success rate over several hundred runs across a week of Wordle. Despite the immense enthusiasm surrounding AI agents and their potential to usher in Artificial General Intelligence (AGI), our findings reinforce the fact that even simple tasks present substantial challenges for today's frontier AI models. We conclude with a discussion of the potential underlying causes, implications for future development, and research directions to improve these AI systems.
@misc{shekkizhar2025agicomingrightai,
  title={AGI Is Coming... Right After AI Learns to Play Wordle},
  author={Sarath Shekkizhar and Romain Cosentino},
  year={2025},
  eprint={2504.15434},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2504.15434},
}