Latest Research Reveals AI Still Unprepared for Office Roles
Your job is safe for now as AI still struggles with real office tasks
Generative AI has become the buzzword of tech innovation, captivating industries and fueling speculation that it could replace human talent. Yet a look around prominent law firms and investment banks makes it clear that humans are still doing the work. Despite the relentless chatter about AI’s capabilities, a recent study by Mercor reveals a sobering truth: when it comes to the complex demands of real-world tasks, AI isn’t ready to take the lead.
A Reality Check for the “Replacement” Theory
Mercor’s benchmark, known as APEX-Agents, differs sharply from the tests AI systems usually face. Instead of routine prompts like writing poems or solving basic math problems, APEX-Agents uses genuine queries from professionals across various sectors, each demanding multi-step work that draws on diverse information sources.
The findings? Even the most advanced models, such as Gemini 3 Flash and GPT-5.2, failed to reach 25% accuracy. Gemini 3 Flash topped the leaderboard with a 24% success rate, with GPT-5.2 close behind at 23%. Most competitors lingered in the teens, underscoring AI’s current limitations.
Why AI is Failing the “Office Test”
According to Mercor’s CEO, Brendan Foody, the central issue isn’t a lack of raw intelligence but the challenge of context. In a legal or financial setting, professionals weave together insights from Slack threads, PDF documents, spreadsheets, and more to answer a single question, for example one about GDPR compliance.
Humans excel at this kind of context-switching, moving easily through a web of information. AI, however, falters. When tasked with retrieving details from scattered sources, these models often get tangled, providing incorrect answers or failing to respond at all.
The “Unreliable Intern”
For those concerned about job security, this finding might come as a relief. The data suggests that current AI systems behave more like an unreliable intern than a skilled professional, giving accurate responses only about a quarter of the time.
However, there is a flip side to this news. The pace of advancement is rapid. Foody notes that just a year ago, AI models were scoring only 5% to 10%. Now the best of them reach 24%. While AI may not be poised to take over knowledge work just yet, its rate of improvement is steep. For the moment, the anticipated revolution in this domain remains on hold as these systems struggle to piece together context from scattered sources.
As these changes unfold, it’s worth embracing new technology while recognizing the value of human insight and adaptability. Together, we can foster an environment where human and AI capabilities complement one another, driving progress in the workplace.
Now, let’s continue to champion the unique skills and perspectives that make us invaluable in our roles. Keep pushing boundaries, and stay inspired as we move forward together in this dynamic landscape!

