Apple researchers ran an AI test that exposed a fundamental ‘intelligence’ flaw

Ryan Christoffel | Nov 1 2024 - 7:42 am PT

Apple just shipped its first Apple Intelligence features and launched new AI-optimized Macs. But for all the AI hype, there are clear limits to the technology’s intelligence. One of those limits was highlighted by a recent experiment from Apple’s AI researchers.

Testing AI’s capabilities

Last month, a team of Apple’s researchers published a new paper about a key AI limitation.

Michael Hiltzik writes for The Los Angeles Times:

See if you can solve this arithmetic problem:

Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?

If you answered “190,” congratulations: You did as well as the average grade school kid by getting it right. (Friday’s 44 plus Saturday’s 58 plus Sunday’s 44 multiplied by 2, or 88, equals 190.)

You also did better than more than 20 state-of-the-art artificial intelligence models tested by an AI research team at Apple. The AI bots, they found, consistently got it wrong.

The research paper explains that the best and brightest LLMs saw “catastrophic performance drops” when trying to answer simple math problems that were written out like this.

It happened primarily when those problems included irrelevant data, which even schoolchildren quickly learn to disregard.

That calls into question AI’s current intelligence capabilities.
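To make the arithmetic concrete, here is a minimal Python sketch of the kiwi problem quoted above. The variable names are made up for illustration, and the "distracted" total is only an assumed example of how the irrelevant detail could mislead a solver; the article does not say which wrong answers the models actually gave.

```python
# Quantities taken from the word problem quoted above.
friday = 44
saturday = 58
sunday = 2 * friday  # "double the number of kiwis he did on Friday"

# Distractor clause: five of Sunday's kiwis were smaller than average.
# Size does not change the count, so the correct total ignores this value.
smaller_than_average = 5

correct_total = friday + saturday + sunday

# Hypothetical failure mode (an assumption, not a result from the paper):
# treating the irrelevant detail as something to subtract.
distracted_total = correct_total - smaller_than_average

print(correct_total)     # 190
print(distracted_total)  # 185
```

The point of the sketch is that the correct computation never touches the "smaller than average" figure at all; any answer that uses it has been thrown off by data that has nothing to do with the question.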

Apple’s AI research finds ‘intelligence’ is not what it appears

Based on the variety of tests Apple’s researchers ran, the paper concludes that current AI models are ‘not capable of genuine logical reasoning.’

That might be something we’re generally aware of, but it stands as an important cautionary note as more and more trust is given to AI’s ‘intelligence.’

AI optimists might assume the problem is an easy fix, but Apple’s team disagrees. “Can scaling data, models, or compute fundamentally solve this? We don’t think so!”

Ultimately, Apple’s paper is not meant to dampen enthusiasm over AI’s capabilities, but rather to provide a measure of common sense.

AI can perform some tasks as though it’s extremely intelligent, but in many ways that ‘intelligence’ isn’t what it might appear.

What do you make of Apple’s AI findings? Let us know in the comments.
