Google's Gemini Can Now Control Your Phone Apps, and It's Weirdly Mesmerizing to Watch

Google’s Gemini AI assistant has leveled up in a major way. The tech giant just rolled out task automation on the Pixel 10 Pro and Samsung Galaxy S26 Ultra, which means Gemini can now actually use apps on your phone without you lifting a finger. Right now it’s limited to a handful of food delivery and rideshare services, and yeah, it’s still in beta. Is it slow? Absolutely. Is it clunky sometimes? For sure. Does it actually solve any real problems you were having? Not really. But watching an AI assistant genuinely interact with your phone for the first time outside of a polished keynote presentation? That’s legitimately wild.
Let’s be real: Gemini moves at a snail’s pace compared to you. If you need an Uber in the next thirty seconds, you’re still your phone’s best option. But here’s the thing: task automation is built to work in the background while you do other stuff. You can literally check your passport for the tenth time while Gemini orders your dinner. If you’re into that sort of thing, you can peep what’s happening on your screen as text pops up at the bottom narrating each move. When you ask for a chicken combo and the menu only offers half portions, Gemini figures out it needs to add two halves. That’s actually pretty clever.
The default setup has Gemini running quietly in the background, but if you want to watch the chaos unfold, you can tap a button to observe. Sometimes it gets a little painful, like watching Gemini hunt for a side of greens that’s sitting right there on the screen while you’re internally screaming. My teriyaki order took about nine minutes to complete. Not ideal when you’re hungry. The smart play is to let Gemini do its thing and just confirm the final order before anything gets charged. In my testing, the AI got orders right way more often than it messed up, and when it did fail, it was usually something simple like a missing location permission.
Here’s where it got legitimately impressive: I put a flight to San Francisco on my calendar and asked Gemini to schedule an Uber to get me to the airport on time. Because it can access your email and calendar, it found my flight details, suggested leaving at 11:30 or 11:45 AM for a 1:45 PM departure, and booked the ride in about three minutes. The fact that it understood “schedule a ride” even though Uber actually calls the feature “reserve a ride” shows how much better modern assistants are at matching intent to an interface than their older, keyword-bound predecessors.
The real issue is that apps built for humans are terrible for AI to navigate. No wonder Gemini struggles: it’s trying to interpret fancy food photos and dodge ads when it could just query a database. Google is working on solutions like the Model Context Protocol to give AI cleaner, structured information to work with. Right now, this feels like a preview of what’s coming, even if it’s awkward and slow. We’re watching the future happen in real time, and honestly, that’s pretty rad.
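To make that contrast concrete, here’s a minimal sketch of what a structured tool can look like under the Model Context Protocol, using the official MCP Python SDK. The server name, tool, and fields below are hypothetical stand-ins for illustration, not anything Google, Uber, or a delivery app actually ships:

```python
# A minimal MCP sketch using the official Python SDK ("mcp" package).
# The server name, tool, and fields are hypothetical illustrations,
# not a real delivery service's API.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("food-delivery")

@mcp.tool()
def order_item(restaurant: str, item: str, quantity: int = 1) -> str:
    """Place an order as structured data the model can call directly,
    instead of squinting at menu photos and dodging ads in a human UI."""
    # Hypothetical: a real server would call the service's backend here.
    return f"Ordered {quantity} x {item} from {restaurant}"

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```

An assistant wired to a server like this sees the menu as data instead of pixels, which is exactly the kind of cleaner path this points toward.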
AUTHOR: mb
SOURCE: The Verge