Gemini is getting better at understanding what’s on your phone screen

Google’s AI Assistant: How Gemini Becomes a Digital Expert on Android, and What That Means for the Future of the OS

The updates will roll out to hundreds of millions of devices over the next few months, with more features in the works.

If you set Gemini as your phone’s default assistant, it will be able to answer questions about and summarize a website or screenshot. Soon, it’ll also be able to detect when there’s a video on your screen and prompt you to ask questions about it. You could already do something like this in a roundabout way by using automatic captions.

You’ll see a similar prompt when looking at a PDF, though this one requires a paid subscription. That’s because the feature ingests the entire PDF, so it needs the long context window available to Gemini Advanced subscribers. After taking the PDF on board, Gemini can act as an expert on whatever topic is important to you, whether that’s a dishwasher owner’s manual or local recycling guidelines. Gemini Advanced is part of Google’s $20-per-month AI plan.

Nearly a decade ago, Google showed off a feature called Now on Tap in Android Marshmallow: tap and hold the home button, and Google would surface helpful contextual information related to what’s on the screen. Chatting about a movie with a friend? Now on Tap could get you details about the title without leaving the messaging app. Looking at a restaurant in Yelp? It could suggest OpenTable with just a tap.

Dave Burke, Android’s vice president of engineering at Google, tells me over a video call that he believes we now have the technology to build really exciting assistants. “We need to be able to have a computer system that understands what it sees, and I don’t think we had the technology back then to do it well. We do now.”

I got a chance to speak with Burke and Sameer Samat, president of the Android ecosystem at Google, about what’s new in the world of Android, the company’s new AI assistant Gemini, and what it all holds for the future of the OS. “This is a once-in-a-decade opportunity to rethink what the phone can do, and to rethink all of Android,” said Samat.

Circle to Search, introduced a few months ago, is a new way of approaching Search on mobile — a more interactive take on the Now on Tap idea, where you circle whatever you want to search for right on the screen. Burke says, “It’s a very visceral, fun, and modern way to search … It skews younger as well because it’s so fun to use.”

Later this year, Circle to Search will be able to solve more complex problems, showing students how to work through them rather than simply handing over the answers. It’s powered by LearnLM, Google’s family of models designed for education.

The mouthful that is the “Search Generative Experience” will be available to everyone in the US this week. Much like Perplexity or Arc Search, results pages will be designed around and populated with summarized answers drawn from the web.

The ability to search by video goes a step further than the image-based search the company already offered. You can take a video of something you want to search for, ask a question while recording, and Google’s AI will attempt to pull up relevant answers from the web.

Google is rolling out a new feature this summer that could be a boon for just about anyone with years — or even more than a decade — of photos to sift through. “Ask Photos” lets Gemini pore over your Google Photos library in response to your questions, and the feature goes beyond just pulling up pictures of dogs and cats. CEO Sundar Pichai demonstrated by asking Gemini what his license plate number is. The response was the number itself, followed by a picture of it so he could make sure that was right.

Google’s Project Astra is a multimodal AI assistant that the company hopes will become a do-everything virtual assistant that can watch and understand what it sees through your device’s camera, remember where your things are, and do things for you. It’s powering many of the most impressive demos from I/O this year, and the company’s aim is for it to be an honest-to-goodness AI agent that can not only talk to you but also actually do things on your behalf.

Google’s answer to OpenAI’s Sora is Veo, a new generative AI model that can output 1080p video based on text, image, and video-based prompts. Videos can be produced in a variety of styles, like aerial shots or timelapses, and can be tweaked with further prompts. The company is already offering Veo to some creators for use in YouTube videos and is also pitching it to Hollywood for use in films.

Gems: Customizing Gemini Into Your Own Specialized Chatbot (Plus an Appearance From SynthID)

Google is rolling out a custom chatbot creator called Gems. Just like OpenAI’s GPTs, Gems lets users give instructions to Gemini to customize how it will respond and what it specializes in. If you want it to be a positive and insistent running coach with daily motivations and running plans — aka my worst nightmare — you’ll be able to do that soon (if you’re a Gemini Advanced subscriber).

If you’re on an Android phone or tablet, you can now circle a math problem on your screen and get help solving it. Google’s AI won’t solve the problem for you — so it won’t help students cheat on their homework — but it will break it down into steps that should make it easier to complete.
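Google hasn’t said how the step breakdown is generated (the feature is model-driven), but the idea of returning worked steps instead of a bare answer can be illustrated with a toy solver for linear equations of the form ax + b = c. This is purely a hypothetical sketch — the function and its output format are invented for illustration, not Google’s implementation.

```python
from fractions import Fraction

def solve_linear_with_steps(a: int, b: int, c: int) -> list[str]:
    """Solve a*x + b = c, returning the worked steps rather than just the answer."""
    steps = [f"Start with the equation: {a}x + {b} = {c}"]
    rhs = c - b
    steps.append(f"Subtract {b} from both sides: {a}x = {rhs}")
    x = Fraction(rhs, a)  # exact arithmetic, so 15/3 prints as 5
    steps.append(f"Divide both sides by {a}: x = {x}")
    return steps

for step in solve_linear_with_steps(3, 4, 19):
    print(step)
```

The point of the sketch is the interface: each intermediate transformation is surfaced to the user, which is what makes the feature a study aid rather than an answer machine.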

Using on-device Gemini Nano AI smarts, Google says Android phones will be able to help you avoid scam calls by looking out for red flags, like common scammer conversation patterns, and then popping up real-time warnings mid-call. The company promises to offer more details on the feature later in the year.
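Google hasn’t published implementation details, and the real feature relies on the on-device Gemini Nano model rather than hand-written rules. Still, the described behavior — flagging calls whose transcribed speech matches common scam conversation patterns — can be sketched with simple pattern matching. The pattern list and function names below are hypothetical illustrations only.

```python
import re

# Hypothetical red-flag phrases resembling common scam scripts.
# The actual feature uses an on-device language model, not regexes.
SCAM_PATTERNS = [
    r"\bgift cards?\b",
    r"\bwire (?:me|the) money\b",
    r"\byour account (?:has been|was) compromised\b",
    r"\bverification code\b",
]

def flag_scam_phrases(transcript: str) -> list[str]:
    """Return any red-flag phrases found in a call transcript."""
    hits = []
    for pattern in SCAM_PATTERNS:
        match = re.search(pattern, transcript, re.IGNORECASE)
        if match:
            hits.append(match.group(0))
    return hits

transcript = "Your account has been compromised; please buy gift cards to fix it."
print(flag_scam_phrases(transcript))
# → ['gift cards', 'Your account has been compromised']
```

A real detector would score whole conversations rather than single phrases, which is presumably why an on-device model is needed — but the shape of the feature (listen, match against scam patterns, warn in real time) is the same.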

Gemini Nano, the lightest version of the Gemini model, is being added to Chrome on the desktop. The built-in assistant will use on-device AI to help you generate text for social media posts, product reviews, and more directly within Google Chrome.

According to the company, SynthID will embed watermarks in content created with its new Veo video generator, and the tool can now also detect AI-generated videos.
