It's a next-gen artificial intelligence model, and it's almost ready for use
Getting the Most Out of Google: Testing Whether the Chatbot Can Translate the First Two Lines of the Philippine Patriotic Oath
The older model was pretty good. It could give tea recommendations and come up with a chocolate cake recipe. But it couldn't give you a photo of a majestic horse, at least until recently, and it can be slower to respond than ChatGPT.
There's still a lot of work to be done before these bots return consistently accurate results, and they learn how to respond to questions the more people use them. I ran a few tests to see how they held up.
The images of dogs running through fields of flowers made me want to cleanse my palate. I took a photo of my friend's dog, Sundae, so I could make it look like she was in a photo shoot: the background should be removed and replaced with a pink one. This was one prompt I had to test against ChatGPT Plus as well, since DALL-E 3 is supposed to be able to simply edit photos. I may have broken both of them, because neither could give me what I asked for. My photo of a goldendoodle with daisies came back as a different image with a pink background, and ChatGPT said the prompt took too long to analyze and couldn't generate anything.
I asked Gemini Advanced what those tasks were, since it was made to handle "highly complex tasks." It answered, "Translation." So I asked it to translate the first few lines of the Philippine Patriotic Oath. It's a fairly obscure oath, especially since the version I know has been changed several times in the past 20 years.
Gemini Advanced told me it couldn't help with my request because it's trained to respond in only a subset of languages. I asked which languages it supports, but the chatbot refused to answer, saying it can't give me a definitive list of the languages it understands. I then asked Gemini Advanced if it knows Filipino, and it said it does. Officially, though, Google does not list Filipino among the 40 languages Gemini currently supports.
Gemini Advanced worked in Google's favor when it tapped into the company's other products: the rundown it gave me included the location of each Filipino and Ethiopian restaurant it recommended in New York City.
A few days earlier, I had asked ChatGPT for restaurant recommendations because I was looking for new places to try, and the results were off: the names were correct, but not all of the locations were. I ran the same request through ChatGPT Plus for this test and got much more accurate locations, but a smaller list of restaurants. In this case, Gemini handled the request better.
The main reason I use a chatbot is to summarize complicated papers. I fed Gemini Advanced two paragraphs from Apple's paper on image editing. The paper gave me a headache the first time I read it, so I figured Gemini could at least give me the gist. I also wanted to see how the bot handles a string of instructions in one prompt: one asking it to summarize and another asking it to generate text.
The summary was… passable. It really did give me a rundown of the concepts discussed in those two paragraphs, but it didn't "translate" them into plain language; to be fair, I didn't ask it to. Gemini then moved on to writing the article I asked for, and you know what? Those 150 words were better than some of the summaries I've asked for.
Source: Gemini Advanced is most impressive when it’s working with Google
The Gemini 1.5 Context Window: Why It Matters for People, Businesses, and the Internet, According to Sundar Pichai
Chatbots occupy a tricky space for users: they have to be a search engine, a creation tool, and an assistant all at once. That is especially true of a chatbot coming from Google, which is increasingly relying on AI to supplement its search engine, voice assistant, and just about every productivity tool in its arsenal.
But there’s one new thing in Gemini 1.5 that has the whole company, starting with CEO Sundar Pichai, especially excited: Gemini 1.5 has an enormous context window, which means it can handle much larger queries and look at much more information at once. That window is a whopping 1 million tokens, compared to 128,000 for OpenAI’s GPT-4 and 32,000 for the current Gemini Pro. Tokens are a tricky metric to understand (here’s a good breakdown), so Pichai makes it simpler: “It’s about 10 or 11 hours of video, tens of thousands of lines of code.” The context window means you can ask the AI bot about all of that content at once.
He tells me you can fit the entire Lord of the Rings trilogy into that context window. That seems oddly specific, so I ask him: this has already happened, hasn't it? Someone at Google is just checking to see if Gemini spots any continuity errors, trying to understand the complicated lineage of Middle-earth, and seeing if maybe AI can finally make sense of Tom Bombadil. "I'm sure it has happened, or will happen," Pichai says with a laugh, "one of the two."
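To put those token numbers in rough perspective, here's a quick back-of-envelope sketch in Python. The words-per-token ratio and the trilogy's word count below are common approximations rather than figures from Google or from this piece, and real tokenizers vary, but the arithmetic shows why a million-token window can hold documents that would overflow smaller ones.

```python
# Back-of-envelope sketch: does a given text fit in a model's context window?
# The ~0.75 words-per-token ratio is a rough rule of thumb for English text,
# not an official figure; actual tokenizers (and code or non-English text) vary.

WORDS_PER_TOKEN = 0.75  # heuristic: 1 token is roughly 0.75 English words

CONTEXT_WINDOWS = {
    "Gemini 1.5": 1_000_000,       # tokens, per the article
    "GPT-4": 128_000,
    "Gemini Pro (current)": 32_000,
}

def estimated_tokens(word_count: int) -> int:
    """Convert a word count into a rough token estimate."""
    return round(word_count / WORDS_PER_TOKEN)

# The Lord of the Rings trilogy runs roughly 480,000 words (an approximate,
# commonly cited figure, used here only for illustration).
lotr_tokens = estimated_tokens(480_000)  # about 640,000 tokens

for model, window in CONTEXT_WINDOWS.items():
    verdict = "fits" if lotr_tokens <= window else "does not fit"
    print(f"{model}: {window:,}-token window -> the trilogy {verdict} "
          f"(~{lotr_tokens:,} tokens)")
```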
Pichai expects the larger context window to be particularly useful for businesses: it enables use cases where you can add a lot of personal context and information at the moment of the query, effectively expanding the query window. He imagines filmmakers might upload their entire movie and ask Gemini what reviewers might say; he sees companies using Gemini to look over masses of financial records. "I view it as one of the bigger breakthroughs we have done," he says.
Eventually, Pichai tells me, all these 1.0s and 1.5s and Pros and Ultras and corporate battles won't really matter to users. "People will just be consuming the experiences," he says. It's like using a cellphone without paying attention to the processor underneath. But right now, he says, we're still in the phase where everyone knows the chip in their phone, because the underlying technology is shifting so fast that people still care.