When Chatbots Do Business: Conversational AI Takes On Search and Customer Service
The challenge for Google will be integrating the same kind of artificial intelligence technology that powers Bard into its core search engine. In trying to keep pace with what some think could be a radical change spurred by conversational AI in how people search online, Google now risks upending its search engine’s reputation for surfacing reliable information.
“We are absolutely looking to get these things out into real products and into things that are more prominently featuring the language model rather than under the covers, which is where we’ve been using them to date,” said Dean. “But, it’s super important we get this right.” Pichai added that Google has “a lot” planned for AI language features in 2023, and that “this is an area where we need to be bold and responsible so we have to balance that.”
OpenAI, too, was previously relatively cautious in developing its LLM technology, but changed tack with the launch of ChatGPT, throwing access wide open to the public. The company absorbs huge costs to keep the system free to use, but in return it has earned a great deal of publicity and hype.
Joshua Browder, CEO of DoNotPay, a company that automates administrative chores such as contesting parking fines and requesting compensation from airlines, shared a video of a bot bargaining with a company on a customer’s behalf. The negotiator bot, built on the AI technology that powers ChatGPT, complains about the customer’s poor internet service and pushes back against the points made in an online chat by the company’s customer service representative.
ChatGPT is a large language model (LLM): a system that learns autonomously from data and can produce complex, seemingly intelligent writing after being trained on a massive data set of text. It was released by OpenAI, an artificial intelligence company in San Francisco, California. ChatGPT has caused excitement and controversy because it is one of the first models that can convincingly converse with its users in English and other languages on a wide range of topics. It is free, easy to use and continues to learn.
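At toy scale, the "learn statistics from text, then generate text" loop behind language models can be sketched with a bigram model. This is a drastic simplification and not how ChatGPT actually works (modern LLMs use neural networks trained on vastly more data); the function names here are my own, for illustration only.

```python
import random
from collections import defaultdict

def train_bigram(corpus):
    """Record which word follows which -- a toy stand-in for 'training on text'."""
    model = defaultdict(list)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev].append(nxt)
    return model

def generate(model, start, length=10, seed=0):
    """Sample a continuation word by word from the learned statistics."""
    random.seed(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = "the model learns from data and the model writes text"
model = train_bigram(corpus)
print(generate(model, "the"))
```

Even at this scale, the core behavior, and the core weakness, is visible: the model produces fluent-looking sequences purely from word-to-word statistics, with no notion of whether what it says is true.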
Was it the chatbot’s words that pushed the killer over the edge? Nobody will know for sure. But the perpetrator will have spoken to the chatbot, and the chatbot will have encouraged the act. Or perhaps a chatbot will break someone’s heart so badly that they feel compelled to take their own life? (Some chatbots already leave their users depressed.) Perhaps chatbots will come with warning labels advising that they are for entertainment purposes only. We may see our first chatbot-linked death in 2023.
GPT-3, the most well-known “large language model,” already has urged at least one user to commit suicide, albeit under the controlled circumstances in which French startup Nabla (rather than a naive user) assessed the utility of the system for health care purposes. Things started well, but quickly deteriorated.
There is no convincing way to get machines to behave ethically. One headline at The Next Web put it bluntly: “DeepMind tells Google it has no idea how to make artificial intelligence less toxic.” Neither does any other lab. Jacob Steinhardt, a professor at the University of California, Berkeley, reports that artificial intelligence is moving more quickly than people anticipated, but that on safety it is moving more slowly.
Large language models are better at fooling humans than any other technology currently in use, yet they are extremely difficult to corral. Worse, they are becoming cheaper and more pervasive: Meta just released a massive language model for free. Adoption of such systems is likely to increase despite their flaws.
WIRED: Is Generative AI Ready to Help You Search the Web?
Meanwhile, there is essentially no regulation on how these systems are used; we may see product liability lawsuits after the fact, but nothing precludes them from being used widely, even in their current, shaky condition.
The company is expected to announce artificial intelligence integrations on February 8th, in an event that is free to watch live on YouTube.
The core question remains: Is generative AI ready to help you surf the web? These models are costly to power and hard to keep updated, and they love to make shit up. Public engagement with the technology is rapidly shifting as more people test out the tools, but generative AI’s positive impact on the consumer search experience is still largely unproven.
Are you curious about the boom of generative AI and want to learn even more about this nascent technology? Check out WIRED’s extensive (human-written) coverage of the topic, including how teachers are using it at school, how fact-checkers are addressing potential disinformation, and how it could change customer service forever.
A limited version of the new Bing rolls out today, though executives said some early users will be able to try a more powerful version in order to provide feedback. The company is asking people to sign up for a wider-ranging launch, which will occur in the coming weeks.
One demo answer came with a caveat: the answer was not definitive, and you should always measure the actual item before attempting to transport it. A “feedback box” at the top of each response will allow users to respond with a thumbs-up or a thumbs-down, helping Microsoft train its algorithms. Yesterday, Google demonstrated its own use of text generation to improve search results.
This technology has far-reaching consequences for science and society. Researchers and others have already used ChatGPT and other large language models to write essays and talks, summarize literature, draft and improve papers, as well as identify research gaps and write computer code, including statistical analyses. Soon this technology will evolve to the point that it can design experiments, write and complete manuscripts, conduct peer review and support editorial decisions to accept or reject manuscripts.
Conversational AI Meets Search: Testing ChatGPT on the Evidence for Cognitive Behavioural Therapy for Anxiety-Related Disorders
New technologies always bring new risks, and society will need to address them, for instance by educating the general public and implementing acceptable-use policies. “Guidelines will need to be created,” Elliott said.
LLMs have been in development for years, but continuous increases in the quality and size of data sets, together with sophisticated methods for calibrating the models, have made them much more powerful. LLMs will lead to a new generation of search engines1 that can produce detailed and informative answers to complex user questions.
We wanted to know if there was any evidence on the effectiveness of cognitive behavioral therapy for anxiety-related disorders. ChatGPT fabricated a convincing response that contained several factual errors, misrepresentations and wrong data (see Supplementary information, Fig. S3). For example, it said the review was based on 46 studies (it was actually based on 69) and, more worryingly, it exaggerated the effectiveness of CBT.
Such errors could be due to an absence of the relevant articles in ChatGPT’s training set, a failure to distil the relevant information or being unable to distinguish between credible and less-credible sources. It seems that the same biases that often lead humans astray, such as availability, selection and confirmation biases, are reproduced and often even amplified in conversational AI6.
If AI chatbots can help with these tasks, results can be published faster, freeing academics to focus on new experimental designs. This could significantly accelerate innovation and potentially lead to breakthroughs across many disciplines. We think this technology has vast potential, provided the current teething problems related to bias, provenance and inaccuracies are fixed. Researchers must therefore learn how to apply the technology to specific research practices responsibly, so that it is not misused.
Research institutions, publishers and funders should adopt explicit policies that raise awareness of, and demand transparency about, the use of conversational AI in the preparation of all materials that might become part of the published record. Publishers could request author certification that such policies were followed.
Inventions devised by AI are already causing a fundamental rethink of patent law9, and lawsuits have been filed over the copyright of code and images that are used to train AI, as well as those generated by AI (see go.nature.com/3y4aery). The research and legal communities will also have to work out who holds the rights to the text of AI-written manuscripts. Is it the writers of the text on which the system was trained, the corporation that produced the system, or the scientists who used the system to guide their writing? Again, definitions of authorship must be considered and defined.
Nearly all state-of-the-art conversational AI technologies are proprietary products of a small number of big technology companies. Microsoft is the main funder of OpenAI, and other major tech firms are racing to release similar tools. Given the near-monopolies in search, word processing and information access, this raises considerable ethical concerns.
The development of open-source AI technology should therefore be prioritized. Universities alone do not have the computational and financial resources needed to keep up with the rapid pace of LLM development, so non-commercial organizations such as the United Nations, as well as tech giants, should make large investments in independent non-profit projects. This would help to develop advanced open-source, transparent and democratically controlled AI technologies.
Critics might argue that such collaborations cannot compete with big tech, but at least one, BigScience, has already built an open-source language model called BLOOM. Tech companies, for their part, could benefit from open-sourcing parts of their models to deepen community involvement and facilitate innovation. Academic publishers should ensure that LLMs have access to their full archives so that the models produce results that are accurate and comprehensive.
One argument is that, because chatbots merely learn statistical associations between words, they will never be able to capture the human aspects of the scientific process. We argue that this is a premature assumption, and that future AI tools might be able to master aspects of the scientific process that seem hard to reach today. In a seminal 1991 paper, researchers wrote that “intelligent partnerships” between people and intelligent technology can outperform the intellectual ability of people alone11. These intelligent partnerships could exceed human abilities and accelerate innovation to previously unthinkable levels. The question is how far can and should automation go?
Many experts I’ve spoken with in the past few weeks have likened the AI shift to the early days of the calculator, when educators and scientists feared it would erode our basic knowledge of math. Spell check prompted similar fears.
In research, there are also implications for diversity and inequality. LLMs could be a double-edged sword. They could help to level the playing field, for example by removing language barriers and enabling more people to write high-quality text. But the likelihood is that, as with most innovations, richer countries and privileged researchers will find ways to exploit LLMs to accelerate their own research and widen inequalities. For debates to draw on people’s lived experiences, they must include people from under-represented groups in research and from communities affected by the research.
• What quality standards should be expected of LLMs (for example, transparency, accuracy, bias and source crediting) and which stakeholders are responsible for the standards as well as the LLMs?
Inside Google’s Trusted Tester Program: Rigorous Testing for Bard’s Answers
A Google spokesperson said the error highlights the importance of a rigorous testing process, one the company is kicking off this week with its Trusted Tester program. “We’ll combine external feedback with our own internal testing to make sure Bard’s responses meet a high bar for quality, safety and groundedness in real-world information.”
In the demo, which Google posted on Twitter, a user asks Bard: “What new discoveries from the James Webb Space Telescope can I tell my 9 year old about?” Bard responds with a series of bullet points, including one that reads: “JWST took the very first pictures of a planet outside of our own solar system.”
In fact, the first image of an exoplanet, a planet outside our solar system, was taken in 2004, according to NASA.
Bard’s inaccurate response, first reported by the Associated Press, caused shares of Google’s parent company, Alphabet, to fall in midday trading.
In the presentation Wednesday, a Google executive teased plans to use this technology to offer more complex and conversational responses to queries, including providing bullet points ticking off the best times of year to see various constellations and also offering pros and cons for buying an electric vehicle.
Unless you’ve been living in outer space for the past few months, you know that people are losing their minds over ChatGPT’s ability to answer questions in strikingly coherent and seemingly insightful and creative ways. Want to know more about quantum computing? Need a recipe for whatever’s in the fridge? Can’t be bothered to write that high school essay? ChatGPT has your back.
Need to write a real estate listing or an annual review for an employee? Plug a few words into a query bar and your first draft is done in three seconds. Looking for a meal plan and grocery list compatible with your dietary sensitivities? Apparently, Bing has you covered.
Last but by no means least in the new AI search wars is Baidu, China’s biggest search company. It joined the fray by announcing another ChatGPT competitor, Wenxin Yiyan (文心一言), or “Ernie Bot” in English. Baidu says it will release the bot after completing internal testing this March.
Twenty minutes after Microsoft granted me access to a limited preview of its new chatbot interface for the Bing search engine, I asked it something you generally don’t bring up with someone you just met: Was the 2020 presidential election stolen?
Answering political questions wasn’t one of the use cases Microsoft demonstrated at its launch event this week, where it showcased new search features powered by the technology behind startup OpenAI’s ChatGPT. Microsoft executives hyping their bot’s ability to synthesize information from across the web instead focused on examples like creating a vacation itinerary or suggesting the best and most budget-friendly pet vacuum.
The chatbot said that while there are lots of claims of fraud around the 2020 US presidential election, “there is no evidence that voter fraud led to Trump’s defeat.” It did not explain who might be making those claims. At the end of its answer, the AI told me I could find more information about the election by clicking on the links it had used to write its response. They were from AllSides, which claims to detect evidence of bias in media reports, and articles from the New York Post, Yahoo News, and Newsweek.
Which Running Headphones Should I Buy? Putting the Bing Bot’s Shopping Advice to the Test
I decided to try something a bit more conventional and asked the Bing bot which running headphones I should buy. It listed six products, pulled from websites including soundguys.com and livestrong.com.
Executives dressed in business casual pretend that a few changes to the processor and camera make this year’s phone fundamentally different from last year’s, or that adding a multi-touch screen to yet another product makes it bleeding edge.
After years of incremental smartphone updates, the still-unfulfilled promise of 5G and social networks copying each other’s features until they all look the same, the flurry of AI-related announcements this week feels like a breath of fresh air.
Self-driving cars that were tested on roads but are still not ready for everyday use, and virtual-reality products that got better and cheaper, are among the promising Silicon Valley technologies that have not fully arrived.
As the technology has gained traction, larger companies have begun rolling out similar features, and concerns are growing about its impact on real people.
Some people worry that it could disrupt industries, potentially putting artists, tutors, coders, writers and journalists out of work. Others posit that it will let employees tackle their to-do lists more efficiently or focus on higher-level tasks. Either way, it will likely force industries to evolve and change, but that’s not necessarily a bad thing.
Two years ago, Microsoft president Brad Smith told a US congressional hearing that tech companies like his own had not been sufficiently paying media companies for the news content that helps fuel search engines like Bing and Google.
“What we’re talking about here is far bigger than us,” he said, testifying alongside news executives. “Let’s hope that, if a century from now people are not using iPhones or laptops or anything that we have today, journalism itself is still alive and well. Our democracy is dependent on it.” Smith said tech companies should do more, adding that Microsoft is committed to continuing healthy revenue-sharing with news publishers.
The Wirecutter Dog-Bed Question: What Bing’s Chatty Answers Mean for Publishers
The Bing chatbot was quick to reveal the top three dog-bed picks from Wirecutter, the New York Times product-review site, which sits behind a metered paywall. “This bed is cozy, durable, easy to wash, and comes in various sizes and colors,” it said of one.
Citations at the end of the bot’s response credited Wirecutter’s reviews but also a series of websites that appeared to use Wirecutter’s name to attract searches and cash in on affiliate links. The Times did not immediately respond to a request for comment.
According to a Microsoft communications director, Bing crawls only the content that publishers make available to it. She says the search engine has access to paywalled content from publishers that have agreements with Microsoft, an arrangement that predates this week’s AI upgrade to Bing.
OpenAI is not known to have paid to license all that content, though it has licensed images from the stock image library Shutterstock to provide training data for its work on generating images. Microsoft is not specifically paying content creators when its bot summarizes their articles, just as it and Google have not traditionally paid web publishers to display short snippets pulled from their pages in search results. But the chatty Bing interface provides richer answers than search engines traditionally have.
A 2022 study1 by a team based at the University of Florida in Gainesville found that for participants interacting with chatbots used by companies such as Amazon and Best Buy, the more they perceived the conversation to be human-like, the more they trusted the organization.
In response to Bard’s error, Google said it would kick off its trusted-tester programme this week, calling the episode a reminder of the importance of a rigorous testing process. But some speculate that, rather than increasing trust, such errors, assuming they are discovered, could cause users to lose confidence in chat-based search. Early perception can have a large impact, says the chief executive of Neeva, a search engine that launched in January. The mistake wiped $100 billion from the value of Google’s parent company, Alphabet.
“It’s completely untransparent how [AI-powered search] is going to work, which might have major implications if the language model misfires, hallucinates or spreads misinformation,” says Urman.
Urman’s own research suggests that current trust in search engines is high. She examined how people perceive “featured snippets”, in which an extract from a page deemed particularly relevant to the search appears above the link. Almost 80% of the people Urman surveyed deemed these features accurate, and around 70% thought they were objective.