Meta has a new machine learning language model to remind you it does AI, too
The 1,000 Languages Initiative: Google’s plan for a single AI model spanning the world’s most spoken languages
These models are capable of many tasks, from language generation (as with OpenAI’s GPT-3) to translation (see Meta’s No Language Left Behind work). Google’s 1,000 Languages Initiative is not focused on any single function, but on creating a single system with broad knowledge across the world’s languages.
Google has already begun integrating these language models into products like Google Search, while fending off criticism about the systems’ flaws. Language models have a tendency to regurgitate harmful societal biases and an inability to parse language with human sensitivity, and Google itself fired researchers after they published papers detailing these problems.
The company believes that a model of this size will make it easier to bring various artificial intelligence features to languages that are poorly represented in online spaces.
“By having a single model that is exposed to and trained on many different languages, we get much better performance on our low resource languages,” says Ghahramani. “The way we get to 1,000 languages is not by building 1,000 different models. Languages are like organisms, they’ve evolved from one another and they have certain similarities. And we can find some pretty spectacular advances in what we call zero-shot learning when we incorporate data from a new language into our 1,000 language model and get the ability to translate [what it’s learned] from a high-resource language to a low-resource language.”
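Nothing like the 1,000-language model is publicly available yet, but the kind of multilingual transfer Ghahramani describes can be sketched with an existing system. Below is a minimal illustration assuming Meta’s publicly released No Language Left Behind checkpoint (facebook/nllb-200-distilled-600M) and the Hugging Face transformers library; the checkpoint and the language codes are illustrative stand-ins, not Google’s model.

```python
# Minimal sketch: one shared multilingual model serving many language pairs,
# including comparatively low-resource ones. Uses Meta's NLLB-200 checkpoint
# (mentioned above) via Hugging Face transformers.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

inputs = tokenizer("Languages are like organisms.", return_tensors="pt")
# Force decoding into Luganda (lug_Latn), a lower-resource language that
# benefits from parameters shared with high-resource relatives.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("lug_Latn"),
    max_new_tokens=40,
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

The design point the quote is making is visible here: swapping the target-language token is all it takes to reach a different language, because every pair runs through the same shared model rather than a dedicated one.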
Access to data is a problem when training across so many languages, though, and Google says that in order to support work on the 1,000-language model it will be funding the collection of data for low-resource languages, including audio recordings and written texts.
Google says it has no specific plans for where to deploy the model, only that it expects it will find a wide range of uses across its products.
“One of the really interesting things about large language models and language research in general is that they can do lots and lots of different tasks,” says Ghahramani. The same language model can handle many different tasks, including turning commands for a robot into code. The interesting thing, he says, is that these models are becoming repositories of a great deal of knowledge, and by probing them in different ways you can get at different bits of useful functionality.
According to a report from CNBC, Alphabet CEO Sundar Pichai and Google’s head of AI Jeff Dean addressed the rise of ChatGPT in a recent all-hands meeting. One employee at the search giant had a question about the bot’s launch and whether it was a missed opportunity. Pichai and Dean reportedly responded by saying that Google’s AI language models are just as capable as OpenAI’s, but that the company had to move “more conservatively than a small startup” because of the “reputational risk” posed by the technology.
OpenAI itself seems to be trying to tamp down expectations. CEO Sam Altman recently said that ChatGPT is limited but good enough at some things to create a misleading impression of greatness, that it’s a mistake to rely on it for anything important right now, and that it’s a preview of progress with lots of work still to do on robustness and truthfulness.
“We believe that the entire AI community — academic researchers, civil society, policymakers, and industry — must work together to develop clear guidelines around responsible AI in general and responsible large language models in particular,” the company wrote in its post. “We look forward to seeing what the community can learn — and eventually build — using LLaMA.”
Meta is releasing LLaMA under a noncommercial license focused on research use cases, with access granted to groups like universities and industry labs.
In a research paper, Meta claims that the second-smallest version of the LLaMA model, LLaMA-13B, performs better than OpenAI’s popular GPT-3 model “on most benchmarks,” while the largest, LLaMA-65B, is “competitive with the best models,” like DeepMind’s Chinchilla-70B and Google’s PaLM-540B. (The numbers in these names refer to the billions of parameters in each model, a measure of the system’s size and a rough approximation of its sophistication, though the two qualities do not necessarily scale in lockstep.)
Once trained, LLaMA-13B can also run on a single data-center-grade graphics card. That is welcome news for smaller institutions that want to run tests on the system, though it means less to lone researchers who can’t afford the hardware.
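To put “a single data-center-grade graphics card” in perspective: at 16-bit precision, a 13-billion-parameter model needs roughly 26 GB of memory just for its weights, which fits on one 40 GB A100. The sketch below assumes the Hugging Face transformers and accelerate libraries, and the checkpoint name is a hypothetical placeholder, since Meta distributes the official LLaMA weights through an access-request form.

```python
# Minimal sketch: loading a ~13B-parameter causal LM on a single GPU in half
# precision. The model ID below is illustrative only; substitute whatever
# weights you have been granted access to.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huggyllama/llama-13b"  # hypothetical placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~2 bytes/param: ~26 GB of VRAM for 13B weights
    device_map="auto",          # let accelerate place layers on the available GPU(s)
)

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=30,
    pad_token_id=tokenizer.eos_token_id,  # LLaMA's tokenizer has no pad token
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```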
Meta Sits Out the Chatbot Buzz: The Case of BlenderBot and Galactica
Meta’s release is also notable because the company has largely missed out on the recent buzz surrounding AI chatbots. That may be no bad thing, given that Microsoft has come under fire for rushing the launch of its Bing chatbot, while Google’s stock price took a nosedive after its own chatbot, Bard, made a factual error.
Meta has released publicly accessible chatbots in the past, but the reception has been less than stellar. One, a program called BlenderBot, was lambasted for simply not being very good, while another, Galactica, was pulled offline after only three days because it produced scientific nonsense.