Living guidelines for generative AI: why scientists must oversee its use

Self-regulation of generative AI: key principles for researchers and research funding organizations

Transparency. Stakeholders should always be transparent about their use of generative artificial intelligence. This increases awareness and allows researchers to study how generative AI might affect research quality or decision-making. In our view, developers of generative AI tools should also be transparent about their inner workings, to allow robust and critical evaluation of these technologies.

Governments are beginning to regulate AI technologies, but comprehensive and effective legislation is years off (see Nature 620, 260–263; 2023). The European Union Artificial Intelligence Act, which is now in the final stage of negotiation, requires transparency in the form of disclosure of content that is generated with artificial intelligence and the publication of summaries of copyrighted data. The administration of US President Joe Biden aims for self-regulation. It announced in July that it had obtained voluntary commitments from seven leading tech companies to manage the risks posed by AI. Digital ‘watermarks’ that identify the origins of a text, picture or video might be one mechanism. In August, the Cyberspace Administration of China announced that it will enforce AI regulations, including requiring that generative AI developers prevent the spread of misinformation or content that challenges Chinese socialist values. The UK government is trying to secure an intergovernmental agreement on limiting the risks of AI at a summit in November.

Generative AI is threatening the integrity of science as it changes how scientists look for information and conduct their research. The use of commercially developed artificial intelligence tools in research can introduce bias and diminish the validity of scientific knowledge. Generated outputs can distort facts while still sounding authoritative.

It’s not clear whether self-regulation will be effective in the long run. AI is advancing at breakneck speed in a sprawling industry that is continuously reinventing itself. Regulations drawn up today will be outdated by the time they become official policy, and might not anticipate future harms and innovations.

The summit participants agreed on three key principles for the use of generative AI in research: accountability, transparency and independent oversight.

  1. Research funding organizations should always include human assessment in evaluating research funding proposals.

Source: Living guidelines for generative AI — why scientists must oversee its use

Living guidelines in practice: open science, the Stroke Foundation and COVID-19

Guidelines co-developed with Olivier Bouin, Mathieu Denis, Zhenya Tsoy, Vilas Dhar, Huub Dijstelbloem, Saadi Lahlou, Yvonne Donders, Gabriela Ramos, Klaus Mainzer & Peter-Paul Verbeek (see Supplementary information for co-developers’ affiliations).

Much like the AI Risk Management Framework of the US National Institute of Standards and Technology4, the committee could map, measure and manage risks. This would require close communication with the auditing body. For example, living guidelines could include the right of individuals to control exploitation of their identity for publicity, while the auditing body would examine whether a particular application might violate this right, such as by producing deepfakes. An AI application that fails certification could still enter the marketplace (if policies don’t restrict it), but individuals and institutions adhering to the guidelines would not be able to use it.

  1. The organization and the auditing body should include, but not be limited to, experts in computer science, behavioural science, psychology, human rights, privacy, law, ethics, science of science and philosophy (and related fields). The insights and interests of stakeholders from across sectors should be assured through the composition of the teams and their procedures. Standards for the composition of the teams might change over time.

Similar bodies exist in other domains, such as the US Food and Drug Administration, which assesses evidence from clinical trials to approve products that meet its standards for safety and effectiveness. The Center for Open Science, an international organization based in Charlottesville, Virginia, seeks to develop regulations, tools and incentives to change scientific practices towards openness, integrity and reproducibility of research.

Living-guideline approaches are already applied in other fields. The Stroke Foundation of Australia has adopted living guidelines to allow patients to access new medicines quickly. The foundation now updates its guidelines every three to six months, instead of roughly every seven years as it did previously. Similarly, the Australian National Clinical Evidence Taskforce for COVID-19 updated its recommendations every 20 days during the pandemic, on average5.

Another example is the Transparency and Openness Promotion (TOP) Guidelines for promoting open-science practices, developed by the Center for Open Science6. An associated metric, the TOP Factor, allows researchers to easily check whether journals follow open-science guidelines. A similar checklist-style approach could be used for AI algorithms, as the sketch below illustrates.
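To make that idea concrete, here is a minimal sketch in Python of how a TOP-Factor-style checklist could be applied to AI tools. The standard names, scoring scale and example ratings are invented for illustration; they are not the official TOP rubric or any existing AI standard.

```python
# Illustrative sketch only: the real TOP Factor is maintained by the Center for
# Open Science. The standards and levels below are simplified assumptions.

# Each hypothetical standard is rated on a small ordinal scale
# (0 = not implemented ... 3 = strictest level), mirroring checklist-style scoring.
AI_TRANSPARENCY_STANDARDS = [
    "training_data_disclosure",
    "model_architecture_disclosure",
    "evaluation_reporting",
    "licence_and_usage_terms",
]

def top_style_score(ratings: dict[str, int], max_level: int = 3) -> int:
    """Sum per-standard ratings into a single TOP-Factor-like score."""
    total = 0
    for standard in AI_TRANSPARENCY_STANDARDS:
        level = ratings.get(standard, 0)  # unrated standards count as 0
        if not 0 <= level <= max_level:
            raise ValueError(f"{standard}: level must be between 0 and {max_level}")
        total += level
    return total

# Hypothetical example: a developer rated against the checklist.
print(top_style_score({"training_data_disclosure": 1, "evaluation_reporting": 2}))  # -> 3
```

The design choice mirrors the journal metric: a per-standard rubric that rolls up into one comparable number, so adherence can be checked at a glance.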

Auditing and regulating generative AI: costs, funding and incentives for tech companies

Financial investments will be needed. The auditing body will be the most expensive element, because it needs computing power comparable to that of OpenAI or a large university consortium. Although the amount will depend on the remit of the body, it is likely to require at least $1 billion to set up. That is roughly the cost of training GPT-5, a successor to GPT-4, the large language model that powers ChatGPT.

A group of experts should be convened, at a cost of around $1 million, to report back within six months. This group should sketch out how the auditing body and committee would function, along with budget plans.

Some of the investment could come from the public purse. Tech companies can contribute through a pooled and independently run mechanism.

Alternatively, the scientific auditing body might become an independent entity within the United Nations, similar to the International Atomic Energy Agency. One complication is that member states might differ in their views on regulating generative AI, and formal legislation is slow to update.

Tech companies could fear that regulations will hamper innovation, and might prefer to self-regulate through voluntary guidelines rather than legally binding ones. For example, many companies changed their privacy policies only after the European Union put its General Data Protection Regulation into effect in 2016 (see go.nature.com/3ten3du). However, our approach has benefits. Auditing and regulation can engender public trust and reduce the risks of malpractice and litigation.

These benefits could provide an incentive for tech companies to invest in an independent fund to finance the infrastructure needed to run and test AI systems. Even so, some may be reluctant to do so, because failing quality checks could lead to negative media coverage and declining share prices.

Maintaining the independence of scientific research in a field dominated by the tech industry is another challenge. The auditing body’s membership must be managed to avoid conflicts of interest, given that these have been shown to lead to biased results in other fields7,8. A strategy for dealing with such issues needs to be developed9.

How open is ‘open’ AI? A Stanford assessment of transparency in language models

When OpenAI published details of the stunningly capable AI language model GPT-4, which powers ChatGPT, in March, its researchers filled 100 pages. They also left out a couple of important details, such as how it was built or how it works.

That was no accidental oversight, of course. OpenAI and other big companies are keen to keep the workings of their most prized algorithms shrouded in mystery, in part out of fear the technology might be misused but also from worries about giving competitors a leg up.

The Stanford team looked at 10 different AI systems, mostly large language models like those behind ChatGPT and other chatbots. They include commercial models such as GPT-4 from OpenAI, as well as models offered by startups, including Jurassic-2 from AI21 Labs, Claude 2 from Anthropic, Command from Cohere, and Inflection-1 from chatbot maker Inflection.

And they examined “open source” AI models that can be downloaded for free, rather than accessed exclusively in the cloud, including the image-generation model Stable Diffusion 2 and Llama 2, which was released by Meta in July this year. These models are often not as open as they might appear to be.

The Stanford team scored the openness of these models on 13 different criteria, including how transparent the developer was about the data used to train the model—for example, by disclosing how it was collected and annotated and whether it includes copyrighted material. The study also looked for disclosures about the hardware used to train and run a model, the software frameworks employed, and a project’s energy consumption.

The researchers found that no model achieved more than 54 percent transparency across all of the criteria. Meta’s Llama 2 was judged the most open, while Amazon’s Titan Text was the least transparent. But even an “open source” model like Llama 2 was found to be quite opaque, because Meta has not disclosed the data used for its training, how that data was collected and curated, or who did the work.
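As a rough illustration of how such percentage figures arise, the sketch below computes the share of transparency criteria a model satisfies. The criterion names and example values are assumptions made for the sketch; they are not the Stanford team's actual indicators, weighting or data.

```python
# Minimal sketch, assuming a simple unweighted checklist of yes/no criteria.
# The real Stanford index defines its own indicators and scoring.

def transparency_percentage(criteria_met: dict[str, bool]) -> float:
    """Return the share of transparency criteria a model satisfies, as a percentage."""
    if not criteria_met:
        return 0.0
    satisfied = sum(criteria_met.values())  # True counts as 1, False as 0
    return 100 * satisfied / len(criteria_met)

# Hypothetical example using the kinds of criteria described above.
example = {
    "training_data_sources_disclosed": False,
    "data_annotation_process_described": False,
    "copyrighted_material_reported": False,
    "training_hardware_disclosed": True,
    "software_frameworks_listed": True,
    "energy_consumption_reported": False,
}
print(f"{transparency_percentage(example):.0f}% transparent")  # -> 33% transparent
```

A model scoring 54 percent under this kind of scheme would simply be one that satisfies just over half of the criteria assessed.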
