Artificial intelligence will change science: now researchers must tame it

AI Tools for Science Advice Cannot Be Black Boxes: How Accurate Can They Be, and Should We Use Them? The Case of the United States

Answers to these questions are critical to our future. Powerful language models can be used in research and technological development, and they are also available commercially and as open-source software. Policymakers have already started experimenting with publicly available generative AI tools. Legislative staff members in the United States are experimenting with OpenAI's GPT-4 (see go.nature.com/3zpwhux) and, reportedly, with other unapproved and potentially less reliable AI tools. This has led administrators to place limits on the use of chatbots in the US House of Representatives.

These issues highlight that AI tools for science advice cannot be black boxes — they will require transparency and participatory design processes. Advisers and policymakers must be involved in the training process to make sure that outputs are legitimate, and researchers should advise on and test the systems before widespread adoption. These groups need to check carefully both the quality of the scientific information feeding into the advice and the credibility of the output. The algorithms used to assist the process must be transparent and explainable to ensure accountability.

We do not propose that policy briefs be drafted in their entirety by LLM-based tools, but AI could be used to facilitate parts of the process. The drafters and reviewers of policy papers remain an essential part of the process, providing quality control that ensures relevance and legitimacy. Yet, as generative AI tools improve, they could provide first drafts of discrete sections, such as plain-language summaries of technical information or complex legislation.

Synthesis of the Scientific Literature: Artificial Intelligence as a Tool for Credibility in Science Advice

Finding current evidence is difficult and involves a lot of judgement. Hard-pressed science advisers must take what they can get. But what if the searches could be more algorithmic?

Systematic reviews identify a question of interest, analyse relevant studies and find the best answer (see www.cochranelibrary.com). For example, one recent review examined evidence on whether healthy-eating initiatives were successful in young children, finding that they can be, although uncertainties remain2.

The alternative approach of subject-wide evidence synthesis entails reading the literature at scale3. Around 70 people who were part of a biodiversity project spent the equivalent of 50 years reading and analysing 1.5 million papers in 17 languages. After reading the summaries, an expert panel assessed the effectiveness of each intervention. The synopses, on subjects ranging from bat conservation to sustainable aquaculture, are published online (https://conservationevidence.com). Users can adapt the meta-analyses to their own needs using a parallel tool, Metadataset.
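To make concrete what adapting a meta-analysis involves, here is a minimal sketch of fixed-effect, inverse-variance pooling in Python. The study names, effect sizes and variances are invented for illustration, and this is not Metadataset's actual implementation.

```python
import math

# Hypothetical effect sizes (e.g. log response ratios) and their variances
# from three invented studies; real syntheses use curated data.
studies = {
    "study_a": (0.42, 0.04),
    "study_b": (0.10, 0.02),
    "study_c": (0.25, 0.09),
}

# Fixed-effect, inverse-variance pooling: weight each study by 1/variance.
weights = {name: 1.0 / var for name, (_, var) in studies.items()}
total_w = sum(weights.values())
pooled = sum(weights[n] * eff for n, (eff, _) in studies.items()) / total_w
se = math.sqrt(1.0 / total_w)  # standard error of the pooled estimate

print(f"pooled effect = {pooled:.3f} +/- {1.96 * se:.3f} (95% CI)")

# 'Adapting' the analysis, say for a different region, amounts to
# re-running the pooling over a user-chosen subset of studies.
subset = {k: v for k, v in studies.items() if k != "study_c"}
```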

Artificial intelligence is touching more and more research fields, from protein folding to weather forecasting and from medical diagnostics to science communication; the list seems to grow by the day. According to Nature, the percentage of papers in the Scopus database that mention artificial intelligence has risen from 2% to 8% over the past decade.

Automated processes for search, screening and data extraction could help decision-making. In solution scanning, artificial intelligence can generate a list of possible options. Take, for example, policies for reducing shoplifting. When prompted to list potential policy options, ChatGPT can identify topics such as employee training and store layout and design. Advisers can then collate and synthesize the relevant evidence in these areas. Rapid assessments will inevitably miss some options, although they will find others that conventional approaches would not. Which dimensions of credibility are most important might also differ, depending on the policy question and context.
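As a hedged illustration of solution scanning, the sketch below asks a chat model to enumerate policy options for reducing shoplifting. It assumes the OpenAI Python client (v1 interface) and an API key in the environment; the model name and the wording of the prompt are placeholders, not recommendations.

```python
# Minimal solution-scanning sketch, assuming the OpenAI v1 Python client
# (pip install openai) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "List distinct policy options for reducing shoplifting. "
    "For each option, give a one-line description and the kind of "
    "evidence that would show whether it works."
)

response = client.chat.completions.create(
    model="gpt-4",          # placeholder; any capable chat model
    temperature=0.2,        # low temperature for more repeatable lists
    messages=[{"role": "user", "content": PROMPT}],
)

# The model's list is a starting point for advisers, who still need to
# collate and synthesize the actual evidence behind each option.
print(response.choices[0].message.content)
```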

Science-advice professionals will need to be trained in AI-user skills, such as the best way to prompt LLMs to produce the required outputs. Even minor shifts in tone and context in a prompt can alter the probabilities used by the LLM to generate a response. Advisers also need to be trained to avoid inappropriate over-reliance on AI systems — such as when drafting advice on emerging topics for which information is needed rapidly. These might be areas in which LLMs perform poorly, because of a lack of relevant training data. Science advisers need a nuanced understanding of the risks.
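To make prompt sensitivity concrete, the sketch below compares the next-token probability distributions that a small open model assigns under two near-identical prompts, using the Hugging Face transformers library. GPT-2 stands in only because it is small and public; the effect it demonstrates is generic to LLMs.

```python
# Sketch: how small wording changes shift an LLM's output distribution.
# Assumes `pip install transformers torch`; gpt2 is a small stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def next_token_top5(prompt: str):
    """Return the five most probable next tokens for a prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]   # logits at the last position
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, 5)
    return [(tokenizer.decode(int(i)), round(p.item(), 3))
            for i, p in zip(top.indices, top.values)]

# Two prompts differing only in tone induce different distributions.
print(next_token_top5("The evidence on this policy is"))
print(next_token_top5("Frankly, the evidence on this policy is"))
```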

A POSTnote on COVID-19 vaccine research

Many academic journals use standardized formats for reporting study results, but there is great variation across disciplines. The mismatch is even greater in other sources of information, such as working papers from international agencies and project reports from non-governmental organizations. Such diversity in presentation makes it difficult to develop fully automated methods to identify specific findings and study criteria. For example, it is usually important to know over what period an effect was measured or how large the sample was, but this information can be buried in the text. Presenting research methodology and results in a more consistent manner could help; journals published by Cell Press, for instance, use a structured reporting format called STAR Methods.
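The fragility of automated extraction is easy to demonstrate. The sketch below pulls sample sizes out of free text with regular expressions; the patterns and example sentences are invented, and real reporting varies far more than this, which is precisely the difficulty.

```python
# Sketch: why extracting study details from free text is brittle.
# The patterns below catch common phrasings of sample size but miss many.
import re

SAMPLE_SIZE = re.compile(
    r"\b(?:n\s*=\s*(\d[\d,]*)"          # "n = 96"
    r"|(\d[\d,]*)\s+(?:participants|patients|children))\b",
    re.IGNORECASE,
)

sentences = [
    "We enrolled 1,204 participants over 24 months.",
    "The trial (n = 96) measured dietary change at 12 weeks.",
    "Forty-eight children completed the programme.",  # number as a word: missed
]

for s in sentences:
    m = SAMPLE_SIZE.search(s)
    n = next((g for g in m.groups() if g), None) if m else None
    print(f"{n or 'NOT FOUND':>10}  <-  {s}")
```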

Systematic reviews need to search across a number of databases to find relevant scientific literature, and the choice of database can affect the outcome. Requirements by governments to publish funded research as open access10,11 could make it easier to retrieve study results. For research topics that governments deem funding priorities, eliminating paywalls would enable the creation of evidence databases while remaining aligned with copyright laws.
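As a sketch of what searching an open scholarly index looks like, the snippet below queries the OpenAlex works endpoint, which is free to use. The search term is illustrative, and the field names reflect the OpenAlex schema as of this writing; a real review would query several databases and deduplicate the results.

```python
# Sketch: querying one open scholarly index (OpenAlex) for a review topic.
# Assumes `pip install requests`; schema details may change over time.
import requests

resp = requests.get(
    "https://api.openalex.org/works",
    params={"search": "healthy eating interventions children", "per-page": 5},
    timeout=30,
)
resp.raise_for_status()

for work in resp.json()["results"]:
    oa = work.get("open_access", {}).get("is_oa", False)
    print(f"{work.get('publication_year')}  oa={oa}  {work.get('display_name')}")

# A systematic review would repeat this across several databases and
# deduplicate, since no single index covers the whole literature.
```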

In one experiment, a publisher built a system that was constrained to cite only peer-reviewed research; it generated a policy paper on the subject of batteries. The resulting text was bland and pitched at a high level of generality compared with the original synthesis, far from the briefs that are needed. However, the system demonstrated some important design principles. For instance, forcing it to generate only text that refers to scientific sources ensured that the resulting advice credited the scientists who were cited.

Imagine that POST is commissioned by Parliament to summarize the latest research on COVID-19 vaccines. Instead of producing a single publication, POST could create documents tailored to individual politicians. For example, a politician might receive a version showing how people in their constituency contributed to the field of vaccine manufacturing, or one describing the rates of infection in their region.

Another dimension might be the level of scientific explanation of how vaccines work. Science-literate politicians could receive a specialized version, whereas those without a scientific background would receive a lay version. The level of technical detail might even be dialled up or down by readers themselves.

Source: AI tools as science policy advisers? The potential and the pitfalls
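One way readers might dial technical detail up or down is through a generation prompt parameterized by reading level. The sketch below uses an invented three-level scale; how a production briefing system would implement this is an open design question.

```python
# Sketch: a prompt template parameterized by the reader's chosen detail level.
# The three-level scale and its wording are invented for illustration.
LEVELS = {
    1: "Explain for a reader with no scientific background, avoiding jargon.",
    2: "Explain for a generally informed reader, defining key terms.",
    3: "Explain for a scientifically literate reader, keeping technical terms.",
}

def briefing_prompt(topic: str, detail: int) -> str:
    """Build a generation prompt tuned to the requested level of detail."""
    return (
        f"Summarize the current research on {topic}. "
        f"{LEVELS[detail]} Cite only the sources provided."
    )

print(briefing_prompt("how mRNA vaccines work", detail=1))
```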

How do LLMs communicate with the public? The role of artificial intelligence in teaching and learning: open, transparent systems should work in alignment with national and international policy

Different language models have distinct political leanings, on both the social and economic fronts. Models can pick up biases from the data on which they are trained, and these biases can affect how they perform on certain tasks. Other forms of bias concern race, religion, gender and more.

Such processes would be best conducted by institutions that have clear mechanisms in place to ensure robust governance, broad participation, public accountability and transparency. For example, national governments could build on current efforts, such as the US What Works Clearinghouse and the UK What Works Network. Alternatively, international bodies, such as the United Nations Educational, Scientific and Cultural Organization (UNESCO), could develop these tools in alignment with open-science goals. Care should be taken to seek international collaboration between countries of all income levels. It is key to ensure not just that these tools and the underlying scientific information are available to low-income countries, but also that rigorous, unbiased systems for evidence synthesis are developed consistently, in alignment with national and international policies and priorities.

Policy briefings often contain classified or otherwise sensitive information, such as the details of a defence acquisition or draft findings from a public-health study, which needs to remain private until cleared for public dissemination. If advisers use publicly available tools, such as ChatGPT, they risk disclosing restricted information — a concern that has already complicated AI-model deployment elsewhere in government and in the private sector (see go.nature.com/3rrhm67). Institutions need to establish clear guidelines about what documents and information can be fed into external LLMs, and might need to develop internal models that can run on secure servers.
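One concrete form such guidelines could take is a programmatic gate in front of any external API call. The filter below is hypothetical and deliberately simple; the sensitivity markers are invented for illustration and are no substitute for institutional policy.

```python
# Hypothetical pre-flight check before text is sent to an external LLM.
# Both the sensitivity markers and the policy are invented for illustration.
SENSITIVE_MARKERS = ("CLASSIFIED", "OFFICIAL-SENSITIVE", "DRAFT - NOT FOR RELEASE")

def cleared_for_external_llm(text: str) -> bool:
    """Return True only if no known sensitivity marker appears in the text."""
    upper = text.upper()
    return not any(marker in upper for marker in SENSITIVE_MARKERS)

document = "DRAFT - NOT FOR RELEASE: interim findings of the cohort study..."

if cleared_for_external_llm(document):
    pass  # safe to call the external API here
else:
    print("Blocked: route this document to the internal, on-premises model.")
```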

Another factor that survey respondents commented on is the dominant role that corporations are playing in the development of AI. Companies contribute greatly to the technology, but the scale of their ownership of AI, in terms of both the technology itself and the human data needed to power it, is greater than in the past. Researchers need access to data, code and metadata, and producers of black-box systems must recognize the necessity of making these available for research if claims about AI are to pass verification and reproducibility tests. Regulators are still playing catch-up with the rapid development of artificial intelligence.

Artificial intelligence is also being used in science education around the world. Students at schools and universities regularly use LLM tools to answer questions, and teachers are starting to recognize that curricula and methods of pedagogy will need to change to take this into account.

Artificial intelligence itself has been changing as well. A renaissance in machine-learning tools pre-trained on huge volumes of scientific data has brought about a new age of generative artificial intelligence.
