How Artificial Intelligence is Changing Science
A debacle at OpenAI has highlighted concerns that commercial forces are acting against the responsible development of AI. Sam Altman, co-founder and chief executive of the company, was fired on 17 November, only to be reinstated five days later after staff revolted. “The push to retain dominance is leading to toxic competition,” says Sarah Myers West at the AI Now Institute. She is among those who worry that products are appearing before anyone fully understands their behaviour, uses and misuses, and she argues that it is time to start enforcing existing laws.
Earlier this year, an image of the pope wearing a huge puffer jacket went viral — and many people didn’t realize it was generated using AI. Scientists are now working on ways to stop deceptive deepfakes. This story is part of a collection on how artificial intelligence is changing science. (Illustration by Señor Salme)
Science is producing data on an enormous scale, and advances in artificial intelligence (AI) are increasingly needed to make sense of all this information (see Nature Rev. Phys. 4, 353). For example, through training on copious quantities of data, machine-learning (ML) methods get better at finding patterns without being explicitly programmed to do so.
Daily Briefing: a smarter 3D inkjet printer; showing respect to AI; and data centres’ huge water footprint.
3D printers can create complex designs in one go, such as a robotic hand with soft plastic muscles and rigid plastic bones, but it is usually hard to combine different materials in a single print run. This inkjet-type printer builds 3D structures by spraying layer after layer of material, keeping an electronic eye on any accidental lumps or bumps and compensating for them in the next layer. This removes the need for messy mechanical smoothing, which usually limits the materials that can be used.
The Japanese habit of showing respect to non-living things, including AI, demonstrates one path towards a more balanced relationship with our gadgets — and is not a sign of unhealthy anthropomorphism, argues AI ethicist and anthropologist Shoko Suzuki. (Nature Human Behaviour | 5 min read)
‘Thirsty’ computing hubs could put pressure on already stretched water resources in sub-Saharan Africa and other regions where drinking water is scarce. Data centres that power AI technologies have a huge ‘water footprint’: they need water for cooling and contribute to power plants’ water usage through their vast electricity consumption. Yet water scarcity is rarely considered when deciding where to build data centres, says computer scientist Mohammad Atiqul Islam, co-author of a study of the problem. “Typically, companies care more about performance and cost.”
AI fakes clinical-trial data; AI ‘brainstorming’ cracks tricky chess puzzles
The large language model GPT-4 can be coaxed into producing fake clinical-trial data to support an unverified scientific claim. The AI-generated data compared the outcomes of two surgical treatments for an eye condition and suggested that one procedure is better than the other, although in genuine trials both lead to similar outcomes. The fabricated data don’t hold up to close scrutiny by experts in authenticity, but “to an untrained eye, this certainly looks like a real data set”, says biostatistician Jack Wilkinson.
Several AI agents can work together to solve chess puzzles that tend to stump computers. Researchers tried weaving together up to ten versions of the chess AI AlphaZero, each trained for a different strategy. An automated system decides which agent has the best chance of succeeding on a given puzzle. Thanks to this artificial brainstorming session, the combined system solved more chess puzzles than any individual version could.
Scholarly organizations such as professional societies, funding agencies, publishers and universities have the necessary leverage to promote progress. Publishers should make sure that AI and ML ethics principles are upheld through the peer-review process and in publications. Ideally, common standards and expectations for authors, editors and reviewers should be adopted across publishers and codified in existing ethical guidance (such as through the Council of Science Editors).
The Earth, space and environmental sciences offer views of the planet, its life and its history, at all scales. In these fields, artificial intelligence is being used ever more widely: for forecasting weather, modelling climate, assessing damage during disasters, managing energy and water, and aiding disaster response.
Yet, despite its power, AI also comes with risks. Misapplication by researchers who are unfamiliar with the details, the use of poorly trained models or badly designed input data sets can lead to unreliable results and even cause harm. For example, if reports of weather events — such as tornadoes — are used to build a predictive tool, the training data are likely to be biased towards heavily populated regions, where more events are observed and reported. A model trained on such data would underestimate tornado risk in sparsely populated areas, which could lead to unsuitable responses.
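The reporting-bias problem above can be made concrete with a small simulation. This is an illustrative sketch only: the tornado rate and the reporting probabilities are invented numbers, not from any real data set, but they show how a model trained on *reports* rather than *occurrences* would understate the hazard in sparsely populated areas.

```python
import random

random.seed(0)

TRUE_RATE = 0.05  # assumed identical true tornado rate per cell, urban and rural
REPORT_PROB = {"urban": 0.9, "rural": 0.3}  # hypothetical reporting probabilities

def simulate_reports(region, n_cells=10_000):
    """Count tornadoes that actually occur versus those that get reported."""
    occurred = sum(random.random() < TRUE_RATE for _ in range(n_cells))
    reported = sum(random.random() < REPORT_PROB[region] for _ in range(occurred))
    return occurred, reported

for region in ("urban", "rural"):
    occurred, reported = simulate_reports(region)
    # A model fitted to reports would estimate the 'rate' as reported / n_cells,
    # even though the true rate is the same in both regions.
    print(f"{region}: occurred={occurred}, reported={reported}, "
          f"apparent rate={reported / 10_000:.3f}")
```

Even though both regions have the same true rate, the apparent rural rate comes out far lower, which is exactly the bias that would propagate into a predictive tool.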
- Risk. There are dangers and biases to consider and manage when working with data sets and algorithms.
Gaps and biases: environmental, health and social-science data are uneven
More detailed recommendations are available in the community report6 facilitated by the American Geophysical Union, and are organized into modules for ease of distribution, use in teaching and continued improvement.
Some regions and communities have better coverage of environmental data than others. Areas that are often under cloud cover, such as tropical rainforests, or that have fewer in situ sensors or less satellite coverage, such as the polar regions, will be less well represented. Similar gaps exist in health and social-science data.
Even good-quality data sets are often biased towards wealthier areas and against vulnerable or marginalized communities. For example, AI-based models are worse at diagnosing skin diseases in Black people, because the models are trained largely on data collected from white people.
Such problems can be exacerbated when data sources are combined — as is often required to provide actionable advice to the public, businesses and policymakers. Assessing the impact of air pollution, for instance, depends both on environmental data and on economic, health or social-science data.
Unintended harmful outcomes can occur when confidential information is revealed, such as the location of protected resources or endangered species. The growing diversity of data sets in use also increases the risk that data are attacked without researchers being aware; AI and ML models can be manipulated in many ways that are difficult to detect. Noise or interference can be added, inadvertently or deliberately, to public data sets of images or other content, altering a model’s outputs and the conclusions that can be drawn. Furthermore, outcomes from one AI or ML model can serve as inputs to another, which multiplies their value but also multiplies the risks through error propagation.
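How a small, hard-to-spot perturbation can flip a model’s conclusion is easy to sketch. The classifier, image values and threshold below are all invented for illustration; real attacks target far more complex models, but the mechanism is the same.

```python
def classify(pixel_values, threshold=0.5):
    """Toy classifier: flags an image if its mean intensity crosses a threshold."""
    score = sum(pixel_values) / len(pixel_values)
    return "flagged" if score > threshold else "clear"

clean = [0.49] * 100             # hypothetical image, just below the threshold
noise = [0.02] * 100             # tiny perturbation, hard to spot on inspection
attacked = [p + n for p, n in zip(clean, noise)]

print(classify(clean))     # clear
print(classify(attacked))  # flagged
```

A 2% shift in every pixel — far below what a human reviewer would notice — is enough to change the output, and any downstream model consuming that output inherits the error.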
Researchers should share data and code, perform further testing, address risks and biases in all approaches, and report uncertainties. All of this necessitates a more expansive description of methods than is currently standard in reports of AI-enabled studies.
Researchers and developers are working on such approaches, through techniques known as explainable AI (XAI) that aim to make the behaviour of AI systems more intelligible to users. In short-term weather forecasting, for example, the ability to analyse huge volumes of remote-sensing data can improve the forecasting of severe weather. Clear explanations of how outputs were reached are crucial to enable humans to assess the validity and usefulness of the forecasts, and to decide whether to alert the public or use the output in other AI models to predict the likelihood and extent of fires or floods2.
XAI attempts to quantify or visualize which input data contributed most, or least, to the model’s outputs in any given task. Researchers should examine these explanations and ensure that they are reasonable.
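One common XAI technique of this kind is permutation feature importance: shuffle one input feature at a time and measure how much the model’s outputs change. The toy model and feature names below are assumptions for illustration, not any real forecasting system.

```python
import random

random.seed(1)

def model(humidity, wind_shear, pressure):
    """Toy 'severe weather' score that in fact ignores pressure entirely."""
    return 0.7 * humidity + 0.3 * wind_shear + 0.0 * pressure

# Synthetic inputs: 200 samples of three features in [0, 1)
data = [[random.random() for _ in range(3)] for _ in range(200)]
baseline = [model(*row) for row in data]

def permutation_importance(feature_idx):
    """Shuffle one input column and return the mean change in model output."""
    shuffled = [row[:] for row in data]
    column = [row[feature_idx] for row in shuffled]
    random.shuffle(column)
    for row, value in zip(shuffled, column):
        row[feature_idx] = value
    perturbed = [model(*row) for row in shuffled]
    return sum(abs(a - b) for a, b in zip(baseline, perturbed)) / len(data)

for name, i in [("humidity", 0), ("wind_shear", 1), ("pressure", 2)]:
    print(f"{name}: {permutation_importance(i):.3f}")
```

The importance ranking recovers the model’s true dependence (humidity above wind shear, pressure at zero). This is the kind of sanity check researchers can apply: if an importance ranking contradicts domain knowledge, the model or its training data deserve scrutiny.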
Research teams should include specialists in each type of data used, as well as members of communities who can be involved in providing data or who might be affected by research outcomes. One example is an AI-based project that combined Traditional Knowledge from Indigenous people in Canada with data collected using non-Indigenous approaches to identify areas that were best suited to aquaculture (see go.nature.com/46yqmdr).
We also urge funders to require that researchers use suitable repositories as part of their data-sharing and management plans. Institutions should support and partner with such repositories, rather than expanding their own generalist ones.
This trend is evident from data for papers published in all journals of the AGU5, which implemented deposition policies in 2019 and started enforcing them in 2020. Most publication-related data have been deposited in two generalist repositories: Zenodo and Figshare. (Figshare is owned by Digital Science, which is part of Holtzbrinck, the majority shareholder in Nature’s publisher, Springer Nature.) Many institutions maintain their own generalist repositories, again often without discipline-specific, community-vetted curation practices.
Discipline-specific repositories, among a few others, provide this curation service, but processing by trained staff can take several weeks. It is therefore important to plan for data deposition well before a journal accepts the paper.
Sustained financial investments from funders, governments and institutions — that do not detract from research funds — are needed to keep suitable repositories running, and even just to comply with new mandates16.