DeepMind learns the art of coding

June 7, 2023

DeepMind: Reinforcement Learning Built for Chess and Go Learns to Write Faster Sorting Code

Reinforcement learning is a technique that DeepMind uses to help its artificial intelligence improve at chess and Go. This type of AI learns by doing. It treats a given task like a game, in which rewards are earned for smart moves, such as those that increase a program's efficiency. The system works to maximize the reward, and the result can be a Go strategy or a quicker assembly program. This differs from large language models like GPT-4, which learn to write words or code from large amounts of existing text. That training makes it easier for them to produce writing that mirrors the internet or to produce common segments of code. But they are not so good at producing novel, state-of-the-art solutions to coding challenges the AI has never seen before.

The system, called AlphaDev, was rewarded for producing programs that were fast, not only correct. Mankowitz's team trained it to evaluate speed on the basis of either the total number of instructions or the processing time. Depending on the processor used and the number of values to be sorted, AlphaDev's best algorithms took between 4% and 71% less time than human-written algorithms. But when the algorithms were called multiple times to sort lists of a quarter of a million values, the cumulative time saving was only 1–2%, because of other code that AlphaDev did not optimize.
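To make the idea of rewarding both correctness and speed concrete, here is a minimal toy sketch in Python. It is not AlphaDev's actual reward or program representation: the names `run_program` and `reward`, the `length_penalty` value, and the choice to model a program as a list of compare-and-swap steps are all assumptions made for illustration. Speed is approximated by the number of steps, loosely mirroring the instruction-count measure described above.

```python
from itertools import permutations


def run_program(program, values):
    """Apply a list of compare-and-swap steps (i, j) to a copy of values."""
    data = list(values)
    for i, j in program:
        if data[i] > data[j]:
            data[i], data[j] = data[j], data[i]
    return data


def reward(program, n=3, length_penalty=0.01):
    """Toy reward (hypothetical): fraction of inputs sorted correctly,
    minus a small cost for every step in the program."""
    tests = list(permutations(range(n)))
    correct = sum(run_program(program, t) == sorted(t) for t in tests)
    return correct / len(tests) - length_penalty * len(program)


minimal = [(0, 1), (1, 2), (0, 1)]   # sorts any 3 values in 3 steps
redundant = minimal + [(1, 2)]       # still correct, but one wasted step
broken = [(0, 1), (1, 2)]            # too short to sort every input

for name, prog in [("minimal", minimal), ("redundant", redundant), ("broken", broken)]:
    print(name, round(reward(prog), 4))
```

Under this scoring, the shortest correct program earns the highest reward, the redundant one scores slightly lower, and the incorrect one scores lowest. A learning system that maximizes such a reward is pushed toward programs that are both correct and short.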

To see where AlphaDev eked out its gains, the team took a closer look at its algorithms. They found two new tactics for sorting, which they called the AlphaDev swap move and the AlphaDev copy move. Mankowitz compares them to a move that AlphaGo, a predecessor of AlphaZero, made against the human Go champion Lee Sedol in an exhibition match. He says that the move influenced how people thought about strategy, and that it was fundamental to winning the game.

In the future, the team at DeepMind would like to work on more problems, even the design of hardware. "We want to tackle the whole stack," the team says.
