OpenAI – Page 10 – Ai Info365

Emergent Bartering Behaviour in Multi-Agent Reinforcement Learning

OpenAI | January 10, 2024 | 215 Views | 0 Likes | 0 Comments

In our recent paper, we explore how populations of deep reinforcement learning (deep RL) agents can learn microeconomic behaviours, such as production, consumption, and trading of goods. We find that artificial agents learn to make economically rational decisions about production, consumption, and prices, and react appropriately to supply and demand changes. The population converges to…

Active offline policy selection – Google DeepMind

OpenAI | January 10, 2024 | 223 Views | 0 Likes | 0 Comments

Reinforcement learning (RL) has made tremendous progress in recent years towards addressing real-life problems – and offline RL made it even more practical. Instead of direct interactions with the environment, we can now train many algorithms from a single pre-recorded dataset. However, we lose the practical advantages in data-efficiency of offline RL when we evaluate…

Building a culture of pioneering responsibly

OpenAI | January 10, 2024 | 187 Views | 0 Likes | 0 Comments

How to ensure we benefit society with the most impactful technology being developed today. As chief operating officer of one of the world’s leading artificial intelligence labs, I spend a lot of time thinking about how our technologies impact people’s lives – and how we can ensure that our efforts have a positive outcome. This…

From LEGO competitions to DeepMind’s robotics lab

OpenAI | January 10, 2024 | 218 Views | 0 Likes | 0 Comments

Today’s post is all about Akhil Raju, a software engineer on the robotics team. We originally met Akhil in season two of DeepMind: The Podcast, but we wanted to get to know him better and hear more about his path to DeepMind. What sparked your curiosity in artificial intelligence (AI)? When I was young, I…

A Generalist Agent – Google DeepMind

OpenAI | January 10, 2024 | 208 Views | 0 Likes | 0 Comments

Inspired by progress in large-scale language modelling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks…

On the Expressivity of Markov Reward

OpenAI | January 9, 2024 | 187 Views | 0 Likes | 0 Comments

Reward is the driving force for reinforcement learning (RL) agents. Given its central role in RL, reward is often assumed to be suitably general in its expressivity, as summarized by Sutton and Littman’s reward hypothesis. In our work, we take first steps toward a systematic study of this hypothesis. To do so, we…

Improving language models by retrieving from trillions of tokens

OpenAI | January 9, 2024 | 191 Views | 0 Likes | 0 Comments

In recent years, significant performance gains in autoregressive language modeling have been achieved by increasing the number of parameters in Transformer models. This has led to a tremendous increase in training energy cost and resulted in a generation of dense “Large Language Models” (LLMs) with 100+ billion parameters. Simultaneously, large datasets containing trillions of words…

Language modelling at scale: Gopher, ethical considerations, and retrieval

OpenAI | January 8, 2024 | 192 Views | 0 Likes | 0 Comments

Responsibility & Safety Published …

Creating Interactive Agents with Imitation Learning

OpenAI | January 8, 2024 | 173 Views | 0 Likes | 0 Comments

Research Published …

Simulating matter on the quantum scale with AI

OpenAI | January 7, 2024 | 171 Views | 0 Likes | 0 Comments

Solving some of the major challenges of the 21st Century, such as producing clean electricity or developing high temperature superconductors, will require us to design new materials with specific properties. To do this on a computer requires the simulation of electrons, the subatomic particles that govern how atoms bond to form molecules and are also…