Multi-Agent Reinforcement Learning
Foundations and Modern Approaches
by Albrecht, Christianos, Schäfer
ISBN: 9780262380515 | Copyright 2024
The first comprehensive introduction to Multi-Agent Reinforcement Learning (MARL), covering MARL's models, solution concepts, algorithmic ideas, technical challenges, and modern approaches.
Multi-Agent Reinforcement Learning (MARL), an area of machine learning in which multiple agents learn to interact optimally in a shared environment, boasts a growing array of applications in modern life, from autonomous driving and multi-robot factories to automated trading and energy network management. This text provides a lucid and rigorous introduction to the models, solution concepts, algorithmic ideas, technical challenges, and modern approaches in MARL.
The book first introduces the field's foundations, including basics of reinforcement learning theory and algorithms, interactive game models, different solution concepts for games, and the algorithmic ideas underpinning MARL research. It then details contemporary MARL algorithms that leverage deep learning techniques, covering ideas such as centralized training with decentralized execution, value decomposition, parameter sharing, and self-play. The book comes with its own MARL codebase written in Python, containing implementations of MARL algorithms that are self-contained and easy to read. Technical content is explained in easy-to-understand language and illustrated with extensive examples, illuminating MARL for newcomers while offering high-level insights for more advanced readers.
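To give a flavor of the kind of algorithm the book covers (independent learning, treated in the foundations and again in Section 9.3), here is a minimal sketch of two tabular independent Q-learners in a repeated 2x2 coordination game. This sketch is not taken from the book's accompanying codebase; the game, hyperparameters, and helper names are hypothetical and chosen only for illustration.

    # Minimal illustrative sketch (not from the book's codebase):
    # two independent Q-learners in a repeated 2x2 coordination game.
    import random

    ACTIONS = [0, 1]
    ALPHA, EPSILON, EPISODES = 0.1, 0.1, 5000

    def reward(a1, a2):
        # Hypothetical common-reward game: 1 if the agents coordinate, else 0.
        return 1.0 if a1 == a2 else 0.0

    # Each agent keeps its own Q-table over its own actions (independent learning).
    q1 = {a: 0.0 for a in ACTIONS}
    q2 = {a: 0.0 for a in ACTIONS}

    def epsilon_greedy(q):
        # Explore with probability EPSILON, otherwise act greedily.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(q, key=q.get)

    for _ in range(EPISODES):
        a1, a2 = epsilon_greedy(q1), epsilon_greedy(q2)
        r = reward(a1, a2)
        # Stateless (repeated normal-form) game: the Q-update has no next-state term.
        q1[a1] += ALPHA * (r - q1[a1])
        q2[a2] += ALPHA * (r - q2[a2])

    print("Agent 1 Q-values:", q1)
    print("Agent 2 Q-values:", q2)

Because each agent ignores the other and simply tracks the average reward of its own actions, this sketch also hints at the non-stationarity challenge the book discusses: each learner's environment changes as the other learner adapts.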
· First textbook to introduce the foundations and applications of MARL, written by experts in the field
· Integrates reinforcement learning, deep learning, and game theory
· Practical focus covers considerations for running experiments and describes environments for testing MARL algorithms
· Explains complex concepts in clear and simple language
· Classroom-tested, accessible approach suitable for graduate students and professionals across computer science, artificial intelligence, and robotics
· Resources include code and slides
Cover (pg. Cover)
Contents (pg. vii)
Summary of Notation (pg. xiii)
List of Figures (pg. xvii)
Preface (pg. xxiii)
1 Introduction (pg. 1)
1.1 Multi-Agent Systems (pg. 2)
1.2 Multi-Agent Reinforcement Learning (pg. 6)
1.3 Application Examples (pg. 9)
1.3.1 Multi-Robot Warehouse Management (pg. 9)
1.3.2 Competitive Play in Board Games and Video Games (pg. 10)
1.3.3 Autonomous Driving (pg. 11)
1.3.4 Automated Trading in Electronic Markets (pg. 11)
1.4 Challenges of MARL (pg. 12)
1.5 Agendas of MARL (pg. 13)
1.6 Book Contents and Structure (pg. 15)
I Foundations of Multi-Agent Reinforcement Learning (pg. 17)
2 Reinforcement Learning (pg. 19)
2.1 General Definition (pg. 20)
2.2 Markov Decision Processes (pg. 22)
2.3 Expected Discounted Returns and Optimal Policies (pg. 24)
2.4 Value Functions and Bellman Equation (pg. 26)
2.5 Dynamic Programming (pg. 29)
2.6 Temporal-Difference Learning (pg. 32)
2.7 Evaluation with Learning Curves (pg. 36)
2.8 Equivalence of R(s,a,s') and R(s,a) (pg. 39)
2.9 Summary (pg. 40)
3 Games: Models of Multi-Agent Interaction (pg. 43)
3.1 Normal-Form Games (pg. 44)
3.2 Repeated Normal-Form Games (pg. 46)
3.3 Stochastic Games (pg. 47)
3.4 Partially Observable Stochastic Games (pg. 49)
3.5 Modeling Communication (pg. 55)
3.6 Knowledge Assumptions in Games (pg. 56)
3.7 Dictionary: Reinforcement Learning ↔ Game Theory (pg. 58)
3.8 Summary (pg. 58)
4 Solution Concepts for Games (pg. 61)
4.1 Joint Policy and Expected Return (pg. 62)
4.2 Best Response (pg. 65)
4.3 Minimax (pg. 65)
4.4 Nash Equilibrium (pg. 68)
4.5 ϵ-Nash Equilibrium (pg. 70)
4.6 (Coarse) Correlated Equilibrium (pg. 71)
4.7 Conceptual Limitations of Equilibrium Solutions (pg. 75)
4.8 Pareto Optimality (pg. 76)
4.9 Social Welfare and Fairness (pg. 78)
4.10 No-Regret (pg. 81)
4.11 The Complexity of Computing Equilibria (pg. 83)
4.12 Summary (pg. 87)
5 Multi-Agent Reinforcement Learning in Games: First Steps and Challenges (pg. 89)
5.1 General Learning Process (pg. 90)
5.2 Convergence Types (pg. 92)
5.3 Single-Agent RL Reductions (pg. 95)
5.4 Challenges of MARL (pg. 101)
5.5 What Algorithms Do Agents Use? (pg. 109)
5.6 Summary (pg. 112)
6 Multi-Agent Reinforcement Learning: Foundational Algorithms (pg. 115)
6.1 Dynamic Programming for Games: Value Iteration (pg. 116)
6.2 Temporal-Difference Learning for Games: Joint-Action Learning (pg. 118)
6.3 Agent Modeling (pg. 127)
6.4 Policy-Based Learning (pg. 140)
6.5 No-Regret Learning (pg. 151)
6.6 Summary (pg. 156)
II Multi-Agent Deep Reinforcement Learning: Algorithms and Practice (pg. 159)
7 Deep Learning (pg. 161)
7.1 Function Approximation for Reinforcement Learning (pg. 161)
7.2 Linear Function Approximation (pg. 163)
7.3 Feedforward Neural Networks (pg. 165)
7.4 Gradient-Based Optimization (pg. 169)
7.5 Convolutional and Recurrent Neural Networks (pg. 175)
7.6 Summary (pg. 180)
8 Deep Reinforcement Learning (pg. 183)
8.1 Deep Value Function Approximation (pg. 184)
8.2 Policy Gradient Algorithms (pg. 195)
8.3 Observations, States, and Histories in Practice (pg. 215)
8.4 Summary (pg. 216)
9 Multi-Agent Deep Reinforcement Learning (pg. 219)
9.1 Training and Execution Modes (pg. 220)
9.2 Notation for Multi-Agent Deep Reinforcement Learning (pg. 222)
9.3 Independent Learning (pg. 223)
9.4 Multi-Agent Policy Gradient Algorithms (pg. 230)
9.5 Value Decomposition in Common-Reward Games (pg. 242)
9.6 Agent Modeling with Neural Networks (pg. 266)
9.7 Environments with Homogeneous Agents (pg. 274)
9.8 Policy Self-Play in Zero-Sum Games (pg. 281)
9.9 Population-Based Training (pg. 290)
9.10 Summary (pg. 301)
10 Multi-Agent Deep Reinforcement Learning in Practice (pg. 305)
10.1 The Agent-Environment Interface (pg. 305)
10.2 MARL Neural Networks in PyTorch (pg. 307)
10.3 Centralized Value Functions (pg. 312)
10.4 Value Decomposition (pg. 313)
10.5 Practical Tips for MARL Algorithms (pg. 313)
10.6 Presentation of Experimental Results (pg. 316)
11 Multi-Agent Environments (pg. 319)
11.1 Criteria for Choosing Environments (pg. 320)
11.2 Structurally Distinct 2×2 Matrix Games (pg. 321)
11.3 Complex Environments (pg. 323)
11.4 Environment Collections (pg. 332)
A Surveys on Multi-Agent Reinforcement Learning (pg. 337)
References (pg. 341)
Index (pg. 363)
Stefano V. Albrecht
Filippos Christianos
Lukas Schäfer