Multi-Agent Reinforcement Learning
Foundations and Modern Approaches
by Albrecht, Christianos, Schäfer
ISBN: 9780262380515 | Copyright 2024
The first comprehensive introduction to Multi-Agent Reinforcement Learning (MARL), covering MARL's models, solution concepts, algorithmic ideas, technical challenges, and modern approaches.
Multi-Agent Reinforcement Learning (MARL), an area of machine learning in which multiple agents learn to interact optimally in a shared environment, boasts a growing array of applications in modern life, from autonomous driving and multi-robot factories to automated trading and energy network management. This text provides a lucid and rigorous introduction to the models, solution concepts, algorithmic ideas, technical challenges, and modern approaches in MARL.
The book first introduces the field's foundations, including basics of reinforcement learning theory and algorithms, interactive game models, different solution concepts for games, and the algorithmic ideas underpinning MARL research. It then details contemporary MARL algorithms that leverage deep learning techniques, covering ideas such as centralized training with decentralized execution, value decomposition, parameter sharing, and self-play. The book comes with its own MARL codebase written in Python, containing implementations of MARL algorithms that are self-contained and easy to read. Technical content is explained in easy-to-understand language and illustrated with extensive examples, illuminating MARL for newcomers while offering high-level insights for more advanced readers.
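To give a flavor of the kind of algorithm the book covers (independent learning, treated in the foundations and again in Section 9.3), here is a minimal sketch of two tabular independent Q-learners in a repeated 2x2 coordination game. This sketch is not taken from the book's accompanying codebase; the game, hyperparameters, and helper names are hypothetical and chosen only for illustration.

    # Minimal illustrative sketch (not from the book's codebase):
    # two independent Q-learners in a repeated 2x2 coordination game.
    import random

    ACTIONS = [0, 1]
    ALPHA, EPSILON, EPISODES = 0.1, 0.1, 5000

    def reward(a1, a2):
        # Hypothetical common-reward game: 1 if the agents coordinate, else 0.
        return 1.0 if a1 == a2 else 0.0

    # Each agent keeps its own Q-table over its own actions (independent learning).
    q1 = {a: 0.0 for a in ACTIONS}
    q2 = {a: 0.0 for a in ACTIONS}

    def epsilon_greedy(q):
        # Explore with probability EPSILON, otherwise act greedily.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(q, key=q.get)

    for _ in range(EPISODES):
        a1, a2 = epsilon_greedy(q1), epsilon_greedy(q2)
        r = reward(a1, a2)
        # Stateless (repeated normal-form) game: the Q-update has no next-state term.
        q1[a1] += ALPHA * (r - q1[a1])
        q2[a2] += ALPHA * (r - q2[a2])

    print("Agent 1 Q-values:", q1)
    print("Agent 2 Q-values:", q2)

Because each agent ignores the other and simply tracks the average reward of its own actions, this sketch also hints at the non-stationarity challenge the book discusses: each learner's environment changes as the other learner adapts.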
· First textbook to introduce the foundations and applications of MARL, written by experts in the field
· Integrates reinforcement learning, deep learning, and game theory
· Practical focus covers considerations for running experiments and describes environments for testing MARL algorithms
· Explains complex concepts in clear and simple language
· Classroom-tested, accessible approach suitable for graduate students and professionals across computer science, artificial intelligence, and robotics
· Resources include code and slides
Cover (pg. Cover)
Contents (pg. vii)
Summary of Notation (pg. xiii)
List of Figures (pg. xvii)
Preface (pg. xxiii)
1 Introduction (pg. 1)
1.1 Multi-Agent Systems (pg. 2)
1.2 Multi-Agent Reinforcement Learning (pg. 6)
1.3 Application Examples (pg. 9)
1.3.1 Multi-Robot Warehouse Management (pg. 9)
1.3.2 Competitive Play in Board Games and Video Games (pg. 10)
1.3.3 Autonomous Driving (pg. 11)
1.3.4 Automated Trading in Electronic Markets (pg. 11)
1.4 Challenges of MARL (pg. 12)
1.5 Agendas of MARL (pg. 13)
1.6 Book Contents and Structure (pg. 15)
I Foundations of Multi-Agent Reinforcement Learning (pg. 17)
2 Reinforcement Learning (pg. 19)
2.1 General Definition (pg. 20)
2.2 Markov Decision Processes (pg. 22)
2.3 Expected Discounted Returns and Optimal Policies (pg. 24)
2.4 Value Functions and Bellman Equation (pg. 26)
2.5 Dynamic Programming (pg. 29)
2.6 Temporal-Difference Learning (pg. 32)
2.7 Evaluation with Learning Curves (pg. 36)
2.8 Equivalence of R(s,a,s') and R(s,a) (pg. 39)
2.9 Summary (pg. 40)
3 Games: Models of Multi-Agent Interaction (pg. 43)
3.1 Normal-Form Games (pg. 44)
3.2 Repeated Normal-Form Games (pg. 46)
3.3 Stochastic Games (pg. 47)
3.4 Partially Observable Stochastic Games (pg. 49)
3.5 Modeling Communication (pg. 55)
3.6 Knowledge Assumptions in Games (pg. 56)
3.7 Dictionary: Reinforcement Learning ↔ Game Theory (pg. 58)
3.8 Summary (pg. 58)
4 Solution Concepts for Games (pg. 61)
4.1 Joint Policy and Expected Return (pg. 62)
4.2 Best Response (pg. 65)
4.3 Minimax (pg. 65)
4.4 Nash Equilibrium (pg. 68)
4.5 ϵ-Nash Equilibrium (pg. 70)
4.6 (Coarse) Correlated Equilibrium (pg. 71)
4.7 Conceptual Limitations of Equilibrium Solutions (pg. 75)
4.8 Pareto Optimality (pg. 76)
4.9 Social Welfare and Fairness (pg. 78)
4.10 No-Regret (pg. 81)
4.11 The Complexity of Computing Equilibria (pg. 83)
4.12 Summary (pg. 87)
5 Multi-Agent Reinforcement Learning in Games: First Steps and Challenges (pg. 89)
5.1 General Learning Process (pg. 90)
5.2 Convergence Types (pg. 92)
5.3 Single-Agent RL Reductions (pg. 95)
5.4 Challenges of MARL (pg. 101)
5.5 What Algorithms Do Agents Use? (pg. 109)
5.6 Summary (pg. 112)
6 Multi-Agent Reinforcement Learning: Foundational Algorithms (pg. 115)
6.1 Dynamic Programming for Games: Value Iteration (pg. 116)
6.2 Temporal-Difference Learning for Games: Joint-Action Learning (pg. 118)
6.3 Agent Modeling (pg. 127)
6.4 Policy-Based Learning (pg. 140)
6.5 No-Regret Learning (pg. 151)
6.6 Summary (pg. 156)
II Multi-Agent Deep Reinforcement Learning: Algorithms and Practice (pg. 159)
7 Deep Learning (pg. 161)
7.1 Function Approximation for Reinforcement Learning (pg. 161)
7.2 Linear Function Approximation (pg. 163)
7.3 Feedforward Neural Networks (pg. 165)
7.4 Gradient-Based Optimization (pg. 169)
7.5 Convolutional and Recurrent Neural Networks (pg. 175)
7.6 Summary (pg. 180)
8 Deep Reinforcement Learning (pg. 183)
8.1 Deep Value Function Approximation (pg. 184)
8.2 Policy Gradient Algorithms (pg. 195)
8.3 Observations, States, and Histories in Practice (pg. 215)
8.4 Summary (pg. 216)
9 Multi-Agent Deep Reinforcement Learning (pg. 219)
9.1 Training and Execution Modes (pg. 220)
9.2 Notation for Multi-Agent Deep Reinforcement Learning (pg. 222)
9.3 Independent Learning (pg. 223)
9.4 Multi-Agent Policy Gradient Algorithms (pg. 230)
9.5 Value Decomposition in Common-Reward Games (pg. 242)
9.6 Agent Modeling with Neural Networks (pg. 266)
9.7 Environments with Homogeneous Agents (pg. 274)
9.8 Policy Self-Play in Zero-Sum Games (pg. 281)
9.9 Population-Based Training (pg. 290)
9.10 Summary (pg. 301)
10 Multi-Agent Deep Reinforcement Learning in Practice (pg. 305)
10.1 The Agent-Environment Interface (pg. 305)
10.2 MARL Neural Networks in PyTorch (pg. 307)
10.3 Centralized Value Functions (pg. 312)
10.4 Value Decomposition (pg. 313)
10.5 Practical Tips for MARL Algorithms (pg. 313)
10.6 Presentation of Experimental Results (pg. 316)
11 Multi-Agent Environments (pg. 319)
11.1 Criteria for Choosing Environments (pg. 320)
11.2 Structurally Distinct 2×2 Matrix Games (pg. 321)
11.3 Complex Environments (pg. 323)
11.4 Environment Collections (pg. 332)
A Surveys on Multi-Agent Reinforcement Learning (pg. 337)
References (pg. 341)
Index (pg. 363)
Stefano V. Albrecht
Filippos Christianos
Lukas Schäfer