Decision Making Under Uncertainty

Theory and Application

by Kochenderfer, Amato, Chowdhary, How, Reynolds, Thornton, Torres-Carrasquillo, Üre, Vian

ISBN: 9780262331708 | Copyright 2015

Click here to preview

Instructor Requests

Digital Exam/Desk Copy Print Desk Copy Ancillaries
Tabs

Many important problems involve decision making under uncertainty—that is, choosing actions based on often imperfect observations, with unknown outcomes. Designers of automated decision support systems must take into account the various sources of uncertainty while balancing the multiple objectives of the system. This book provides an introduction to the challenges of decision making under uncertainty from a computational perspective. It presents both the theory behind decision making models and algorithms and a collection of example applications that range from speech recognition to aircraft collision avoidance.

Focusing on two methods for designing decision agents, planning and reinforcement learning, the book covers probabilistic models, introducing Bayesian networks as a graphical model that captures probabilistic relationships between variables; utility theory as a framework for understanding optimal decision making under uncertainty; Markov decision processes as a method for modeling sequential problems; model uncertainty; state uncertainty; and cooperative decision making involving multiple interacting agents. A series of applications shows how the theoretical concepts can be applied to systems for attribute-based person search, speech applications, collision avoidance, and unmanned aircraft persistent surveillance.

Decision Making Under Uncertainty unifies research from different communities using consistent notation, and is accessible to students and researchers across engineering disciplines who have some prior exposure to probability theory and calculus. It can be used as a text for advanced undergraduate and graduate students in fields including computer science, aerospace and electrical engineering, and management science. It will also be a valuable professional reference for researchers in a variety of disciplines.

This book is a tour de force for its systematic treatment of the latest advances in decision making and planning under uncertainty. The detailed discussion on modeling issues and computational efficiency within real-world applications makes it invaluable for students and practitioners alike.

David Hsu Professor of Computer Science, National University of Singapore

This book is a thorough and authoritative treatment of the mathematics of planning and reasoning under uncertainty. The real-life case studies that end the book help ground the theory with concrete examples that can serve as models for researchers developing new applications of these powerful ideas. It would make a terrific text for a semester-long course on the subject of algorithmic decision making.

Michael L. Littman Professor of Computer Science, Brown University

An intuitive and accessible introduction to the exciting topic of decision making under uncertainty--very timely given the latest advances in robotics and autonomous systems. Problems are framed in the probabilistic inference formulation and provide a modern take on the classical reinforcement learning paradigm under partial observability, with natural links to real-world applications.

Sethu Vijayakumar FRSE Professor of Robotics, University of Edinburgh
Expand/Collapse All
Contents (pg. vii)
Preface (pg. xix)
About the Authors (pg. xxi)
Acknowledgements (pg. xxv)
1 Introduction (pg. 1)
I. Theory (pg. 9)
2: Probabilistic Models (pg. 11)
3: Decision Problems (pg. 57)
4: Sequential Problems (pg. 77)
5: Model Uncertainty (pg. 113)
6: State Uncertainty (pg. 133)
7: Cooperative Decision Making (pg. 159)
II. Application (pg. 189)
8: Probabilistic Surveillance Video Search (pg. 191)
9: Dynamic Models for Speech Applications (pg. 229)
10: Optimized Airborne Collision Avoidance (pg. 249)
11: Multiagent Planning for Persistent Surveillance (pg. 277)
12: Integrating Automation with Humans (pg. 291)
Index (pg. 317)
Preface (pg. xix)
About the Authors (pg. xxi)
Acknowledgments (pg. xxv)
1 Introduction (pg. 1)
1.1 Decision Making (pg. 1)
1.2 Example Applications (pg. 2)
1.2.1 Trafic Alert and Collision Avoidance System (pg. 2)
1.2.2 Unmanned Aircraft Persistent Surveillance (pg. 3)
1.3 Methods for Designing Decision Agents (pg. 4)
1.3.1 Explicit Programming (pg. 4)
1.3.2 Supervised Learning (pg. 4)
1.3.3 Optimization (pg. 5)
1.3.4 Planning (pg. 5)
1.3.5 Reinforcement Learning (pg. 5)
1.4 Overview (pg. 5)
1.5 Further Reading (pg. 7)
References (pg. 7)
I THEORY (pg. 9)
2 Probabilistic Models (pg. 11)
2.1 Representation (pg. 11)
2.2 Inference (pg. 25)
2.3 Parameter Learning (pg. 40)
2.4 Structure Learning (pg. 46)
2.5 Summary (pg. 52)
2.6 Further Reading (pg. 54)
References (pg. 54)
3 Decision Problems (pg. 57)
3.1 Utility Theory (pg. 57)
3.2 Decision Networks (pg. 64)
3.3 Games (pg. 68)
3.4 Summary (pg. 72)
3.5 Further Reading (pg. 72)
References (pg. 74)
4 Sequential Problems (pg. 77)
4.1 Formulation (pg. 77)
4.2 Dynamic Programming (pg. 79)
4.3 Structured Representations (pg. 89)
4.4 Linear Representations (pg. 91)
4.5 Approximate Dynamic Programming (pg. 93)
4.6 Online Methods (pg. 99)
4.7 Direct Policy Search (pg. 103)
4.8 Summary (pg. 108)
4.9 Further Reading (pg. 108)
References (pg. 110)
5 Model Uncertainty (pg. 113)
5.1 Exploration and Exploitation (pg. 113)
5.2 Maximum Likelihood Model-Based Methods (pg. 116)
5.3 Bayesian Model-Based Methods (pg. 118)
5.4 Model-Free Methods (pg. 121)
5.5 Generalization (pg. 124)
5.6 Summary (pg. 129)
5.7 Further Reading (pg. 129)
References (pg. 130)
6 State Uncertainty (pg. 133)
6.1 Formulation (pg. 133)
6.2 Belief Updating (pg. 136)
6.3 Exact Solution Methods (pg. 140)
6.4 Offline Methods (pg. 144)
6.5 Online Methods (pg. 149)
6.6 Summary (pg. 155)
6.7 Further Reading (pg. 155)
References (pg. 156)
7 Cooperative Decision Making (pg. 159)
7.1 Formulation (pg. 159)
7.2 Properties (pg. 164)
7.3 Notable Subclasses (pg. 166)
7.4 Exact Solution Methods (pg. 170)
7.5 Approximate Solution Methods (pg. 177)
7.6 Communication (pg. 178)
7.7 Summary (pg. 180)
7.8 Further Reading (pg. 180)
References (pg. 182)
II APPLICATION (pg. 189)
8 Probabilistic Surveillance Video Search (pg. 191)
8.1 Attribute-Based Person Search (pg. 191)
8.2 Probabilistic Appearance Model (pg. 195)
8.3 Learning and Inference Techniques (pg. 206)
8.4 Performance (pg. 217)
8.5 Interactive Search Tool (pg. 223)
8.6 Summary (pg. 225)
References (pg. 227)
9 Dynamic Models for Speech Applications (pg. 229)
9.1 Modeling Speech Signals (pg. 229)
9.2 Speech Recognition (pg. 232)
9.3 Topic Identiication (pg. 235)
9.4 Language Recognition (pg. 236)
9.5 Speaker Identiication (pg. 238)
9.6 Machine Translation (pg. 242)
9.7 Summary (pg. 243)
References (pg. 243)
10 Optimized Airborne Collision Avoidance (pg. 249)
10.1 Airborne Collision Avoidance Systems (pg. 249)
10.2 Collision Avoidance Problem Formulation (pg. 253)
10.3 State Estimation (pg. 259)
10.4 Real-Time Execution (pg. 261)
10.5 Evaluation (pg. 265)
10.6 Summary (pg. 273)
References (pg. 274)
11 Multiagent Planning for Persistent Surveillance (pg. 277)
11.1 Mission Description (pg. 277)
11.2 Centralized Problem Formulation (pg. 278)
11.3 Decentralized Approximate Formulations (pg. 280)
11.4 Model Learning (pg. 282)
11.5 Flight Test (pg. 285)
11.6 Summary (pg. 286)
References (pg. 289)
12 Integrating Automation with Humans (pg. 291)
12.1 Human Capabilities and Coping (pg. 291)
12.2 Considering the Human in Design (pg. 296)
12.3 A Systems View of Implementation (pg. 308)
12.4 Summary (pg. 313)
References (pg. 314)
Index (pg. 317)
1.1 Decision Making (pg. 1)
1.2 Example Applications (pg. 2)
1.2.1 Trafic Alert and Collision Avoidance System (pg. 2)
1.2.2 Unmanned Aircraft Persistent Surveillance (pg. 3)
1.3 Methods for Designing Decision Agents (pg. 4)
1.3.1 Explicit Programming (pg. 4)
1.3.2 Supervised Learning (pg. 4)
1.3.3 Optimization (pg. 5)
1.3.4 Planning (pg. 5)
1.3.5 Reinforcement Learning (pg. 5)
1.4 Overview (pg. 5)
1.5 Further Reading (pg. 7)
References (pg. 7)
2 Probabilistic Models (pg. 11)
2.1 Representation (pg. 11)
2.2 Inference (pg. 25)
2.3 Parameter Learning (pg. 40)
2.4 Structure Learning (pg. 46)
2.5 Summary (pg. 52)
2.6 Further Reading (pg. 54)
References (pg. 54)
3 Decision Problems (pg. 57)
3.1 Utility Theory (pg. 57)
3.2 Decision Networks (pg. 64)
3.3 Games (pg. 68)
3.4 Summary (pg. 72)
3.5 Further Reading (pg. 72)
References (pg. 74)
4 Sequential Problems (pg. 77)
4.1 Formulation (pg. 77)
4.2 Dynamic Programming (pg. 79)
4.3 Structured Representations (pg. 89)
4.4 Linear Representations (pg. 91)
4.5 Approximate Dynamic Programming (pg. 93)
4.6 Online Methods (pg. 99)
4.7 Direct Policy Search (pg. 103)
4.8 Summary (pg. 108)
4.9 Further Reading (pg. 108)
References (pg. 110)
5 Model Uncertainty (pg. 113)
5.1 Exploration and Exploitation (pg. 113)
5.2 Maximum Likelihood Model-Based Methods (pg. 116)
5.3 Bayesian Model-Based Methods (pg. 118)
5.4 Model-Free Methods (pg. 121)
5.5 Generalization (pg. 124)
5.6 Summary (pg. 129)
5.7 Further Reading (pg. 129)
References (pg. 130)
6 State Uncertainty (pg. 133)
6.1 Formulation (pg. 133)
6.2 Belief Updating (pg. 136)
6.3 Exact Solution Methods (pg. 140)
6.4 Offline Methods (pg. 144)
6.5 Online Methods (pg. 149)
6.6 Summary (pg. 155)
6.7 Further Reading (pg. 155)
References (pg. 156)
7 Cooperative Decision Making (pg. 159)
7.1 Formulation (pg. 159)
7.2 Properties (pg. 164)
7.3 Notable Subclasses (pg. 166)
7.4 Exact Solution Methods (pg. 170)
7.5 Approximate Solution Methods (pg. 177)
7.6 Communication (pg. 178)
7.7 Summary (pg. 180)
7.8 Further Reading (pg. 180)
References (pg. 182)
8 Probabilistic Surveillance Video Search (pg. 191)
8.1 Attribute-Based Person Search (pg. 191)
8.2 Probabilistic Appearance Model (pg. 195)
8.3 Learning and Inference Techniques (pg. 206)
8.4 Performance (pg. 217)
8.5 Interactive Search Tool (pg. 223)
8.6 Summary (pg. 225)
References (pg. 227)
9 Dynamic Models for Speech Applications (pg. 229)
9.1 Modeling Speech Signals (pg. 229)
9.2 Speech Recognition (pg. 232)
9.3 Topic Identiication (pg. 235)
9.4 Language Recognition (pg. 236)
9.5 Speaker Identiication (pg. 238)
9.6 Machine Translation (pg. 242)
9.7 Summary (pg. 243)
References (pg. 243)
10 Optimized Airborne Collision Avoidance (pg. 249)
10.1 Airborne Collision Avoidance Systems (pg. 249)
10.2 Collision Avoidance Problem Formulation (pg. 253)
10.3 State Estimation (pg. 259)
10.4 Real-Time Execution (pg. 261)
10.5 Evaluation (pg. 265)
10.6 Summary (pg. 273)
References (pg. 274)
11 Multiagent Planning for Persistent Surveillance (pg. 277)
11.1 Mission Description (pg. 277)
11.2 Centralized Problem Formulation (pg. 278)
11.3 Decentralized Approximate Formulations (pg. 280)
11.4 Model Learning (pg. 282)
11.5 Flight Test (pg. 285)
11.6 Summary (pg. 286)
References (pg. 289)
12 Integrating Automation with Humans (pg. 291)
12.1 Human Capabilities and Coping (pg. 291)
12.2 Considering the Human in Design (pg. 296)
12.3 A Systems View of Implementation (pg. 308)
12.4 Summary (pg. 313)
References (pg. 314)
Index (pg. 317)
1.1 Decision Making (pg. 1)
1.2 Example Applications (pg. 2)
1.2.1 Trafic Alert and Collision Avoidance System (pg. 2)
1.2.2 Unmanned Aircraft Persistent Surveillance (pg. 3)
1.3 Methods for Designing Decision Agents (pg. 4)
1.3.1 Explicit Programming (pg. 4)
1.3.2 Supervised Learning (pg. 4)
1.3.3 Optimization (pg. 5)
1.3.4 Planning (pg. 5)
1.3.5 Reinforcement Learning (pg. 5)
1.5 Further Reading (pg. 7)
1.4 Overview (pg. 5)
References (pg. 7)
2 Probabilistic Models (pg. 11)
2.1 Representation (pg. 11)
2.2 Inference (pg. 25)
2.3 Parameter Learning (pg. 40)
2.4 Structure Learning (pg. 46)
2.5 Summary (pg. 52)
2.6 Further Reading (pg. 54)
References (pg. 54)
3 Decision Problems (pg. 57)
3.1 Utility Theory (pg. 57)
3.2 Decision Networks (pg. 64)
3.3 Games (pg. 68)
3.4 Summary (pg. 72)
3.5 Further Reading (pg. 72)
References (pg. 74)
4 Sequential Problems (pg. 77)
4.1 Formulation (pg. 77)
4.2 Dynamic Programming (pg. 79)
4.3 Structured Representations (pg. 89)
4.4 Linear Representations (pg. 91)
4.5 Approximate Dynamic Programming (pg. 93)
4.6 Online Methods (pg. 99)
4.7 Direct Policy Search (pg. 103)
4.8 Summary (pg. 108)
4.9 Further Reading (pg. 108)
References (pg. 110)
5 Model Uncertainty (pg. 113)
5.1 Exploration and Exploitation (pg. 113)
5.2 Maximum Likelihood Model-Based Methods (pg. 116)
5.3 Bayesian Model-Based Methods (pg. 118)
5.4 Model-Free Methods (pg. 121)
5.5 Generalization (pg. 124)
5.6 Summary (pg. 129)
5.7 Further Reading (pg. 129)
References (pg. 130)
6 State Uncertainty (pg. 133)
6.1 Formulation (pg. 133)
6.2 Belief Updating (pg. 136)
6.3 Exact Solution Methods (pg. 140)
6.4 Offline Methods (pg. 144)
6.5 Online Methods (pg. 149)
6.6 Summary (pg. 155)
6.7 Further Reading (pg. 155)
References (pg. 156)
7 Cooperative Decision Making (pg. 159)
7.1 Formulation (pg. 159)
7.2 Properties (pg. 164)
7.3 Notable Subclasses (pg. 166)
7.4 Exact Solution Methods (pg. 170)
7.5 Approximate Solution Methods (pg. 177)
7.6 Communication (pg. 178)
7.7 Summary (pg. 180)
7.8 Further Reading (pg. 180)
References (pg. 182)
8 Probabilistic Surveillance Video Search (pg. 191)
8.1 Attribute-Based Person Search (pg. 191)
8.2 Probabilistic Appearance Model (pg. 195)
8.3 Learning and Inference Techniques (pg. 206)
8.4 Performance (pg. 217)
8.5 Interactive Search Tool (pg. 223)
8.6 Summary (pg. 225)
References (pg. 227)
9 Dynamic Models for Speech Applications (pg. 229)
9.1 Modeling Speech Signals (pg. 229)
9.2 Speech Recognition (pg. 232)
9.3 Topic Identiication (pg. 235)
9.4 Language Recognition (pg. 236)
9.5 Speaker Identiication (pg. 238)
9.6 Machine Translation (pg. 242)
9.7 Summary (pg. 243)
References (pg. 243)
10 Optimized Airborne Collision Avoidance (pg. 249)
10.1 Airborne Collision Avoidance Systems (pg. 249)
10.2 Collision Avoidance Problem Formulation (pg. 253)
10.3 State Estimation (pg. 259)
10.4 Real-Time Execution (pg. 261)
10.5 Evaluation (pg. 265)
10.6 Summary (pg. 273)
References (pg. 274)
11 Multiagent Planning for Persistent Surveillance (pg. 277)
11.1 Mission Description (pg. 277)
11.2 Centralized Problem Formulation (pg. 278)
11.3 Decentralized Approximate Formulations (pg. 280)
11.4 Model Learning (pg. 282)
11.5 Flight Test (pg. 285)
11.6 Summary (pg. 286)
References (pg. 289)
12 Integrating Automation with Humans (pg. 291)
12.1 Human Capabilities and Coping (pg. 291)
12.2 Considering the Human in Design (pg. 296)
12.3 A Systems View of Implementation (pg. 308)
12.4 Summary (pg. 313)
References (pg. 314)
Index (pg. 317)

Mykel J. Kochenderfer

Mykel J. Kochenderfer is Assistant Professor in the Department of Aeronautics and Astronautics at Stanford University. He is a consultant for the MIT Lincoln Laboratory.










Instructors Only
You must have an instructor account and submit a request to access instructor materials for this book.
eTextbook
Go paperless today! Available online anytime, nothing to download or install.

Features

  • Bookmarking
  • Note taking
  • Highlighting