Probabilistic Graphical Models
Principles and Techniques
by Koller, Friedman
ISBN: 9780262277389  Copyright 2009
Instructor Requests
Most tasks require a person or an automated system to reasonâ€”to reach conclusions based on available information. The framework of probabilistic graphical models, presented in this book, provides a general approach for this task. The approach is modelbased, allowing interpretable models to be constructed and then manipulated by reasoning algorithms. These models can also be learned automatically from data, allowing the approach to be used in cases where manually constructing a model is difficult or even impossible. Because uncertainty is an inescapable aspect of most realworld applications, the book focuses on probabilistic models, which make the uncertainty explicit and provide models that are more faithful to reality.
Probabilistic Graphical Models discusses a variety of models, spanning Bayesian networks, undirected Markov networks, discrete and continuous models, and extensions to deal with dynamical systems and relational data. For each class of models, the text describes the three fundamental cornerstones: representation, inference, and learning, presenting both basic concepts and advanced techniques. Finally, the book considers the use of the proposed framework for causal reasoning and decision making under uncertainty. The main text in each chapter provides the detailed technical development of the key ideas. Most chapters also include boxes with additional material: skill boxes, which describe techniques; case study boxes, which discuss empirical cases related to the approach described in the text, including applications in computer vision, robotics, natural language understanding, and computational biology; and concept boxes, which present significant concepts drawn from the material in the chapter. Instructors (and readers) can group chapters in various combinations, from core topics to more technically advanced material, to suit their particular needs.
This landmark book provides a very extensive coverage of the field, ranging from basic representational issues to the latest techniques for approximate inference and learning. As such, it is likely to become a definitive reference for all those who work in this area. Detailed worked examples and case studies also make the book accessible to students.
Kevin Murphy Department of Computer Science, University of British Columbia
Expand/Collapse All  

Contents (pg. ix)  
Acknowledgments (pg. xxiii)  
List of Figures (pg. xxv)  
List of Algorithms (pg. xxxi)  
List of Boxes (pg. xxxiii)  
1 Introduction (pg. 1)  
1.1 Motivation (pg. 1)  
1.2 Structured Probabilistic Models (pg. 2)  
1.3 Overview and Roadmap (pg. 6)  
1.4 Historical Notes (pg. 12)  
2 Foundations (pg. 15)  
2.1 Probability Theory (pg. 15)  
2.2 Graphs (pg. 34)  
2.3 Relevant Literature (pg. 39)  
2.4 Exercises (pg. 39)  
Part I: Representation (pg. 43)  
3 The Bayesian Network Representation (pg. 45)  
3.1 Exploiting Independence Properties (pg. 45)  
3.2 Bayesian Networks (pg. 51)  
3.3 Independencies in Graphs (pg. 68)  
3.4 From Distributions to Graphs (pg. 78)  
3.5 Summary (pg. 92)  
3.6 Relevant Literature (pg. 93)  
3.7 Exercises (pg. 96)  
4 Undirected Graphical Models (pg. 103)  
4.1 The Misconception Example (pg. 103)  
4.2 Parameterization (pg. 106)  
4.3 Markov Network Independencies (pg. 114)  
4.4 Parameterization Revisited (pg. 122)  
4.5 Bayesian Networks and Markov Networks (pg. 134)  
4.6 Partially Directed Models (pg. 142)  
4.7 Summary and Discussion (pg. 151)  
4.8 Relevant Literature (pg. 152)  
4.9 Exercises (pg. 153)  
5 Local Probabilistic Models (pg. 157)  
5.1 Tabular CPDs (pg. 157)  
5.2 Deterministic CPDs (pg. 158)  
5.3 ContextSpecific CPDs (pg. 162)  
5.4 Independence of Causal Influence (pg. 175)  
5.5 Continuous Variables (pg. 185)  
5.6 Conditional Bayesian Networks (pg. 191)  
5.7 Summary (pg. 193)  
5.8 Relevant Literature (pg. 194)  
6 TemplateBased Representations (pg. 199)  
6.1 Introduction (pg. 199)  
6.2 Temporal Models (pg. 200)  
6.3 Template Variables and Template Factors (pg. 212)  
6.4 Directed Probabilistic Models for ObjectRelational Domains (pg. 216)  
6.5 Undirected Representation (pg. 228)  
6.6 Structural Uncertainty (pg. 232)  
6.7 Summary (pg. 240)  
6.8 Relevant Literature (pg. 242)  
7 Gaussian Network Models (pg. 247)  
7.1 Multivariate Gaussians (pg. 247)  
7.2 Gaussian Bayesian Networks (pg. 251)  
7.3 Gaussian Markov Random Fields (pg. 254)  
7.4 Summary (pg. 257)  
7.5 Relevant Literature (pg. 258)  
8 The Exponential Family (pg. 261)  
8.1 Introduction (pg. 261)  
8.2 Exponential Families (pg. 261)  
8.3 Factored Exponential Families (pg. 266)  
8.4 Entropy and Relative Entropy (pg. 269)  
8.5 Projections (pg. 273)  
8.6 Summary (pg. 282)  
8.7 Relevant Literature (pg. 283)  
Part II: Inference (pg. 285)  
9 Exact Inference: Variable Elimination (pg. 287)  
9.1 Analysis of Complexity (pg. 288)  
9.2 Variable Elimination: The Basic Ideas (pg. 292)  
9.3 Variable Elimination (pg. 296)  
9.4 Complexity and Graph Structure: Variable Elimination (pg. 305)  
9.5 Conditioning (pg. 315)  
9.6 Inference with Structured CPDs (pg. 325)  
9.7 Summary and Discussion (pg. 336)  
9.8 Relevant Literature (pg. 337)  
9.9 Exercises (pg. 338)  
10 Exact Inference: Clique Trees (pg. 345)  
10.1 Variable Elimination and Clique Trees (pg. 345)  
10.2 Message Passing: Sum Product (pg. 348)  
10.3 Message Passing: Belief Update (pg. 364)  
10.4 Constructing a Clique Tree (pg. 372)  
10.5 Summary (pg. 376)  
10.6 Relevant Literature (pg. 377)  
10.7 Exercises (pg. 378)  
11 Inference as Optimization (pg. 381)  
11.1 Introduction (pg. 381)  
11.2 Exact Inference as Optimization (pg. 386)  
11.3 PropagationBased Approximation (pg. 391)  
11.4 Propagation with Approximate Messages (pg. 430)  
11.5 Structured Variational Approximations (pg. 448)  
11.6 Summary and Discussion (pg. 473)  
11.7 Relevant Literature (pg. 475)  
11.8 Exercises (pg. 477)  
12 ParticleBased Approximate Inference (pg. 487)  
12.1 Forward Sampling (pg. 488)  
12.2 Likelihood Weighting and Importance Sampling (pg. 492)  
12.3 Markov Chain Monte Carlo Methods (pg. 505)  
12.4 Collapsed Particles (pg. 526)  
12.5 Deterministic Search Methods (pg. 536)  
12.6 Summary (pg. 540)  
12.7 Relevant Literature (pg. 541)  
12.8 Exercises (pg. 544)  
13 MAP Inference (pg. 551)  
13.1 Overview (pg. 551)  
13.2 Variable Elimination for (Marginal) MAP (pg. 554)  
13.3 MaxProduct in Clique Trees (pg. 562)  
13.4 MaxProduct Belief Propagation in Loopy Cluster Graphs (pg. 567)  
13.5 MAP as a Linear Optimization Problem (pg. 577)  
13.6 Using Graph Cuts for MAP (pg. 588)  
13.7 Local Search Algorithms (pg. 595)  
13.8 Summary (pg. 597)  
13.9 Relevant Literature (pg. 598)  
13.10 Exercises (pg. 601)  
14 Inference in Hybrid Networks (pg. 605)  
14.1 Introduction (pg. 605)  
14.2 Variable Elimination in Gaussian Networks (pg. 608)  
14.3 Hybrid Networks (pg. 615)  
14.4 Nonlinear Dependencies (pg. 630)  
14.5 ParticleBased Approximation Methods (pg. 642)  
14.6 Summary and Discussion (pg. 646)  
14.7 Relevant Literature (pg. 647)  
14.8 Exercises (pg. 649)  
15 Inference in Temporal Models (pg. 651)  
15.1 Inference Tasks (pg. 652)  
15.2 Exact Inference (pg. 653)  
15.3 Approximate Inference (pg. 661)  
15.4 Hybrid DBNs (pg. 675)  
15.5 Summary (pg. 688)  
15.6 Relevant Literature (pg. 690)  
15.7 Exercises (pg. 692)  
Part III: Learning (pg. 695)  
16 Learning Graphical Models: Overview (pg. 697)  
16.1 Motivation (pg. 697)  
16.2 Goals of Learning (pg. 698)  
16.3 Learning as Optimization (pg. 702)  
16.4 Learning Tasks (pg. 711)  
16.5 Relevant Literature (pg. 715)  
17 Parameter Estimation (pg. 717)  
17.1 Maximum Likelihood Estimation (pg. 717)  
17.2 MLE for Bayesian Networks (pg. 722)  
17.3 Bayesian Parameter Estimation (pg. 733)  
17.4 Bayesian Parameter Estimation in Bayesian Networks (pg. 741)  
17.5 Learning Models with Shared Parameters (pg. 754)  
17.6 Generalization Analysis (pg. 769)  
17.7 Summary (pg. 776)  
17.8 Relevant Literature (pg. 777)  
17.9 Exercises (pg. 778)  
18 Structure Learning in Bayesian Networks (pg. 783)  
18.1 Introduction (pg. 783)  
18.2 ConstraintBased Approaches (pg. 786)  
18.3 Structure Scores (pg. 790)  
18.4 Structure Search (pg. 807)  
18.5 Bayesian Model Averaging (pg. 824)  
18.6 Learning Models with Additional Structure (pg. 832)  
18.7 Summary and Discussion (pg. 838)  
18.8 Relevant Literature (pg. 840)  
19 Partially Observed Data (pg. 849)  
19.1 Foundations (pg. 849)  
19.2 Parameter Estimation (pg. 862)  
19.3 Bayesian Learning with Incomplete Data (pg. 897)  
19.4 Structure Learning (pg. 908)  
19.5 Learning Models with Hidden Variables (pg. 925)  
19.6 Summary (pg. 933)  
19.7 Relevant Literature (pg. 934)  
19.8 Exercises (pg. 935)  
20 Learning Undirected Models (pg. 943)  
20.1 Overview (pg. 943)  
20.2 The Likelihood Function (pg. 944)  
20.3 Maximum (Conditional) Likelihood Parameter Estimation (pg. 949)  
20.4 Parameter Priors and Regularization (pg. 958)  
20.5 Learning with Approximate Inference (pg. 961)  
20.6 Alternative Objectives (pg. 969)  
20.7 Structure Learning (pg. 978)  
20.8 Summary (pg. 996)  
20.9 Relevant Literature (pg. 998)  
20.10 Exercises (pg. 1001)  
Part IV: Actions and Decisions (pg. 1007)  
21 Causality (pg. 1009)  
21.1 Motivation and Overview (pg. 1009)  
21.2 Causal Models (pg. 1014)  
21.3 Structural Causal Identifiability (pg. 1017)  
21.4 Mechanisms and Response Variables (pg. 1026)  
21.5 Partial Identifiability in Functional Causal Models (pg. 1031)  
21.6 Counterfactual Queries (pg. 1034)  
21.7 Learning Causal Models (pg. 1040)  
21.8 Summary (pg. 1053)  
21.9 Relevant Literature (pg. 1054)  
21.10 Exercises (pg. 1055)  
22 Utilities and Decisions (pg. 1059)  
22.1 Foundations: Maximizing Expected Utility (pg. 1059)  
22.2 Utility Curves (pg. 1064)  
22.3 Utility Elicitation (pg. 1068)  
22.4 Utilities of Complex Outcomes (pg. 1071)  
22.5 Summary (pg. 1081)  
22.6 Relevant Literature (pg. 1082)  
22.7 Exercises (pg. 1084)  
23 Structured Decision Problems (pg. 1085)  
23.1 Decision Trees (pg. 1085)  
23.2 Influence Diagrams (pg. 1088)  
23.3 Backward Induction in Influence Diagrams (pg. 1095)  
23.4 Computing Expected Utilities (pg. 1100)  
23.5 Optimization in Influence Diagrams (pg. 1107)  
23.6 Ignoring Irrelevant Information (pg. 1119)  
23.7 Value of Information (pg. 1121)  
23.8 Summary (pg. 1126)  
23.9 Relevant Literature (pg. 1128)  
23.10 Exercises (pg. 1130)  
24 Epilogue (pg. 1133)  
A Background Material (pg. 1137)  
A.1 Information Theory (pg. 1137)  
A.2 Convergence Bounds (pg. 1143)  
A.3 Algorithms and Algorithmic Complexity (pg. 1146)  
A.4 Combinatorial Optimization and Search (pg. 1154)  
A.5 Continuous Optimization (pg. 1161)  
Bibliography (pg. 1173)  
Notation Index (pg. 1211)  
Subject Index (pg. 1215) 
Daphne Koller
Daphne Koller is Professor in the Department of Computer Science at Stanford University.
Nir Friedman
Nir Friedman is Professor in the Department of Computer Science and Engineering at Hebrew University.
Instructors  

You must have an instructor account and submit a request to access instructor materials for this book.
