Knowledge Graphs

Fundamentals, Techniques, and Applications

by Kejriwal, Knoblock, Szekely

ISBN: 9780262363211 | Copyright 2021

A rigorous and comprehensive textbook covering the major approaches to knowledge graphs, an active and interdisciplinary area within artificial intelligence.

The field of knowledge graphs, which allows us to model, process, and derive insights from complex real-world data, has emerged as an active and interdisciplinary area of artificial intelligence over the last decade, drawing on such fields as natural language processing, data mining, and the semantic web. Current projects involve predicting cyberattacks, recommending products, and even gleaning insights from thousands of papers on COVID-19. This textbook offers rigorous and comprehensive coverage of the field. It focuses systematically on the major approaches, both those that have stood the test of time and the latest deep learning methods.

After presenting introductory and background material, the text covers techniques for constructing knowledge graphs, adding new knowledge to (or refining old knowledge in) knowledge graphs, and accessing (or querying) knowledge graphs. Finally, the book describes specific knowledge graph ecosystems, with each ecosystem corresponding to several real-world applications and case studies. Each chapter concludes with a software and resources section as well as bibliographic notes that suggest further reading. End-of-chapter exercises, 130 in all, represent various levels of abstraction.

Contents
List of Figures
List of Tables
Preface
I. KNOWLEDGE GRAPH FUNDAMENTALS
1. Introduction to Knowledge Graphs
1.1 Graphs
1.2 Representing Knowledge as Graphs
1.3 Examples of Knowledge Graphs
1.4 How to Read This Text
1.5 Concluding Notes
1.6 Software and Resources
1.7 Bibliographic Notes
1.8 Exercises
2. Modeling and Representing Knowledge Graphs
2.1 Introduction
2.2 RDF Schema
2.3 Property-Centric Models
2.4 Wikidata Model
2.5 The Semantic Web Layer Cake
2.6 Schema Heterogeneity and Semantic Labeling
2.7 Concluding Notes
2.8 Software and Resources
2.9 Bibliographic Notes
2.10 Exercises
II. KNOWLEDGE GRAPH CONSTRUCTION
3. Domain Discovery
3.1 Introduction
3.2 Focused Crawling
3.3 Influential Systems and Methodologies
3.4 Concluding Notes
3.5 Software and Resources
3.6 Bibliographic Notes
3.7 Exercises
4. Named Entity Recognition
4.1 Introduction
4.2 Why Is Information Extraction Hard?
4.3 Approaches for Named Entity Recognition
4.4 Deep Learning for Named Entity Recognition
4.5 Domain-Specific Named Entity Recognition
4.6 Evaluating Information Extraction Quality
4.7 Concluding Notes
4.8 Software and Resources
4.9 Bibliographic Notes
4.10 Exercises
5. Web Information Extraction
5.1 Introduction
5.2 Wrapper Generation
5.3 Beyond Wrappers: Information Extraction over Structured Data
5.4 Concluding Notes
5.5 Software and Resources
5.6 Bibliographic Notes
5.7 Exercises
6. Relation Extraction
6.1 Introduction
6.2 Ontologies and Programs
6.3 Techniques for Relation Extraction
6.4 Recent Research: Deep Learning for Relation Extraction
6.5 Beyond Relation Extraction: Event Extraction and Joint Information Extraction
6.6 Concluding Notes
6.7 Software and Resources
6.8 Bibliographic Notes
6.9 Exercises
7. Nontraditional Information Extraction
7.1 Introduction
7.2 Open Information Extraction
7.3 Social Media Information Extraction
7.4 Other Kinds of Nontraditional Information Extraction
7.5 Concluding Notes
7.6 Software and Resources
7.7 Bibliographic Notes
7.8 Exercises
III. KNOWLEDGE GRAPH COMPLETION
8. Instance Matching
8.1 Introduction
8.2 Formalism
8.3 Why Is Instance Matching Challenging?
8.4 Two-Step Pipeline
8.5 Evaluating the Two-Step Pipeline
8.6 Postsimilarity Steps
8.7 Formalizing Instance Matching: Swoosh
8.8 A Note on Research Frontiers
8.9 Data Cleaning beyond Instance Matching
8.10 Concluding Notes
8.11 Software and Resources
8.12 Bibliographic Notes
8.13 Exercises
9. Statistical Relational Learning
9.1 Introduction
9.2 Modeling Dependencies
9.3 Statistical Relational Learning Frameworks
9.4 Knowledge Graph Identification
9.5 Other Applications
9.6 Advanced Research: Data Programming
9.7 Concluding Notes
9.8 Software and Resources
9.9 Bibliographic Notes
9.10 Exercises
10. Representation Learning for Knowledge Graphs
10.1 Introduction
10.2 Embedding Architectures: A Primer
10.3 Embeddings beyond Words
10.4 Knowledge Graph Embeddings
10.5 Influential KGE Systems
10.6 Extrafactual Contexts
10.7 Applications
10.8 Concluding Notes
10.9 Software and Resources
10.10 Bibliographic Notes
10.11 Exercises
IV. ACCESSING KNOWLEDGE GRAPHS
11. Reasoning and Retrieval
11.1 Introduction
11.2 Reasoning
11.3 Retrieval
11.4 Retrieval versus Reasoning
11.5 Concluding Notes
11.6 Software and Resources
11.7 Bibliographic Notes
11.8 Exercises
12. Structured Querying
12.1 Introduction
12.2 SPARQL
12.3 Relational Processing of Queries over Knowledge Graphs
12.4 NoSQL
12.5 Concluding Notes
12.6 Software and Resources
12.7 Bibliographic Notes
12.8 Exercises
13. Question Answering
13.1 Introduction
13.2 Question Answering as a Stand-Alone Application
13.3 Question Answering as Knowledge Graph Querying
13.4 Concluding Notes
13.5 Software and Resources
13.6 Bibliographic Notes
13.7 Exercises
V. KNOWLEDGE GRAPH ECOSYSTEMS
14. Linked Data
14.1 Introduction
14.2 Impact and Adoption of Linked Data Principles
14.3 Important Knowledge Graphs in Linked Open Data
14.4 Concluding Notes
14.5 Software and Resources
14.6 Bibliographic Notes
14.7 Exercises
15. Enterprise and Government
15.1 Introduction
15.2 Enterprise
15.3 Governments and Nonprofits
15.4 Where Is the Future Headed?
15.5 Concluding Notes
15.6 Software and Resources
15.7 Bibliographic Notes
15.8 Exercises
16. Knowledge Graphs and Ontologies in Science
16.1 Introduction
16.2 Biology
16.3 Chemistry
16.4 Earth, Environment, and Geosciences
16.5 Concluding Notes
16.6 Software and Resources
16.7 Bibliographic Notes
16.8 Exercises
17. Knowledge Graphs for Domain-Specific Social Impact
17.1 Introduction
17.2 Domain-Specific Insight Graphs
17.3 Alternative System: DeepDive
17.4 Applications and Use-Cases
17.5 Concluding Notes
17.6 Software and Resources
17.7 Bibliographic Notes
17.8 Exercises
Bibliography
Index

Mayank Kejriwal

Mayank Kejriwal is Research Assistant Professor at the University of Southern California's Viterbi School of Engineering.

Craig A. Knoblock

Craig A. Knoblock is Executive Director of the Information Sciences Institute at the University of Southern California, where he is also Research Professor of both Computer Science and Spatial Sciences as well as Director of the Data Science Program.

Pedro Szekely

Pedro Szekely is Principal Scientist and Director of the Center on Knowledge Graphs at the University of Southern California's Information Sciences Institute.
