A Gentle Introduction to Effective Computing in Quantitative Research

What Every Research Assistant Should Know

by Harry J. Paarsch and Konstantin Golyaev

ISBN: 9780262333979 | Copyright 2016

This book offers a practical guide to the computational methods at the heart of most modern quantitative research. It will be essential reading for research assistants needing hands-on experience; students entering PhD programs in business, economics, and other social or natural sciences; and those seeking quantitative jobs in industry. No background in computer science is assumed; a learner need only have a computer with access to the Internet. Using the example as its principal pedagogical device, the book offers tried-and-true prototypes that illustrate many important computational tasks required in quantitative research. The best way to use the book is to read it at the computer keyboard and learn by doing.

The book begins by introducing basic skills: how to use the operating system, how to organize data, and how to complete simple programming tasks. For its demonstrations, the book uses a UNIX-based operating system and a set of free software tools: the scripting language Python for programming tasks; the database management system SQLite; and the freely available R for statistical computing and graphics. The book goes on to describe particular tasks: analyzing data, implementing commonly used numerical and simulation methods, and creating extensions to Python to reduce cycle time. Finally, the book describes the use of LaTeX, a document markup language and preparation system.

This book covers exactly the material I ask my research assistants and students to learn and it covers it very well. Pattern recognition and data organization have become key tools of good working economists. The authors are proven experts and very creative quantitative researchers and it is good for us that they have written this ‘gentle introduction.’

Thomas J. Sargent, Nobel Laureate in Economics, 2011

This book is a good place for the novice programmer to begin.

James J. Heckman, Nobel Laureate in Economics, 2000

This book will be invaluable to almost everyone except the most advanced computer geeks. It is an easy-to-read guide to the essential practical computer tools that are almost never formally taught in class but which every PhD student needs to know in order to be productive. Senior economists and faculty can also profit from this book by helping them reduce their dependence on computer-literate research assistants to do many tasks that Paarsch and Golyaev show are easy to do themselves with just a small amount of investment.

John Rust, Professor of Economics, Georgetown University
Contents (pg. vii)
Prologue (pg. xvii)
Acknowledgments (pg. xxxi)
Notation (pg. xxxv)
1 Introduction (pg. 1)
1.1 Theory (pg. 3)
1.2 Data (pg. 4)
1.3 At the End of the Day (pg. 7)
2 Productivity Tools (pg. 9)
2.1 Opening a Terminal Window (pg. 9)
2.2 Working on the Command Line (pg. 10)
2.3 Some UNIX Commands (pg. 10)
2.4 Shortcuts (pg. 33)
2.5 Text Editors (pg. 39)
2.6 Other Tools for Text Processing (pg. 40)
2.7 Regular Expressions (pg. 42)
2.8 Shell Scripts (pg. 43)
2.9 Dealing with Dependencies (pg. 46)
2.10 Environment and Shell Variables (pg. 48)
2.11 Using Other Computers Remotely (pg. 50)
2.12 Running Long Jobs Remotely (pg. 53)
2.13 Saving Space (pg. 56)
2.14 Archiving Files (pg. 56)
2.15 Version Control (pg. 57)
2.16 Package Managers (pg. 58)
2.17 UNIX File Systems (pg. 58)
2.18 Uniform Resource Identifiers (pg. 60)
3 Organizing Data (pg. 61)
3.1 Spreadsheet (pg. 61)
3.2 Data Modeling (pg. 63)
3.3 Relational Algebra (pg. 67)
3.4 Basic SQL (pg. 75)
3.5 Solved Example (pg. 82)
3.6 NoSQL (pg. 109)
4 Simple Programming (pg. 121)
4.1 Python (pg. 121)
4.2 Important Concepts in Computer Science (pg. 124)
4.3 Basic Grammar (pg. 125)
4.4 Useful Modules and Packages (pg. 172)
4.5 Python Template (pg. 179)
4.6 Design Documents, Flowcharts, and Unit Testing (pg. 181)
4.7 Miscellaneous Topics (pg. 183)
4.8 Bringing It All Together (pg. 184)
5 Analyzing Data (pg. 225)
5.1 Is Your Answer Right? (pg. 225)
5.2 Methods of Sampling Data (pg. 228)
5.3 Useful Data Formats (pg. 232)
5.4 R System (pg. 233)
5.5 Useful R Packages (pg. 276)
5.6 Connecting R to SQLite (pg. 289)
5.7 Python Library pandas (pg. 292)
5.8 Python or R? (pg. 301)
5.9 Training, Validation, and Testing (pg. 302)
5.10 Fixed-Effect Regressions (pg. 308)
6 Geek Stuff (pg. 317)
6.1 Hardware (pg. 317)
6.2 Algorithmics (pg. 325)
6.3 Some Programming Paradigms (pg. 345)
6.4 Graph Theory (pg. 353)
7 Numerical Methods (pg. 359)
7.1 Round-off and Truncation Errors (pg. 359)
7.2 Linear Algebra (pg. 366)
7.3 Finding the Zero of a Function (pg. 375)
7.4 Solving Systems of Nonlinear Equations (pg. 380)
7.5 Unconstrained Optimization (pg. 389)
7.6 Constrained Optimization (pg. 432)
7.7 Approximation Methods (pg. 442)
7.8 Numerical Integration (pg. 450)
7.9 Solving Differential Equations (pg. 465)
7.10 Simulation (pg. 477)
7.11 Figures and Graphs (pg. 496)
8 Solved Examples (pg. 499)
8.1 Linear Algebra: Portfolio Allocation Problem (pg. 499)
8.2 Unconstrained Optimization: Duration Model (pg. 504)
8.3 Linear Programming: LAD-Lasso Estimator (pg. 521)
8.4 Quadratic Programming: Support Vector Machines (pg. 527)
8.5 Numerical Integration: Gauss-Hermite Quadrature (pg. 539)
8.6 Simulation: Demand for Change (pg. 542)
8.7 Resampling: Quantifying Variability (pg. 548)
8.8 Makefile: Dealing with Dependencies (pg. 564)
8.9 Git: Version Control (pg. 568)
9 Extensions to Python (pg. 589)
9.1 Profiling Python Code (pg. 592)
9.2 C Programming Language (pg. 593)
9.3 C Extensions to Python (pg. 614)
9.4 FORTRAN Programming Language (pg. 617)
9.5 FORTRAN Extensions to Python (pg. 628)
9.6 Numba (pg. 629)
10 Papers and Presentations (pg. 631)
10.1 LaTeX (pg. 633)
10.2 BibTeX (pg. 641)
10.3 Beamer (pg. 649)
10.4 Incorporating PGF/TikZ Figures (pg. 656)
10.5 Other TeX/LaTeX Tricks (pg. 658)
10.6 ConTeXt (pg. 658)
11 Final Thoughts (pg. 661)
11.1 Amdahl’s Law (pg. 663)
11.2 MapReduce (pg. 664)
11.3 Summary (pg. 668)
Appendices (pg. 669)
Appendix A: The Virtual Machine (pg. 671)
A.1 Installing the Virtual Machine (pg. 671)
A.2 Downloading the Virtual Machine (pg. 672)
A.3 Special Instructions (pg. 673)
Appendix B: Recommended Reading (pg. 675)
B.1 Computers (pg. 675)
B.2 Programming (pg. 675)
B.3 Software Tools (pg. 675)
B.4 Databases (pg. 676)
B.5 Python (pg. 676)
B.6 Econometrics and Statistics (pg. 677)
B.7 Algorithmics (pg. 677)
B.8 Numerical Methods (pg. 678)
B.9 Data Mining and Machine Learning (pg. 678)
B.10 Information and Markets (pg. 679)
References (pg. 681)
About the Authors (pg. 695)
Name Index (pg. 697)
Subject Index (pg. 703)

Harry J. Paarsch

After initial appointments at the University of British Columbia and the University of Western Ontario, Harry J. Paarsch held the position of Professor of Economics and Robert Jensen Research Fellow in the Henry B. Tippie College of Business at the University of Iowa and subsequently Chair in Economics at the University of Melbourne. From 2011 to 2014, he worked as an applied economist and data scientist for Amazon.com.


Konstantin Golyaev

Konstantin Golyaev is an applied economist and data scientist who lives and works in Seattle.

