Introductory Economic Statistics: A Data-Driven Approach using R

Author

Jordan Adamson

Published

07.11.2025

Preface

This Rbook introduces students to econometrics without relying on parametric assumptions and formulas. In many ways, it is a modern version of “Introductory Econometrics: Using Monte Carlo Simulation with Microsoft Excel” by Barreto and Howland, updated to adhere to modern statistics teaching guidelines and to give econometrics students the best tools for the labor market. Altogether, students learn to produce statistical analyses of economic data relevant to both the private and public sectors, and they gain an intuitive foundation for more advanced courses, including nonparametric statistics, program evaluation, forecasting, and structural econometrics.

Students are introduced to the basics of statistical programming in R alongside the theoretical analysis of economic data. This teaches applied statistics relevant to general students, leaving the classical statistics program to mathematics departments and leaving poor applied practices in the dust. This Rbook is organized into three substantive parts: univariate, bivariate, and multivariate data analysis.

The first part has three notable differences from a typical introductory statistics book. First: students learn more deeply how to use and interpret the Histogram, ECDF, and Boxplot (and to avoid 3D pie charts and other chart junk). They work with actual data before abstract probability theory (initially limited to simple events or intervals, with sums of random variables and transformations available optionally later). Second: students learn the basics of probability theory with real-world data and computer simulations. I aimed to replace mathematical proofs with simulations whenever possible. This allows less emphasis on the mechanics of classical probability theory as well as fewer “t and z drills”. Confidence intervals and hypothesis tests are covered via bootstrapping, for example, so students learn the conceptual approach rather than a formula. Third: students learn the theory and practice of univariate statistics before moving to bivariate statistics, rather than mixing univariate and bivariate content. Business textbooks often introduce both types of data, then cover univariate statistics, and return to bivariate statistics much later. Math textbooks typically introduce students to probability theory long before concrete applications. This textbook includes many practical examples, including how to analyze data interactively and communicate results.

Part II covers bivariate data analysis. Notably, it introduces “local relationships”, so students gain familiarity with modern statistical methods like loess. Students learn to incorporate marginal thinking into their empirical analysis early on, and also to quantitatively evaluate their models instead of simply assuming \(Y=X\beta+\epsilon\). This rectifies two major shortcomings in econometrics education: theory classes emphasize marginal effects while empirical classes emphasize average effects, and model selection is mostly an afterthought. Part II also covers statistical reporting using R + markdown, which research suggests is a good combination for students [1, 2]. The bootstrapping approach to hypothesis testing covered in Part I extends to bivariate data, naturally covering not just canonical statistics like correlation but also other important statistics like differences in medians or distributions.

Part III refines material from several introductory econometrics textbooks, building on the foundations laid in the previous parts. Notably, it covers linear models from a “minimum distance” perspective and introduces “local relationships” in a multivariate context. (We operate under the maxim “All models are wrong” and do not prove unbiasedness.) The chapter on observational data introduces students to various interdependence issues (temporal, spatial, economic), followed by a chapter introducing experimental designs and methods relevant to business and economics students. Also included is a novel chapter on “Data scientism” that more clearly illustrates the ways that simplistic approaches can mislead rather than illuminate. (I stress “gun safety” instead of “pull to shoot”, an emphasis missing from many econometrics textbooks that start with “Assume \(Y=X\beta+\epsilon\)”.) Overall, there is a more humble view of what we can infer from linear regressions, which opens the door to more advanced courses in model development and interpretation.


Although any interested reader may find it useful, this Rbook is primarily developed for my students.

If you use this Rbook, please cite:

@book{Adamson2025_Rbook,
  title={Introductory Economic Statistics: A Data-Driven Approach using R},
  author={Adamson, Jordan},
  year={2025},
  publisher={Bookdown},
  url={https://jadamso.github.io/Rbooks/}
}

Please also report any errors or issues at https://github.com/Jadamso/Rbooks/issues.