Introductory Economic Statistics: A Data-Driven Approach using R
Preface

This Rbook introduces business and social science students to statistics with minimal parametric assumptions and formulas. Students learn the basics of statistical programming in R alongside the theoretical analysis of data. In many ways, it is a modern version of “Introductory Econometrics: Using Monte Carlo Simulation with Microsoft Excel” by Barreto and Howland, updated to adhere to modern statistics teaching guidelines and to give econometrics students the best tools for their labor market. Altogether, students learn to produce statistical analyses of economic data relevant to both the private and public sectors, and they gain an intuitive foundation for more advanced courses including nonparametric statistics, program evaluation, forecasting, structural econometrics, and more. This Rbook is organized into three substantive parts: univariate, bivariate, and multivariate data analysis.
Part I has three notable differences from a typical introductory statistics book. First: students learn more deeply how to use and interpret the Histogram, ECDF, and Boxplot (and to avoid the 3D pie charts and chart junk found in business statistics books). They work with actual data before abstract probability theory (initially limited to simple events or intervals, with sums of random variables and transformations available optionally later). Second: students learn the basics of probability theory with real-world data and computer simulations. I aimed to replace mathematical proofs with simulations whenever possible. This allows less emphasis on the mechanics of classical probability theory as well as fewer “t and z drills”. Confidence intervals and hypothesis tests are covered via bootstrapping, for example, so students learn the conceptual approach rather than a formula. Third: students learn the theory and practice of univariate statistics before moving to bivariate statistics, rather than mixing univariate and bivariate content. (Business textbooks often introduce both types of data, cover univariate statistics, and return to bivariate statistics much later, for example, while mathematics textbooks often introduce students to probability theory long before concrete applications.) These three differences allow for a general conceptual understanding of statistics “out of the gate” and a practical skillset that enables students to do more with actual datasets.
Part II covers bivariate data analysis as its own topic, in depth. After covering the basics, such as correlation and simple regression, this part notably introduces “local relationships” like loess. As such, students learn to incorporate marginal thinking into their empirical analysis early on. Students also learn to quantitatively evaluate their models early on (instead of simply assuming \(Y=X\beta+\epsilon\), as is done in most econometrics textbooks). This rectifies two major shortcomings in econometrics education: a disconnect between theory classes that emphasize marginal effects and empirical classes that emphasize average effects, and the treatment of model selection as mostly an afterthought. Part II also covers statistical reporting using R + markdown, which research suggests is a good combination for students [1, 2]. As such, this part includes many practical examples of how to analyze data interactively and communicate results. The bootstrapping approach to hypothesis testing covered in Part I extends to bivariate data, naturally covering not just canonical statistics like correlation but also other important statistics like differences in medians or distributions. Part II lays the conceptual foundation for understanding statistical relationships and provides a powerful skillset for applied data analysis.
Part III extends the same approach to multivariate data analysis. Notably, it covers linear models from a “minimum distance” perspective and introduces “local relationships” in a multivariate context. This refines material from several introductory econometrics textbooks and actually teaches the maxim “all models are wrong” instead of how to prove unbiasedness. The chapter on observational data introduces students to various interdependence issues (temporal, spatial, economic), followed by a chapter introducing experimental designs and methods relevant to business and economics students. Also included is a novel chapter on “Data scientism” that more clearly illustrates the ways that simplistic approaches can mislead rather than illuminate. (I stress “gun safety” instead of “pull to shoot”, which is missing from many introductory textbooks.) Students gain practical skills in multiple linear regression and a humble appreciation of what we can infer from it. The concepts covered provide strong preparation for more advanced models and methods, as well as a concrete foundation for more advanced theoretical inquiry.
Although any interested reader may find it useful, this Rbook is primarily developed for my students.
If you use this Rbook, please cite:
@book{Adamson2025_Rbook,
title={Introductory Economic Statistics: A Data-Driven Approach using R},
author={Adamson, Jordan},
year={2025},
publisher={Bookdown},
url={https://jadamso.github.io/Rbooks/}
}
Please also report any errors or issues at https://github.com/Jadamso/Rbooks/issues.