A Practical Extension of Introductory Statistics in Psychology using R
This book aims to provide a practical extension of introductory statistics typically taught in psychology into the general linear model (GLM) using R.
Chapter 1 Introduction
Typically, introductory univariate statistics courses in psychology cover the following inferential analyses (give or take a few):
- One Sample t-test
- Dependent Samples t-test
- Independent Samples t-test
- One-Way Analysis of Variance (ANOVA)
- Factorial ANOVA
- Simple Linear Regression
These names are useful for quickly referring to a particular statistical analysis with others; however, thinking of these analyses as special cases of the GLM (i.e., ordinary least squares [OLS] regression) makes more advanced statistical techniques easier to understand. Given that, this book provides evidence, along with R code, that the aforementioned analyses can be performed within the GLM framework and yield identical answers. The GLM is not a new idea, but it is an idea that needs emphasizing.
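As a quick preview of that claim, here is a minimal sketch that runs an independent samples t-test the traditional way and again as a linear model. The dataset (`mtcars`, which ships with R) and the variables `mpg` and `am` are our own illustrative choices, not data used elsewhere in this book, and equal variances are assumed so that the two approaches test the same model.

```r
# Independent samples t-test vs. the same test as a linear model.
# mtcars, mpg (DV), and am (two-group IV) are illustrative choices only.

# Traditional approach: Student's t-test (equal variances assumed)
t_res <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

# GLM approach: regress the DV on the group indicator
fit <- lm(mpg ~ am, data = mtcars)
glm_row <- coef(summary(fit))["am", ]

# Put the two results side by side
comparison <- data.frame(
  approach = c("t.test", "lm"),
  abs_t    = c(abs(unname(t_res$statistic)), abs(unname(glm_row["t value"]))),
  df       = c(unname(t_res$parameter), df.residual(fit)),
  p_value  = c(t_res$p.value, unname(glm_row["Pr(>|t|)"]))
)
print(comparison)
```

The absolute value of t is compared because `t.test()` subtracts the second group mean from the first, whereas the slope in `lm()` is the second group mean minus the first, so the two t statistics can differ in sign but not in size.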
1.1 What exactly is the GLM?
The general linear model is a unified statistical framework that allows us to think about all of the above analyses (and much more) with a single concise formula:
\[Y = \beta X + \varepsilon\]
where \(Y\) represents a dependent variable (DV) or a set of DVs, \(\beta\) represents a set of regression coefficients (including the constant, or intercept), \(X\) represents an independent variable (IV) or a set of IVs, and \(\varepsilon\) represents the error around the model.
This should look familiar, as the formula is similar to the simple linear regression formula or the slope-intercept form learned in algebra (\(y = mx + b\)).
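To make the formula concrete, the sketch below builds \(X\) for a simple regression, estimates \(\beta\) by OLS using the normal equations, and compares the result with `lm()`. The data (`mtcars`) and variables (`mpg`, `wt`) are illustrative assumptions, and the code uses the usual matrix ordering in which \(X\) is multiplied by \(\beta\).

```r
# Simple regression of mpg on wt from the built-in mtcars data (illustrative choice)
y <- mtcars$mpg
X <- model.matrix(~ wt, data = mtcars)  # a column of 1s (the constant) plus the IV

# OLS estimate of beta via the normal equations: (X'X) beta = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Residuals are the estimate of the error term
e_hat <- y - X %*% beta_hat

# lm() fits the same model
fit <- lm(mpg ~ wt, data = mtcars)
print(cbind(by_hand = drop(beta_hat), lm = coef(fit)))
```

The column of 1s in `X` is what lets the constant be treated as just another regression coefficient.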
1.2 How will this book cover the analyses?
We will go over each of the typical introductory statistical analyses taught in psychology in the following steps:
1. State the null and research hypotheses
2. Perform the statistical analysis in R
3. Statistical decision (using \(\alpha = 0.05\), two-tailed, which is the arbitrary and ubiquitous convention in psychology) [1]
4. APA statement (bare minimum)
For each analysis, steps 1 and 2 will be performed with the traditional approach first, followed by the GLM approach. The goal is to show the similarities and differences in how the null and research hypotheses are stated under the two approaches in step 1, and to show that the two approaches give identical results in step 2.
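As a rough sketch of what these steps can look like in practice, the example below walks a one sample t-test through them. The data (the first group of R's built-in `sleep` dataset), the comparison value of 0, and the wording of the hypotheses and APA-style string are illustrative assumptions rather than the exact presentation used in later chapters.

```r
# Illustrative data: hours of extra sleep for the first group in the built-in sleep data
d <- subset(sleep, group == 1)

# Step 1: hypotheses
#   Traditional: H0: mu = 0;  H1: mu != 0
#   GLM:         H0: b0 = 0;  H1: b0 != 0, in the model extra = b0 + error

# Step 2: perform the analysis both ways
t_res <- t.test(d$extra, mu = 0)       # traditional one sample t-test
fit   <- lm(extra ~ 1, data = d)       # GLM: intercept-only model
b0    <- coef(summary(fit))["(Intercept)", ]

# Step 3: statistical decision at alpha = 0.05, two-tailed
alpha <- 0.05
reject_traditional <- t_res$p.value < alpha
reject_glm         <- unname(b0["Pr(>|t|)"]) < alpha

# Step 4: bare-minimum APA-style statement built from the GLM results
apa <- sprintf("t(%d) = %.2f, p = %.3f",
               as.integer(df.residual(fit)), b0["t value"], b0["Pr(>|t|)"])
print(apa)
```

Because the intercept of an intercept-only model is the sample mean, testing whether it differs from 0 is the same test as the one sample t-test against 0.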
1.3 What won’t this book cover?
This book won’t go into the assumptions of the inferential tests or exhaustively cover their respective formulas. It also won’t exhaustively review data manipulation, transformation, and visualization in R, as other books already do this well (e.g., R for Data Science by Wickham & Grolemund).
1.4 Why R?
We chose to use R to analyze the data and write this book because R forces us to write out how we performed our analyses (and to write helpful comments if we’re nice). This code allows us and others to re-analyze the data exactly as we have. This is not always the case with statistical software that relies on graphical user interfaces (GUIs).
1.5 Issues and Recommendations
Hopefully, there aren’t any bugs or errors in the book, but if you find any issues, please report them on our GitHub issues page. This book is also meant to be a live, open-source book that can be edited by us and others (you) in perpetuity by creating pull requests on our GitHub pulls page.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
[1] Good read on p-values from the American Statistical Association (ASA): https://www.tandfonline.com/doi/full/10.1080/00031305.2016.1154108