TQMP.org | last release
http://www.tqmp.org
The Quantitative Methods for Psychology Content, volume 13(1).
Adding bias to reduce variance in psychological results: A tutorial on penalized regression
http://www.tqmp.org/RegularArticles/vol13-1/p001/p001.pdf
http://dx.doi.org/10.20982/tqmp.13.1.p001
Helwig, Nathaniel E. -- Keywords: penalized least squares, ordinary least squares, ridge, lasso, elastic net. -- Abstract : Regression models are commonly used in psychological research. In most studies, regression coefficients are estimated via maximum likelihood (ML) estimation. It is well-known that ML estimates have desirable large sample properties, but are prone to overfitting in small to moderate sized samples. In this paper, we discuss the benefits of using penalized regression, which is a form of penalized likelihood (PL) estimation. Informally, PL estimation can be understood as introducing bias to estimators for the purpose of reducing their variance, with the ultimate goal of providing better solutions. We focus on the Gaussian regression model, where ML and PL estimation reduce to ordinary least squares (OLS) and penalized least squares (PLS) estimation, respectively. We cover classic OLS and stepwise regression, as well as three popular penalized regression approaches: ridge regression, the lasso, and the elastic net. We compare the different penalties (or biases) imposed by each method, and discuss the resulting features each penalty encourages in the solution. To demonstrate the methods, we use an example where the goal is to predict a student's math exam performance from 30 potential predictors. Using a step-by-step tutorial with R code, we demonstrate how to (i) load and prepare the data for analysis, (ii) fit the OLS, stepwise, ridge, lasso, and elastic net models, (iii) extract and compare the model fitting results, and (iv) evaluate the performance of each method. Our example reveals that penalized regression methods can produce more accurate and more interpretable results than the classic OLS and stepwise regression solutions.
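The bias-variance trade described in this abstract can be seen in the simplest possible case. The sketch below is not taken from the article (which uses R); it assumes a single centered predictor with no intercept, for which the ridge and lasso estimates have closed forms. The function names, the toy data, and the penalty values are all illustrative assumptions.

```python
import math

# Toy data (assumed for illustration): one centered predictor, no intercept.
X = [-2.0, -1.0, 0.0, 1.0, 2.0]
Y = [-3.9, -2.1, 0.2, 1.8, 4.0]


def _sums(x, y):
    """Return (sum of x*y, sum of x*x) for the one-predictor model."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy, sxx


def ols_slope(x, y):
    """Minimize sum((y - b*x)^2): the unpenalized estimate."""
    sxy, sxx = _sums(x, y)
    return sxy / sxx


def ridge_slope(x, y, lam):
    """Minimize sum((y - b*x)^2) + lam * b^2: shrinks b toward zero."""
    sxy, sxx = _sums(x, y)
    return sxy / (sxx + lam)


def lasso_slope(x, y, lam):
    """Minimize 0.5 * sum((y - b*x)^2) + lam * |b| via soft-thresholding.

    The estimate is exactly zero whenever |sum(x*y)| <= lam.
    """
    sxy, sxx = _sums(x, y)
    shrunk = max(abs(sxy) - lam, 0.0)
    return math.copysign(shrunk, sxy) / sxx


if __name__ == "__main__":
    print("OLS  :", ols_slope(X, Y))
    print("ridge:", ridge_slope(X, Y, lam=10.0))
    print("lasso:", lasso_slope(X, Y, lam=5.0))
    print("lasso (large penalty):", lasso_slope(X, Y, lam=25.0))
```

In this setting the ridge penalty shrinks the slope proportionally, whereas the lasso penalty subtracts a fixed amount and can set the slope exactly to zero, which is the feature that makes lasso solutions sparse.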
Graphical methods for multivariate linear models in psychological research: An R tutorial
http://www.tqmp.org/RegularArticles/vol13-1/p020/p020.pdf
http://dx.doi.org/10.20982/tqmp.13.1.p020
Friendly, Michael et al. -- Keywords: Canonical discriminant plots, Hypothesis-Error (HE) plots, MANOVA, MANCOVA, MMRA. -- Abstract : This paper is designed as a tutorial to highlight some recent developments for visualizing the relationships among response and predictor variables in multivariate linear models (MLMs), and implemented in convenient packages for R. These models include multivariate multiple regression analysis (MMRA), multivariate analysis of variance (MANOVA) and multivariate analysis of covariance (MANCOVA). The methods we describe go well beyond what can be understood and explained from simple univariate graphical methods for the separate response variables. We describe extensions of these methods for the case of more than just a few response variables, where the important relationships can be readily seen in the low-dimensional (2D) space that accounts for most of the relevant information. As befits the tutorial nature of this paper, we analyze some sample psychological research studies utilizing these multivariate designs, showing examples in R. In the process, we also take up several practical problems related to the assumptions of MLMs, and how these can be dealt with using graphical methods. Finally, we provide guidelines to aid researchers in conducting multivariate research, pertaining to the analysis, visualization, and reporting of such designs. The graphical and statistical methods described here are all freely available and implemented in the R packages candisc, car, heplots, and mvinfluence.
Interpretation of the point-biserial correlation coefficient in the context of a school examination
http://www.tqmp.org/RegularArticles/vol13-1/p046/p046.pdf
http://dx.doi.org/10.20982/tqmp.13.1.p046
LeBlanc, Vincent et al. -- Keywords: Tutorial, Examination, Assessment, Point-Biserial Correlation, Corrected Point-Biserial Correlation. -- Abstract : When creating an examination for educational purposes, one must make sure that the entire curriculum is covered. However, it can be difficult to do so without requiring a large number of questions. Also, it is desirable for those questions to be capable of accurately discriminating students who understand from those who do not, without being too difficult or easy. A practical way of identifying questions that fit these criteria is to study the point-biserial correlation ($r_{pb}$) between the success on a question and the number of questions correctly answered. This tutorial explains what the $r_{pb}$ is and how to use it through the interpretation of effect sizes and significance testing applied to real data. It also presents the corrected point-biserial correlation ($r_{pb}^*$), which is more suited for significance testing when one of the variables is partially determined by the other.
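As a rough illustration of the quantity discussed in this abstract, the sketch below computes the point-biserial correlation as a Pearson correlation between a 0/1 item score and the total score. For the correction it assumes the common item-rest approach (removing the item from the total before correlating); this may differ from the article's $r_{pb}^*$. The response matrix and function names are made up for the example.

```python
import math

# Hypothetical responses: rows are students, columns are questions (1 = correct).
RESPONSES = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(responses, item):
    """Correlate success on one item with the total number correct."""
    scores = [row[item] for row in responses]
    totals = [sum(row) for row in responses]
    return pearson(scores, totals)


def item_rest_correlation(responses, item):
    """Assumed correction: correlate the item with the total excluding it."""
    scores = [row[item] for row in responses]
    rest = [sum(row) - row[item] for row in responses]
    return pearson(scores, rest)


if __name__ == "__main__":
    print("r_pb (item 0):     ", round(point_biserial(RESPONSES, 0), 3))
    print("item-rest (item 0):", round(item_rest_correlation(RESPONSES, 0), 3))
```

Because the item contributes to its own total, the uncorrected value is inflated in this toy data; removing the item from the total lowers the correlation.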
ivote: A simple system to conduct polls and quiz in class settings
http://www.tqmp.org/RegularArticles/vol13-1/p057/p057.pdf
http://dx.doi.org/10.20982/tqmp.13.1.p057
T. Groulx, Julien et al. -- Abstract : Teaching tools are necessary to promote learning and encourage the understanding of course content. Among those tools, televoting is one by which students can anonymously answer a question presented by an instructor. Here, we present ivote, a small application that can be used to that end. ivote is free, does not require registration from either the professor or the students, and does not present advertisements. This software can be used in conjunction with PowerPoint with minimal programming; the votes can also be exported easily to SPSS, Excel, Mathematica or R. Students can cast their vote as long as they have internet access, either by entering a URL or by scanning a QR code.
Testing the number of required dimensions in exploratory factor analysis
http://www.tqmp.org/RegularArticles/vol13-1/p064/p064.pdf
http://dx.doi.org/10.20982/tqmp.13.1.p064
Achim, André -- Keywords: Exploratory Factor Analysis, Monte Carlo Comparisons, Number of Factors. -- Abstract : While maximum likelihood exploratory factor analysis (EFA) provides a statistical test that $k$ dimensions are sufficient to account for the observed correlations among a set of variables, determining the required number of factors in least-squares based EFA has essentially relied on heuristic procedures. Two methods, Revised Parallel Analysis (R-PA) and Comparison Data (CD), were recently proposed that generate surrogate data based on an increasing number of principal axis factors in order to compare their sequence of eigenvalues with that from the data. The latter should be unremarkable among the former if enough dimensions are included. While CD looks for a balance between efficiency and parsimony, R-PA strictly tests that $k$ dimensions are sufficient by ranking the next eigenvalue, i.e. at rank $k+1$, of the actual data among those from the surrogate data. Importing two features of CD into R-PA defines four variants that are here collectively termed Next Eigenvalue Sufficiency Tests (NESTs). Simulations implementing 144 sets of parameters, including correlated factors and presence of a doublet factor, show that all four NESTs largely outperform CD, the standard Parallel Analysis, the Minimum Average Partial method and even the maximum likelihood approach, in identifying the correct number of common factors. The recommended, most successful NEST variant is also the only one that never overestimates the correct number of dimensions beyond its nominal $\alpha$ level. This variant is made available as R and MATLAB code as well as a complement incorporated in a Microsoft Excel file.
An extension of within-subject confidence intervals to models with crossed random effects
http://www.tqmp.org/RegularArticles/vol13-1/p075/p075.pdf
http://dx.doi.org/10.20982/tqmp.13.1.p075
Politzer-Ahles, Stephen -- Keywords: confidence intervals, repeated measures, linear mixed effects, within-subjects confidence intervals, crossed random effects. -- Abstract : A common problem in displaying within-subject data is that of how to show confidence intervals that accurately reflect the pattern of significant differences between conditions. The Cousineau-Morey method (Cousineau, 2005; Morey, 2008) is a widely used solution to this issue; however, this method applies only to experimental designs with one repeated-measures factor (e.g., subjects). Many experimental designs in fields such as psycholinguistics and social psychology use crossed random effect designs where, e.g., there are repeated measures both for subjects and stimuli. For such designs, extant methods for showing within-subject intervals would require first aggregating over stimuli, and thus such intervals might be a less accurate reflection of the statistical significance patterns if the data are actually analyzed using a method that takes both random effects into account (e.g., linear mixed-effects models). The present paper proposes an extension of the method described above to address this problem; the proposal is to scale the data using a mixed-effects model, rather than using the means from each subject, and then calculate confidence intervals from the data thus scaled. Analysis of a sample of crossed random effect datasets reveals that intervals calculated using this method give a slightly more accurate reflection of the pattern of statistical significance in the between-condition differences.
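For orientation, the baseline that this article extends can be sketched in a few lines. The code below illustrates only the standard Cousineau-Morey normalization for a single repeated-measures factor, not the mixed-effects extension the abstract proposes. The data, function names, and layout (one row per subject, one column per condition) are assumptions made for the example.

```python
import math
import statistics

# Hypothetical data: rows are subjects, columns are within-subject conditions.
DATA = [
    [10.0, 12.0],
    [20.0, 23.0],
    [30.0, 31.0],
]


def normalize(data):
    """Cousineau step: remove each subject's mean, add back the grand mean."""
    grand = statistics.mean(v for row in data for v in row)
    return [[v - statistics.mean(row) + grand for v in row] for row in data]


def within_subject_se(data):
    """Per-condition standard errors from normalized data, with Morey's
    correction factor sqrt(J / (J - 1)) for J conditions."""
    n_subjects = len(data)
    n_conditions = len(data[0])
    correction = math.sqrt(n_conditions / (n_conditions - 1))
    columns = list(zip(*normalize(data)))
    return [
        correction * statistics.stdev(col) / math.sqrt(n_subjects)
        for col in columns
    ]


if __name__ == "__main__":
    raw_se = [statistics.stdev(c) / math.sqrt(len(c)) for c in zip(*DATA)]
    print("between-subject SE per condition:", raw_se)
    print("within-subject SE per condition: ", within_subject_se(DATA))
```

A confidence interval would multiply each standard error by the appropriate $t$ critical value; that step is left out here to keep the sketch free of external dependencies.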
The unweighted "harmonic mean" solution for unbalanced anova designs: A detailed argument
http://www.tqmp.org/RegularArticles/vol13-1/p095/p095.pdf
http://dx.doi.org/10.20982/tqmp.13.1.p095
Laurencelle, Louis -- Keywords: Anova, Unbalanced design, Harmonic mean solution. -- Abstract : The treatment of unbalanced designs in analysis of variance (anova) has a long and still controversial history, an issue being the choice between the so-called harmonic mean or unweighted solution and the classical weighted solution. We here argue in favour of the unweighted, i.e. equally weighted solution, based on the following reasons. The classical solution gives more weight to the means obtained from a more numerous group of data, thus inducing a positive bias in the computation of the between-group mean square, irrespective of the groups' effect sizes. Indeed, this differential weighing is at variance with the determination and handling of effect sizes, whose values are kept free of the various group sizes implied, so that the final 'weighted' $F$ statistic cannot stand for a truthful reflection of those. Besides, the oft-quoted argument around the demographic representativeness of the various groups compared is specious in the context of most anova applications, the purpose of anova being to compare groups/conditions one to the other, whatever their sample sizes. Finally, in the cases of two- or multi-way designs, the weighted solution precludes the calculation of truly orthogonal and additive variance components, the 'linear regression' alternatives for this problem being complex and essentially arbitrary. The harmonic mean solution preserves orthogonality and additivity in the variance decomposition for multi-dimensional designs, is congruent with effect sizes and entails no differential bias in the calculation of the $F$ test whatever the sample sizes. On the other hand, it suffers from a positive bias in the $F$'s significance, a bias negligible for mildly unbalanced group sizes and aptly corrected by Rankin's (1974) modified degrees of freedom.
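The contrast between the two solutions can be made concrete for a one-way design. The sketch below follows the textbook form of the unweighted-means analysis, which is assumed here rather than taken from the article: the between-group mean square uses the harmonic mean of the group sizes and the unweighted mean of the group means, while the within-group mean square is the usual pooled one. The data and function names are illustrative, and Rankin's degrees-of-freedom correction is not implemented.

```python
import statistics

# Hypothetical unbalanced one-way design with group sizes 2, 3 and 6.
GROUPS = [
    [2.0, 4.0],
    [4.0, 6.0, 8.0],
    [5.0, 7.0, 9.0, 7.0, 7.0, 7.0],
]


def ms_within(groups):
    """Pooled within-group mean square (same for both solutions)."""
    ss = sum(sum((v - statistics.mean(g)) ** 2 for v in g) for g in groups)
    df = sum(len(g) for g in groups) - len(groups)
    return ss / df


def ms_between_weighted(groups):
    """Classical solution: each group mean is weighted by its sample size."""
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    return ss / (len(groups) - 1)


def ms_between_unweighted(groups):
    """Harmonic mean solution: group means count equally, scaled by the
    harmonic mean of the group sizes."""
    means = [statistics.mean(g) for g in groups]
    n_h = statistics.harmonic_mean([len(g) for g in groups])
    grand = statistics.mean(means)
    ss = sum((m - grand) ** 2 for m in means)
    return n_h * ss / (len(groups) - 1)


if __name__ == "__main__":
    msw = ms_within(GROUPS)
    print("F (weighted):  ", ms_between_weighted(GROUPS) / msw)
    print("F (unweighted):", ms_between_unweighted(GROUPS) / msw)
```

With equal group sizes the two between-group mean squares coincide, so any difference between the two $F$ ratios comes entirely from the imbalance.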
Digging in the Dark Triad: A Replication of "Machiavellianism and Dating: Deception and Intimacy"
http://www.tqmp.org/ReplicationStudies/vol12-3/r006/r006.pdf
http://dx.doi.org/10.20982/tqmp.12.3.r006
Durand, Guillaume -- Keywords: Replication, Machiavellianism, Dating, Dark Triad, Deception. -- Abstract : The present study is a replication and an extension of Dussault, Hojjat, and Boone (2013). Machiavellianism and dating: Deception and intimacy. Social Behavior and Personality: an international journal, 41(2), 283-294. The results support the positive association between Machiavellianism and the use of deceptive dating strategies, but fail to detect a correlation between Machiavellianism and perceived intimacy. The extension to the Dark Triad reveals that psychopathy is a stronger predictor of deceptive dating strategies than Machiavellianism, and supports a relationship between higher degrees of narcissism and history of intimate behaviors.