Robust scientific knowledge is contingent upon replication of original findings. However, replicating researchers are constrained by resources and will almost always have to choose one replication effort to focus on from a set of potential candidates. To select a candidate efficiently in these cases, we need methods for deciding which of all the candidates considered would be the most useful to replicate, given some overall goal researchers wish to achieve. In this article we assume that the overall goal is to maximize the utility gained by conducting the replication study. We then propose a general rule for study selection in replication research based on the replication value of the set of claims considered for replication. The replication value of a claim is defined as the maximum expected utility we could gain by conducting a replication of the claim, and is a function of (a) the value of being certain about the claim and (b) the uncertainty about the claim based on current evidence. We formalize this definition in terms of a causal decision model, utilizing concepts from decision theory and causal graph modeling. We discuss the validity of using replication value as a measure of expected utility gain, and we suggest approaches for deriving quantitative estimates of replication value. Our goal in this article is not to define concrete guidelines for study selection, but to provide the necessary theoretical foundations on which such concrete guidelines could be built.

Translational Abstract: Replication (redoing a study using the same procedures) is an important part of checking the robustness of claims in the psychological literature.
The practice of replicating original studies has been woefully devalued for many years, but this is now changing. Recent calls for improving the quality of research in psychology have generated a surge of interest in funding, conducting, and publishing replication studies. Because many studies have never been replicated, and researchers have limited time and money to perform replication studies, researchers must decide which studies are the most important to replicate. This way, scientists learn the most given their limited resources. In this article, we lay out what it means to ask which study is the most important one to replicate, and we propose a general decision rule for picking a study to replicate. That rule depends on a concept we call replication value. Replication value is a function of the importance of the study and of how uncertain we are about its findings. In this article we explain how researchers can think precisely about the value of replication studies. We then discuss when and how it makes sense to use replication value as a measure of how valuable a replication study would be, and we discuss factors that funders, journals, or scientists could consider when determining how valuable a replication study is.
Prediction rule ensembles (PREs) are a relatively new statistical learning method that aims to strike a balance between predictive performance and interpretability. Starting from a decision tree ensemble, such as a boosted tree ensemble or a random forest, PREs retain a small subset of tree nodes in the final predictive model. These nodes can be written as simple rules of the form if [condition] then [prediction]. As a result, PREs are often much less complex than full decision tree ensembles, while they have been found to provide similar predictive performance in many situations. The current article introduces the methodology and shows how PREs can be fitted using the R package pre through several real-data examples from psychological research. The examples also illustrate a number of features of package pre that may be particularly useful for applications in psychology: support for categorical, multivariate, and count responses; application of (non)negativity constraints; inclusion of confirmatory rules; and standardized variable importance measures.

Translational Abstract: This manuscript presents prediction rule ensemble (PRE) methodology. This is a relatively new nonparametric exploratory regression method, which has been found to provide predictive performance close to that of modern machine-learning algorithms such as random forests, while the fitted model consists of a small number of rules and predictor variables. These rules are statements of the form if [condition] then [prediction], which are relatively easy for human decision makers (e.g., psychologists, medical doctors) to interpret.
These rules can be used for identifying persons or subgroups at higher or lower risk for a given disorder, for example: if [gender = male & age > 55 & symptom A is present] then [log-odds of having the disorder + 5]. The current paper introduces PRE methodology and shows how PREs can be fitted using the R package pre and how the results can be interpreted. This is shown through three real-data examples from psychological research: predicting chronic depressive trajectories, predicting academic achievement among first-year psychology students, and predicting last-week substance use in a randomized clinical trial. The examples also serve to illustrate features of package pre that may be particularly useful for applications in psychology, for example, its support for categorical, multivariate, and count responses, and the possibility of identifying only high- or low-risk subgroups.
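To give a sense of the workflow the abstract describes, a minimal sketch of fitting a PRE with the CRAN package pre follows. The airquality data and all settings here are generic illustration choices, not taken from the article's three examples:

```r
# Minimal sketch, assuming the CRAN release of the pre package.
# The airquality dataset is a placeholder, not one of the article's examples.
library(pre)

# pre() does not accept missing predictor values, so keep complete cases only
airq <- airquality[complete.cases(airquality), ]

set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airq)  # rules of the form: if [condition] then [prediction]

print(airq.ens)       # the selected rules and their estimated coefficients
importance(airq.ens)  # standardized variable importance measures
```

The printed rules can then be read off directly, much like the if-then example above, which is the interpretability advantage the article emphasizes.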
Recent years have seen an emergence of network modeling applied to moods, attitudes, and problems in the realm of psychology. In this framework, psychological variables are understood to directly affect each other rather than being caused by an unobserved latent entity. In this tutorial, we introduce the reader to estimating the most popular network model for psychological data: the partial correlation network. We describe how regularization techniques can be used to efficiently estimate a parsimonious and interpretable network structure in psychological data. We show how to perform these analyses in R and demonstrate the method in an empirical example on posttraumatic stress disorder data. In addition, we discuss the effect of the hyperparameter that needs to be manually set by the researcher, how to handle non-normal data, and how to determine the required sample size for a network analysis, and we provide a checklist with potential solutions for problems that can arise when estimating regularized partial correlation networks. (PsycINFO Database Record (c) 2018 APA, all rights reserved)
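The regularized estimation this tutorial covers can be sketched in a few lines of R. The example below uses EBICglasso from the qgraph package on a dataset shipped with that package; the data and settings are generic placeholders, not the tutorial's posttraumatic stress disorder example:

```r
# Minimal sketch, assuming the CRAN release of qgraph.
# big5 is an example item-score matrix shipped with qgraph, used here only
# as a placeholder dataset.
library(qgraph)

data(big5)
S <- cor(big5)

# gamma is the EBIC hyperparameter the tutorial discusses; 0.5 is the
# conventional default, and lower values yield denser networks
net <- EBICglasso(S, n = nrow(big5), gamma = 0.5)

qgraph(net, layout = "spring")  # plot the estimated partial correlation network
```

The resulting matrix `net` contains the regularized partial correlations; zero entries correspond to absent edges, which is what makes the estimated network parsimonious and interpretable.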