
IZA DP No. 8583


Statistical Power of Within and Between-Subjects

Designs in Economic Experiments

Charles Bellemare


Luc Bissonnette

Sabine Kröger

October 2014




Charles Bellemare

Laval University and IZA

Luc Bissonnette

Laval University

Sabine Kröger

Laval University and IZA

Discussion Paper No. 8583
October 2014

IZA
P.O. Box 7240
53072 Bonn
Germany

Phone: +49-228-3894-0
Fax: +49-228-3894-180
E-mail: iza@iza.org

Any opinions expressed here are those of the author(s) and not those of IZA. Research published in this series may include views on policy, but the institute itself takes no institutional policy positions.

The IZA research network is committed to the IZA Guiding Principles of Research Integrity.

The Institute for the Study of Labor (IZA) in Bonn is a local and virtual international research center and a place of communication between science, politics and business. IZA is an independent nonprofit organization supported by Deutsche Post Foundation. The center is associated with the University of Bonn and offers a stimulating research environment through its international network, workshops and conferences, data service, project support, research visits and doctoral program. IZA engages in (i) original and internationally competitive research in all fields of labor economics, (ii) development of policy concepts, and (iii) dissemination of research results and concepts to the interested public.

IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available directly from the author.


Abstract

This paper discusses the choice of the number of participants for within-subjects (WS) designs and between-subjects (BS) designs based on simulations of statistical power allowing for different numbers of experimental periods. We illustrate the usefulness of the approach in the context of field experiments on gift exchange. Our results suggest that a BS design requires between 4 and 8 times more subjects than a WS design to reach an acceptable level of statistical power. Moreover, the predicted minimal sample sizes required to correctly detect a treatment effect with a probability of 80% greatly exceed the sizes currently used in the literature. Our results suggest that adding experimental periods to an experiment can substantially increase the statistical power of a WS design, but has very little effect on the statistical power of a BS design. Finally, we discuss issues relating to numerical computation and present the powerBBK package programmed for STATA. This package allows users to conduct their own analysis of power for the different designs (WS and BS), conditional on user-specified experimental parameters (true effect size, sample size, number of periods, noise levels for control and treatment, error distributions), statistical tests (parametric and nonparametric), and estimation methods (linear regression, binary choice models (probit and logit), and censored regression models (tobit)).

JEL Classification: C8, C9, D03

Keywords: within-subjects design, between-subjects design, sample sizes, statistical power, experiments

Corresponding author:

Sabine Kröger
Laval University
Department of Economics
Pavillon J.A.DeSève
Québec G1V 0A6
Canada
E-mail: sabine.kroger@ecn.ulaval.ca

* Part of the paper was written at the Institute of Finance at the School of Business and Economics at Humboldt Universität zu Berlin and at the Department of Economics at Zurich University. We thank both institutions for their hospitality. We thank Nicolas Couët for his valuable research assistance. We are grateful to participants at the ASFEE conference in Montpellier (2012), the ESA meeting in New York (2012), the IMEBE in Madrid (2013), and seminar participants at the Department of Economics at Zurich University (2013) and at Technische Universität Berlin (2013).

1 Introduction

Researchers planning an experimental study have to decide on the number of subjects, treatments, and experimental periods to employ, and whether to use a within-subjects or a between-subjects design. All these decisions require a careful balancing between the chance of finding an existing effect and the precision with which this effect can be measured.[1] For example, subjects taking part in a within-subjects (WS hereafter) design are exposed to several treatment conditions while subjects in a between-subjects (BS hereafter) design are exposed to only one. WS designs thus offer the possibility to test theories at the individual level and can boost statistical power, making it more likely to correctly reject a null hypothesis in favor of an alternative hypothesis. They can, however, also generate spurious treatment effects, notably order effects. BS designs, on the other hand, can attenuate order effects but may have lower statistical power, as we illustrate in this paper. Charness, Gneezy, and Kuhn (2012) summarize the tradeoff between both designs by saying: “Choosing a design means weighing concerns over obtaining potentially spurious effects against using less powerful tests” (p. 2). In addition, the number of subjects and the number of periods (McKenzie, 2012) affect the statistical power of a study. As a result, understanding the statistical power of WS and BS designs in relation to sample size and periods is an essential step in the process of designing economic experiments.

[1] The former is referred to in the literature as the power of a study, that is, the probability of rejecting the null hypothesis when it is in fact false, in other words of not committing a Type II error. The latter refers to the width of the confidence interval, i.e., how confident we can be of not committing a Type I error, that is, of not rejecting the null hypothesis when it is in fact true.

More generally, recent work has raised awareness about the relationship between the power of statistical tests and optimal experimental designs (e.g., List, Sadoff, and Wagner (2011); Hao and Houser (forthcoming)). Yet, statistical power remains largely undiscussed and unreported in published experimental economic research. Zhang and Ortmann (2013), for example, reviewed all articles published in Experimental Economics between 2010 and 2012 and failed to find a single study discussing optimal sample size in relation to statistical power.[2] We conjecture that this can partly be explained by the incompatibility of existing power formulas, derived under very specific conditions, with experimental data. The formulas are not adapted to the diversity of experimental data (with WS and BS designs; discrete, continuous, and censored outcomes; multiple periods; non-normal errors), nor are they available for the variety of statistical tests (nonparametric and parametric) used in the literature. This incompatibility poses challenges to experimentalists interested in predicting power for the designs they consider. As a result, researchers may unknowingly conduct underpowered experiments, which waste scarce resources and potentially guide research in unwanted directions.[3]

[2] The practice of not reporting power or discussing optimal sample sizes is not specific to experimental economics, and applies more widely to other fields such as education (Brewer and Owen, 1973), marketing (Sawyer and Ball, 1981), and various sub-fields in psychology (Mone, Mueller, and Mauland, 1996; Cohen, 1962; Chase and Chase, 1976; Sedlmeier and Gigerenzer, 1989; Rossi, 1990).

[3] Long and Lang (1992) reviewed 276 articles (not necessarily experimental) published in top journals in economics and proposed a method to estimate the share of papers falsely failing to reject the null hypothesis. Their estimates suggest that all non-rejection results in their sample of articles are false, a consequence of low statistical power.

The main objective of this paper is to provide experimental economists with a simple unified framework to compute the ex-ante power of an experimental design (WS or BS) using simulation methods. Simulation methods are general enough to be used in conjunction with the variety of statistical tests (nonparametric and parametric), estimation methods (for linear and non-linear models), and sample sizes used in experimental economics. They can also easily handle settings with non-normal errors. Conversely, closed-form expressions for statistical power are typically derived for simple statistical models and tests and tend to be valid only under specific conditions (e.g., large sample sizes, normally distributed errors). Under other conditions, power computed from closed-form expressions may overestimate the level of power in finite samples (see, e.g., Feiveson, 2002). The simulation approach to power computation is simple and well known in applied statistics and can help researchers determine the number of subjects, the number of periods, and the design (WS or BS) required to reach an acceptable level of statistical power.

In this paper we focus on simulating the statistical power of a test for the null hypothesis of no treatment effect against a specific alternative.[4] For our simulations, we consider a population of agents whose outcome variable is generated using a possibly non-linear panel data model which depends on a binary treatment variable, individual unobserved heterogeneity, and idiosyncratic shocks. From this population, researchers sample subjects and assign them to either treatment or control over several periods. In this setup a BS design assigns each subject to either the treatment or the control condition for all periods, while a WS design assigns each subject to both the treatment and the control condition for at least one period each. We look at both balanced and unbalanced WS designs: subjects in a balanced WS design are observed for the same number of periods under both treatment conditions, while subjects in an unbalanced design are observed for different numbers of periods under the two conditions. Additionally, we look at the relationship between the statistical power of both designs and the number of experimental periods. All other aspects of the model (treatment effect sizes and noise parameters) require calibration using data from existing economic experiments.

[4] Precise interpretation of the null hypothesis will depend on the test used.
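The setup described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation (which is the powerBBK package for STATA): the function name `simulate_power`, the linear model with normally distributed subject effects and shocks, the default parameter values, and the large-sample z-test are all assumptions chosen to keep the example short.

```python
import random
from statistics import NormalDist, fmean, variance


def simulate_power(design, n_subjects, n_periods, effect=0.5,
                   sd_subject=1.0, sd_noise=1.0, alpha=0.05,
                   n_sims=500, seed=0):
    """Monte Carlo power of a 'BS' or balanced 'WS' design (hypothetical helper).

    Assumed model: y_it = effect * d_it + mu_i + eps_it, with normal mu_i
    (unobserved heterogeneity) and normal eps_it (idiosyncratic shocks).
    """
    if design not in ("BS", "WS"):
        raise ValueError("design must be 'BS' or 'WS'")
    if n_subjects % 2 or n_periods % 2:
        raise ValueError("this sketch needs even n_subjects and n_periods")
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal critical value
    half_n, half_t = n_subjects // 2, n_periods // 2
    rejections = 0
    for _ in range(n_sims):
        control, treated, diffs = [], [], []
        for i in range(n_subjects):
            mu_i = rng.gauss(0.0, sd_subject)
            if design == "BS":
                # one condition for all periods; second half of subjects is treated
                d = [int(i >= half_n)] * n_periods
            else:
                # balanced WS: control in the first half of periods, then treatment
                d = [0] * half_t + [1] * half_t
            y = [effect * d_t + mu_i + rng.gauss(0.0, sd_noise) for d_t in d]
            if design == "BS":
                (treated if d[0] else control).append(fmean(y))
            else:
                diffs.append(fmean(y[half_t:]) - fmean(y[:half_t]))
        if design == "BS":
            # difference in subject-level means between the two groups
            est = fmean(treated) - fmean(control)
            se = (variance(treated) / half_n + variance(control) / half_n) ** 0.5
        else:
            # mean within-subject difference; mu_i cancels out
            est = fmean(diffs)
            se = (variance(diffs) / n_subjects) ** 0.5
        rejections += abs(est / se) > z_crit
    return rejections / n_sims
```

Calling `simulate_power("WS", 40, 2)` and `simulate_power("BS", 40, 2)` gives the share of simulated experiments in which the null of no treatment effect is rejected under each design. Swapping in another test, error distribution, or outcome model only requires changing the data-generating lines and the test statistic, which is the flexibility the simulation approach is meant to provide.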

We illustrate the approach in the context of gift exchange experiments and calibrate our model using data from two existing field experiments. We find that the BS design requires approximately 4 times more subjects than the WS design to reach acceptable levels of power (80%) when the number of experimental periods is small (2 periods). Power of the WS design is found to increase substantially with the number of experimental periods. Power of the BS design is found to be less sensitive to an increase in experimental periods. As a result, the BS design requires approximately 12 times more subjects than a WS design when the number of experimental periods is larger (6 periods). We find that these results are relatively robust to the true treatment effect sizes. Increasing the noise level requires a larger sample size in both designs, but the ratios become smaller: the BS design then requires approximately 3 times more observations with a low number of periods and 6 times more when the number of experimental periods is larger.
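A back-of-the-envelope calculation helps to see why the ratio of required subjects grows with the number of periods and shrinks with the noise level. The derivation below is an illustration under assumptions that are not taken from the paper: a linear model $y_{it} = \alpha + \delta d_{it} + \mu_i + \varepsilon_{it}$ with $\mathrm{Var}(\mu_i) = \sigma_\mu^2$ and $\mathrm{Var}(\varepsilon_{it}) = \sigma_\varepsilon^2$ in both conditions, $N$ subjects, $T$ periods, equal group sizes in the BS design, and $T/2$ periods per condition in the WS design. The figures reported in the text come from simulations with calibrated parameters, not from this formula.

```latex
% BS: difference between the two group means of subject-level averages
\mathrm{Var}\bigl(\hat{\delta}_{BS}\bigr)
  = \frac{4}{N_{BS}} \left( \sigma_\mu^2 + \frac{\sigma_\varepsilon^2}{T} \right)

% WS: mean of within-subject differences, in which \mu_i cancels
\mathrm{Var}\bigl(\hat{\delta}_{WS}\bigr)
  = \frac{4\,\sigma_\varepsilon^2}{N_{WS}\,T}

% Equating the two variances gives the ratio of subjects needed
\frac{N_{BS}}{N_{WS}}
  = \frac{\sigma_\mu^2 + \sigma_\varepsilon^2 / T}{\sigma_\varepsilon^2 / T}
  = 1 + T\,\frac{\sigma_\mu^2}{\sigma_\varepsilon^2}
```

Under these assumptions the ratio increases linearly in $T$ and falls as the idiosyncratic noise $\sigma_\varepsilon^2$ rises relative to the individual heterogeneity $\sigma_\mu^2$, which is the same qualitative pattern as in the simulation results.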

Our analysis suggests that the number of subjects needed to reach an acceptable level of power in this research area can be large. For example, we find that the minimal sample sizes required to reach a power of 80% with a BS design range from 232 to 1054 subjects under our low noise scenario and from 458 to 2200 subjects under our high noise scenario. Corresponding sample sizes with a WS design range from 20 to 218 subjects under our low noise scenario and from 66 to 738 subjects under our high noise scenario.

Finally, we present the powerBBK package for STATA that we developed to simulate power with the needs of economists in mind. This package allows users to simulate the minimal sample size necessary to reach a user-specified level of statistical power, or to compute the statistical power of a particular design given information on sample size, variances, and minimal detectable effect size. The package can handle panel data and can be used for nonparametric tests (e.g., Wilcoxon Sign test or Mann-Whitney-U test) and parametric tests. It can also be used in the context of linear regression models with or without normal errors, binary response models (probit and logit), and censored regression models (tobit).
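The first of the two tasks, finding the minimal sample size that reaches a target power, can be sketched as a search over simulated power. The Python below does not reproduce powerBBK or its syntax. The names `ws_power` and `minimal_sample_size`, the balanced WS design with normal shocks, the z-test approximation, and the simple upward scan are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean, variance


def ws_power(n_subjects, effect=0.5, sd_noise=1.0, n_periods=2,
             alpha=0.05, n_sims=400, seed=1):
    """Simulated power of a balanced WS design (hypothetical helper)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half = n_periods // 2
    rejections = 0
    for _ in range(n_sims):
        diffs = []
        for _ in range(n_subjects):
            # the subject effect cancels in within-subject differences,
            # so only the period-level shocks need to be drawn
            treat = fmean([effect + rng.gauss(0.0, sd_noise) for _ in range(half)])
            ctrl = fmean([rng.gauss(0.0, sd_noise) for _ in range(half)])
            diffs.append(treat - ctrl)
        z = fmean(diffs) / (variance(diffs) / n_subjects) ** 0.5
        rejections += abs(z) > z_crit
    return rejections / n_sims


def minimal_sample_size(power_fn, target=0.80, start=4, step=2, max_n=2000):
    """Smallest n on the grid start, start+step, ... with power_fn(n) >= target.

    Returns None if the target is not reached by max_n. Simulated power is
    noisy, so the answer is only accurate up to Monte Carlo error.
    """
    for n in range(start, max_n + 1, step):
        if power_fn(n) >= target:
            return n
    return None
```

A call such as `minimal_sample_size(lambda n: ws_power(n, effect=0.5))` scans sample sizes upward until the simulated power first reaches 80%. A linear scan is used instead of bisection because simulated power need not be exactly monotone in the sample size. Raising `n_sims` reduces that noise at the cost of run time.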

The paper is organized as follows. Section 2 presents a brief survey of the experimental parameters used in recent articles published in Experimental Economics, the top field journal for experimental work in economics, to illustrate typical sample sizes and design choices employed in this field. Section 3 discusses the simulation of statistical power and introduces the powerBBK package. Section 4 presents our application to gift exchange. Section 5 concludes.

2 Brief survey of experimental designs in Experimental Economics

In this section we present a brief analysis of the sample sizes and design choices of all papers published in Experimental Economics in volumes 15 and 16 (2012 and 2013). We focus on three aspects affecting statistical power: the choice of experimental design (WS vs. BS), the average number of subjects per treatment, and the distribution of the subjects across treatments. In the two volumes we surveyed, a total of 71 papers were published. Our analysis focuses on papers with original data that provided sufficient information to determine the number of subjects in each treatment, leaving us with a sample of 58 papers (36 in 2012, 22 in 2013).

We first classify the experimental design in these studies as either a WS or a BS design. Where elements of both designs were applied, we classified the paper as a mixed design. The first two columns of Table 2 present the frequency of each type of design in each year.

We see from this table that the majority of the papers (41 out of 58) used a BS design.
