Assessment in Statistics Courses 1


Assessment in Statistics Courses: More Than a Tool for Evaluation

Anthony J. Onwuegbuzie

Howard University

Nancy L. Leech

University of Colorado at Denver

Correspondence should be addressed to Anthony J. Onwuegbuzie, Department of Human

Development and Psychoeducational Studies, School of Education, Howard University,

2441 Fourth Street, NW, Washington, DC 20059, or E-Mail: (tonyonwuegbuzie@aol.com)

Abstract

The current assessment reform movement in statistics encourages instructors to think more broadly about cognitive measures which assess student learning. In response, statistics instructors have begun incorporating innovative methods of assessment into their courses, the most common of these procedures being authentic assessment, performance assessment, and portfolio assessment. This paper will discuss areas to consider for assessment, problems with typical assessments, and statistical authenticity for understanding student learning.

Assessment in Statistics Courses: More Than a Tool for Evaluation

As stated in the May 2000 edition of the Educational Researcher, the theme of the American Educational Research Association (AERA) 2001 annual meeting was “What we know and how we know it” (AERA, 2000, p. 27). Moreover, AERA called for “penetrating and weighty discussions around issues of research methodologies, rigor, standards—within every research paradigm” (AERA, 2000, p. 27). As the annual meeting theme suggests, discussions about epistemological, ontological, and axiological underpinnings of educational research are paramount. Nowhere is such dialogue as important as in the field of statistics. This importance stems from the fact that virtually every graduate student enrolled in programs representing the field of education is required to take at least one statistics and/or quantitative-based research methodology course (Mundfrom, Shaw, Thomas, Young, & Moore, 1998).

Unfortunately, for many of these students, statistics is one of the most difficult courses in their programs of study (Schacht & Stewart, 1990). Additionally, research indicates that many college students experience high levels of statistics anxiety when confronted with statistical ideas, problems, or issues; instructional situations; or evaluative situations (Feinberg & Halperin, 1978; Onwuegbuzie & Daley, 1996; Onwuegbuzie & Seaman, 1995; Roberts & Bilderback, 1980; Zeidner, 1991). The levels of statistics anxiety experienced by as many as 80% of students (Onwuegbuzie, 1998) can be so great that undertaking a statistics class is regarded by many as extremely negative and, perhaps more importantly, as a major threat to the attainment of their degrees. In fact, as a result of anxiety, students often delay enrolling in statistics courses for as long as possible, sometimes waiting until the final semester of their degree programs (Onwuegbuzie, 1997a, 1997b; Roberts & Bilderback, 1980). Moreover, many students do not regard statistics as a relevant or important component of their degree programs, but merely as a pervasive obstacle that they must overcome in order to graduate (Gal & Ginsburg, 1994). This appears to be the case for both undergraduate and graduate students.

Students who view statistics classes as obstacle courses tend to exhibit external loci of control, coupled with overwhelming fear of failing these courses (Onwuegbuzie, DaRos, & Ryan, 1997). Indeed, using phenomenological techniques, Onwuegbuzie et al. (1997) found that failure anxiety is extremely prevalent among students enrolled in statistics classes. According to these researchers, failure anxiety comprises the following three dimensions: study-related anxiety, test anxiety, and grade anxiety.

Study-related anxiety involves anxiety experienced when preparing for a test. Test anxiety pertains to anxiety experienced while taking a statistics test. Finally, grade anxiety refers to the anxiety that arises from students’ expectations of their final grades. These expectations often are incongruent with reality. For some students, the expectation may be too high, whereas for others, it may be too low. In either case, it can be anxiety-inducing.

Students with one or more of these components of failure anxiety, when compared to their less-anxious counterparts, seem to obsess over the assessment measures used by statistics instructors (Hubbard, 1997). In particular, these students tend to be preoccupied with past or upcoming in-class examinations (Onwuegbuzie et al., 1997). Consistent with this finding, using the Statistical Anxiety Rating Scale (STARS) created by Cruise and Wilkins (1980), Onwuegbuzie (1998) found students

dimensions of the STARS. All the effect sizes, as measured by Cohen’s (1988) d, corresponding to these comparisons involving test and class anxiety were greater than .60.
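For readers less familiar with the effect size index mentioned above, the following is a minimal sketch of how Cohen's d might be computed for two independent groups using the pooled standard deviation. The function name `cohens_d` and the two sets of ratings are hypothetical, invented purely for illustration; they are not data from the studies cited. For orientation, Cohen's (1988) conventional benchmarks are .20 (small), .50 (medium), and .80 (large), so values above .60 lie between the medium and large benchmarks.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical anxiety ratings (1 = low, 5 = high) for two groups of students.
test_anxiety_ratings = [4, 5, 3, 4, 5]
other_dimension_ratings = [3, 4, 2, 3, 4]

d = cohens_d(test_anxiety_ratings, other_dimension_ratings)
print(round(d, 2))
```

The pooled-variance form is only one of several ways to standardize a mean difference; a paired or repeated-measures comparison would call for a different denominator.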

Disturbingly, not only has statistics anxiety been found to be related negatively to statistics achievement (Elmore, Lewis, & Bay, 1993; Lalonde & Gardner, 1993; Onwuegbuzie & Seaman, 1995; Zeidner, 1991), but this construct has been reported to be the best predictor of achievement in research methodology (Onwuegbuzie, Slate, Paterson, Watson, & Schwartz, 2000) and statistics (Fitzgerald, Jurs, & Hudson, 1996) courses. Most recently, using path analytical techniques, Onwuegbuzie (2000) found that statistics anxiety, alongside achievement expectation, played a central role in the prediction of performance in statistics courses, mediating the relationship between statistics achievement and the following variables: research anxiety, study habits, course load, and the number of statistics courses taken. Moreover, a causal link between statistics anxiety and course achievement has been documented using an experimental design (Onwuegbuzie & Seaman, 1995). Further, again in experimental work, students with poor examination-taking coping skills have been found to attain lower levels of performance on timed statistics examinations than do students with adequate coping skills (Onwuegbuzie & Daley, 1996).

The fact that high levels of underachievement and test anxiety prevail in statistics courses has led to calls for reform in the ways in which students are assessed in these classes (Gal & Ginsburg, 1994). Interestingly, until recently, many statistics instructors thought of assessment only in terms of testing and grading (Garfield, 1994). Indeed, because learning statistics typically was viewed as mastering a specific set of skills,

tests of computational skills and rote memorization (Hawkins, Jolliffe, & Glickman, 1992). As such, items on these tests tended to examine skills in isolation from a real-life problem context and did not necessarily assess whether students fully understood statistical concepts, were able to integrate statistical knowledge to solve a novel problem, were able to communicate statistical findings adequately, or were able to communicate effectively utilizing statistical terminology (Garfield, 1994). Moreover, some students who produced a correct response to an item on these traditional statistics tests often did not understand this solution or the underlying question behind it (Jolliffe, 1991). Yet, as noted by Onwuegbuzie (2000), the purpose of assessment should be multifold, including the following: (1) providing information that will facilitate decisions regarding the improvement of instruction; (2) motivating and helping students to structure their learning endeavors; (3) providing individual information to students about the extent to which they are mastering the material covered; (4) reinforcing learning by providing students with indicators of what aspects of the curriculum they have not yet mastered, and on which they should focus; (5) informing instructors about how well the classes appear to understand particular topics and what topics should be re-introduced; (6) providing diagnostic information to instructors about individual students’ strengths and weaknesses in understanding new material; and (7) providing an overall indicator of students’ performance levels (Busk, 1998; Garfield, 1994; National Council of Teachers of Mathematics [NCTM],

summary statistic is unable to inform students as to what aspects of the curriculum they have not yet mastered, nor, in the absence of a thorough item analysis, does such a statistic inform the instructor of students’ areas of weakness. Moreover, as the goals and objectives for the teaching of educational statistics continue to evolve as we progress through the 21st century, traditional assessments are more apt to be misaligned with desired student outcomes.
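The contrast between a single summary statistic and an item analysis can be illustrated with a minimal sketch. The example below computes one common item-level index, item difficulty (the proportion of students answering each item correctly), alongside the class mean total score. The function name `item_difficulty` and the response matrix are hypothetical, invented solely for illustration, and a thorough item analysis would typically examine more than this one index.

```python
def item_difficulty(responses):
    """Proportion of students answering each item correctly.

    `responses` holds one list per student, with 1 for a correct
    answer and 0 for an incorrect answer on each item.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]


# Hypothetical scores for four students on a three-item quiz.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]

# The overall summary: mean total score across students.
class_mean = sum(sum(student) for student in responses) / len(responses)

# The item-level view: which items the class found hardest.
difficulties = item_difficulty(responses)
print(class_mean, difficulties)
```

In this invented example the class mean alone gives no hint that nearly all of the lost points come from the third item, which is the kind of diagnostic information the passage above describes as unavailable from a summary statistic.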

Rather, as envisioned and advocated by the National Council of Teachers of Mathematics (NCTM), the measurement of statistics performance should be an active process that yields information, on an ongoing basis, about students’ progress towards the achievement of course goals and objectives. According to NCTM (1993), when the information derived from assessment instruments is consistent with course goals and is used effectively to inform instruction, it serves to promote student learning as well as to monitor it. In fact, assessments should be used not only to provide information to students and instructors alike, but also in research on teaching and learning statistics, as well as in assessing the efficacy of different curricula or pedagogical techniques (Garfield, 1998).

In light of the aforementioned criteria, a comprehensive approach to assessment is needed, beyond that of traditional testing and grading (Onwuegbuzie, 2000). Encouragingly, rather than being an activity distinct from instruction, as until recently has been the case in statistics courses, assessment is now being utilized as an integral part of both teaching and learning (Mathematical Sciences Education Board, 1993). Thus, the current assessment reform movement in statistics encourages instructors to incorporate cognitive measures that assess student learning more extensively (Garfield,

begun utilizing creative methods of assessment in their courses (Onwuegbuzie, 2000).

Before deciding on the method(s) of assessment to use in a statistics class, the instructor must reflect upon a myriad of considerations. These considerations comprise the context in which the course is taught, the desired content of the course, and the preferred pedagogical style of the instructor. The relationships among these variables are presented in Figure 1.


Indeed, as can be seen from this figure, the context of teaching statistics represents the first consideration for statistics instructors. That is, before deciding how to assess statistics learning, the instructor should take into consideration the context in which the class is taught. Next, the educator should simultaneously take into account the intended content of the course (i.e., curriculum) and her/his pedagogical style. After considering these three components, the instructor is ready to design the course assessments. However, it should be noted that the relationship among the content, pedagogical style, and assessment is somewhat recursive. That is, just as the content and pedagogical style influence the eventual assessment tools used in the statistics course, the type of assessment techniques incorporated can influence both the content and pedagogical style. Considerations regarding the context, content, and pedagogical style are discussed below.

have been made, the statistics instructor is then ready to design an assessment package. There are five basic considerations and three dimensions necessary to consider when thinking about assessments for statistics. The fundamental decisions, as noted by Garfield (1994), include the following five considerations: (a) what to assess, (b) the purpose of assessment, (c) how to assess it, (d) who will undertake the assessment, and (e) the action to be taken by the instructor and the nature of feedback given. Clearly, these five considerations are dependent on one another. The first consideration, what to assess, comprises concepts, skills, applications, attitudes, and beliefs (Garfield, 1994). The second consideration, the purpose of assessment, forces the instructor to reflect upon his/her philosophical underpinnings for assessing statistics learning. The third consideration, namely, how to assess statistics learning, depends largely on the purpose of the assessment. For example, if the purpose of the assessment is to evaluate students’ ability to communicate statistical findings to groups of individuals, then the instructor is more likely to require oral presentations.

The fourth consideration of the assessment framework is who will undertake the assessment. Possible assessors are the course instructor, peers, and the students themselves. Although assessment by the instructor prevails, it is important for students to learn how to evaluate and to apply their own knowledge and skills (Garfield, 1994). One way of helping students to engage in self-assessment is via scoring rubrics (Wilson & Onwuegbuzie, 1999). These rubrics allow students to apply scoring criteria to their own work, as well as to that of their peers, so that they can learn how their ratings compare to those of their instructor (Wilson & Onwuegbuzie, 1999). Other ways of assisting students to self-assess their work include providing them with model papers and exemplars of good

standards expected by the statistics teacher (Garfield, 1994).

The fifth and final consideration is the action that the instructor intends to take based on the results of the assessment and the nature of feedback provided to students. According to Garfield (1994, paragraph 26), “this is a crucial component of the assessment process that provides the link between assessment and improved student learning.” These five considerations then form a useful framework for designing assessment tools in statistics courses.

