Doctoral Dissertation Announcement
Candidate: Lori A. Wingate
Doctor of Philosophy
Department: Interdisciplinary Ph.D. in Evaluation
Title: The Program Evaluation Standards Applied for Meta-evaluation Purposes: Investigating Interrater Reliability and its Implications for Use
Dr. Chris Coryn, Chair
Dr. Arlen Gullickson
Dr. Leslie Cooksy
Date: Friday, October 16, 2009, 10:00 a.m. to noon
Location: 4405 Ellsworth Hall
Metaevaluation is the evaluation of evaluation. Metaevaluation may focus on particular evaluation cases, evaluation systems, or the discipline overall. Leading scholars within the discipline consider metaevaluation to be a professional imperative, demonstrating that evaluation is a reflexive enterprise. Various criteria have been set forth for what constitutes excellence in evaluation. In the context of educational program evaluation, the dominant criteria are the Program Evaluation Standards, developed by the Joint Committee on Standards for Educational Evaluation.
The Standards have been widely accepted and applied, and their use is advocated by major organizations and by several leading scholars and textbooks in evaluation. Concurrently, metaevaluation has received increasing attention within the evaluation discipline. Despite these two important developments, there has been little empirical study of the Standards and their role in metaevaluation practice. Their use as a metaevaluation tool rests on an implicit assumption: different individuals applying the Standards as criteria would reach comparable judgments about a given evaluation's quality. This is a question of interrater reliability among metaevaluators. Because reliability is a prerequisite for validity, this assumption is critical and merits empirical investigation.
This study investigates the legitimacy of that assumption by having thirty individuals—ten evaluation doctoral students, ten evaluation practitioners, and ten evaluation scholars—rate the same ten program evaluations using the Standards as criteria. The overall purpose of the study is to assess interrater reliability in this context using multiple measures. The results show uniformly low interrater reliability, which has direct implications for how metaevaluations should be performed and how their results should be used.