JMDE
Journal of MultiDisciplinary Evaluation
Number 1, October 2004
Editors
E. Jane Davidson & Michael Scriven
Associate Editors
Chris L. S. Coryn & Daniela C. Schröter
Assistant Editors
Thomaz Chianca
P. Cristian Gugiu
Paul A. Lamphear
Mary Keating
Nadini Persaud
John S. Risley
Lori Wingate
Brandon Youker
Webmaster
Dale Farland
—The news and thinking
of
the profession and discipline of evaluation
in
the world, for the world—
A peer-reviewed journal published in association with The Interdisciplinary Doctoral Program in Evaluation
The Editorial Board
Katrina Bledsoe
Robert Brinkerhoff
Tina Christie
J. Bradley Cousins
Lois-Ellen Datta
Stewart Donaldson
Gene Glass
Richard Hake
John Hattie
Ana Carolina Letichevsky
Mel Mark
Michael Quinn Patton
Nick Smith
Robert Stake
James Stronge
Dan Stufflebeam
Helen Timperley
Bob Williams
Introduction
Welcome to the first issue (October,
2004) of the Journal of MultiDisciplinary Evaluation!
As we ‘go to press’ there are 629 people signed up for notification of its
appearance, from about 50 countries. Please pass the internet address along to
your friends and colleagues, and tell them that all issues will continue to be
available by a single click directly from our home page.
This issue is close to 150 pages,
but it’s split into three parts for easier downloading. And it’s designed to
facilitate selective reading: find your way around by looking at the Table of
Contents, below, and clicking on a section or subsection title to go directly
there. Be sure to check out the Essay Competition, which is buried in a short
piece called “Zen and the Art of Everyday Evaluation”—and consider entering an
essay (it only needs to be 500 words or so). Also think about us for an article
(or a letter or a memo)—see the Mission Statement for details on submissions.
And get a sense of what’s happening in evaluation around the world through the
90 pages of our Global Review—of regions (Part II) and of journals (Part III).
Can you enrich this with more about evaluation in your part of the world or
your publication? Join our emerging group of onsite correspondents by bringing
us all up to date—follow the model of our coverage of
In the next issue, we’ll have: (i) some serious coverage of the arguments about methods of
demonstrating causation in evaluation; (ii) discussion of valid and invalid
efforts at controlling cultural bias in evaluation; (iii) the beginnings of an
item pool for testing competence and proficiency in evaluation. And more!
Table of Contents
Part I
Mission for the Journal of MultiDisciplinary Evaluation
Editorial: The Fiefdom Problem, Scriven, M.
Unpacking the Participatory Process, Weaver, L. & Cousins, J. B.
Zen and the Art of Everyday Evaluation, Scriven, M.
Part II
Evaluation Activities in Africa, Lamphear, P. A.
Evaluation Activities in Australasia, Risley, J. A.
The State of Evaluation in Canada, Coryn, C. L. S.
Evaluation in Europe: An Overview, Schröter, D. C.
Evaluation Activities in the United Kingdom, Risley, J. A.
Evaluation in Eastern Europe and the Middle East, Gugiu, P. C.
Part III
What’s Happening in AJE (2003-2004), Wingate, L.
Evaluation: The International Journal of Theory, Research and Practice (2003-2004), Schröter, D. C.
The Japanese Journal of Evaluation Studies, Risley, J. A.
Journal of Evaluation and Program Planning, Switalski Schinker, R.
New Directions for Evaluation (Vol. 102), Keating, M.
Mission for the Journal of MultiDisciplinary Evaluation
Michael Scriven
A. Why a new journal?
1. We have excellent journals in
evaluation, and it would be hard to argue for simply adding one more of their
kind to their numbers. But if professional evaluation is going to help improve
the world, as many of us strongly believe it can, it must take seriously the
task of communicating current developments and skills to the evaluators,
evaluation users, and would-be evaluators amongst those people in the world who
can’t afford to subscribe to the traditional journals or attend the traditional
workshops and courses of study. Those people include impecunious students in
the industrialized nations, as well as impecunious teachers and community
members there, and most people in the primarily rural/agricultural nations. So
this journal is different in that it’s free. It won’t reach everyone who could
use it, because not everyone can get to and use a computer terminal with online
capability, and read English, but it will be available to several million
people that can, and that number is increasing fast.
2. As some of you know, the great
war between the commercial publishers that control most of the scholarly
journals, and the great libraries that have been making those publishers rich
via the massive increases in library subscriptions has at last resulted in a
battle won for scholarship. After an abortive effort at negotiation by, amongst
others, the State University of New York libraries, the University of
California recently simply refused to pay the latest increase, and the
publishers backed down, cutting about $1 million (U.S.) off the annual
bill. Harvard and Cornell are simply canceling 300
journal subscriptions between them; the Research Triangle Libraries (Duke, UNC,
NCSU) are doing the same. It’s hard to say how that war will turn out, but
scholarly interests are obviously served by facilitating the option of online
publication, and the Senates at
3. There are many other niches in
the journal world that need to be filled besides radically reducing the cost of
access, given that we start with the belief that the existing evaluation
journals are extremely good, and that direct competition with them would be
counter-productive. One of these niches, in our opinion, is the need to move
towards some coverage of significant evaluation happenings in countries outside
4. Another niche. We want to publish
good ideas, and we don’t care whether they are embedded in a typical journal
article, although those are the vehicles that get the peer review treatment. If
you can express your idea in a clearly written paragraph or two, or in a memo,
or in a letter, and it looks to the editors like something worthwhile, we’ll
publish it. Your thoughts might be reactions to your own experiences, to the
experiences of others, or to previously published material, which could include
a well-known book or article, not necessarily one reviewed here. No, that last e-mail
you sent to EVALTALK probably isn’t going to qualify. But it might dress up
well, with some serious further thought (and with some attention to reactions
from others on EVALTALK), if it’s not too esoteric. Remember, our readership won’t
consist of PhDs in philosophy or psychology!
5. And another niche. We’ll review
some books, sometimes books that have been out for quite a while but that have
been gradually gathering importance or a following. But we often won’t review
them in the usual way: we might use two or three reviewers, who might include
an ally, a critic, and a bystander. That’s often more interesting and useful to
the reader than a single review. And we’ll also encourage the authors to reply
to the reviews, in the same or the next issue. Later, the reviewers can reply
to the author’s comments. In other words, we want the serious discussion of
major emerging movements or themes in evaluation to be strongly supported in
this journal. In the same spirit, we’ll hope to get submissions of dialectic
pieces—double articles, with one responding to the other.
6. And…. Authors can add postscripts
to their articles, a year after they are published…. Or several years
later. They can’t alter the original text, and the postscript will be date-stamped,
but it can set the record straight or strengthen the arguments when they
want to do so. All articles will be archived and available
to the searcher in the usual way.
7. Moreover…. This isn’t just a
research journal. It’s a journal aimed at communicating about evaluation to a
very diverse readership. That may mean that it should be partly instructional,
too. The model of hybrid journal/magazine publications such as Scientific
American is worth taking seriously. Along with new research results, they
often publish overviews of material that the expert knows well, but the
outsider or student in that particular field knows little about. In that
spirit, too, we’ll do some reportage on what other journals are covering, for
those who can get them through a library. Another common feature of
publications like Scientific American is an inquiries column where
an expert responds to questions from the field. To the extent that our
resources permit, we’ll explore the inclusion of that kind of material. And
that means you can submit that kind of material. Instructors might submit what
seems to them a neater treatment of logic models than is found in the standard
texts; or their responses to the most common misconceptions about evaluation from
students in their mid-career extension course for ward nurses, and someone else
may respond to their articles. Could we have an Ethics column? Perhaps, if good
questions and good answerers can be found.
8. Furthermore…. In the 0th
issue of JMDE, which was to be just an introductory flourish to show we’re here
and working, there’s a not-too-serious piece called ‘Zen and the Art of
Everyday Evaluation’. Zen masters are famous for their use of puzzles, known as
koans, which illustrate some deep point in Zen
thought. There’s an evaluation koan in this article,
and it’s the first of what we hope will be a series of problems or puzzles that
we’ll publish from time to time. And of course there will be some prizes for
the best answers, usually an interesting book. If you come across or think up
an interesting puzzle about evaluation, send it in! We will probably dig up a
prize for the year’s best entry. This article and many handwritten pages were
on a clipboard stolen from Michael Scriven in
9. Besides which…. What else could
we do that would be interesting and useful? We welcome your suggestions.
(10) You’re already thinking about the use of photos? Right, so have we, though
the technical problems are not trivial for the software we’re using. (11) You
thought of color, too, perhaps for concept maps and
logic diagrams? You can bet we’ll be working on that; it’s a potentially
substantial advantage of the online medium. (12) How about cartoons? Send them
in; become the first famous evaluation cartoonist! (13) What about material
from the dozen other fields of evaluation that have attained professional
status, such as policy studies and personnel evaluation and product evaluation?
That’s one of the reasons for the title; we want to encourage border crossing,
and there’s perhaps room for more of it than finds its way into the existing
journals. (14) And how about exploiting the greatest strength of online
publication: the response speed? We will put out special issues when it seems
urgent to do so: for example, it might have been helpful to do one on the
‘Causal Wars’ that split the evaluation community last year, with of course
both sides well represented. This is not a vehicle for a partisan approach to
evaluation: our aim is to provide diversity and civility, to the extent that
we can.
10. We have some other ideas, but
perhaps 14 suggestions will be enough to indicate that JMDE (“Jim Dee”) has a place on the team bench. With your help, we
can fill that place and expand it too.
B. Why this title?
We considered many titles. Googling
them revealed that almost all had been taken or virtually taken. But we rather
like this one, because it suggests something that’s important to us, the notion
that the essence of evaluation, not just historically but in practice today, is
its multiple lineage. We’ll try to illustrate that in the pages we publish, and
hope that authors will be attracted by it. And there’s nothing esoteric about
the title: the phrase “multidisciplinary evaluation” generated 318,000 hits on Google
recently, so the term is one in common use, notably in the medical and
psychiatric fields where it refers to the efforts at diagnosis that require
specialists from very different fields to collaborate. In program evaluation,
this most obviously connotes the collaboration between the subject matter
expert and the evaluation expert. But that’s just an epidermal analysis. The
fact is that there’s often a need for an expert cost-analyst, an expert focus
group or survey specialist, an expert on text analysis or case study, maybe an
attorney or an organizational development or a community development
specialist, or an expert on another culture or from a distinctive community.
Many of us become pretty good at several of these specialties, but the big
shops often have them on staff or standby.
Moreover, there is often a multiple
disciplinary interaction at the subject-matter level, not just in applied
psychology and medicine; for example, an authority on e-learning
prefaced an online discussion a couple of weeks ago by saying “e-Learning
involves multiple disciplines e.g., philosophy, psychology, pedagogy,
anthropology, artificial intelligence (e.g., Artificial Intelligence in
Education (AIED)), and human computer interaction.” Evaluation of e-learning
courses or programs, and many other kinds of evaluand,
is often, perhaps typically, like this; and it may be good to pay more
attention to this feature of it than we have done in the past. Hence the title.
(And why JMDE, not JME? Out of respect for the Journal of Moral Education and the Journal of Management Education!)
C. Who is producing it?
The co-editors will be Jane Davidson from
Special thanks, too, to the Canadian
Government, for funding the development and free distribution of the software
we are using, designed precisely for the management of online, free access,
journals; and to Professor Willinsky, of the
D. How Can Others Help With It?
(i) Please help to
spread the word that a new journal is available, with a broad vision and
interests. And, (ii) since its value will depend on what it publishes, make
sure to keep JMDE in mind for things
you’d like to have published. We will make that as easy to do as we can,
including eventually an effort to publish material in your native language.
Remember that you should be able to reach a whole new audience through us, a
very important part of the world’s population. And remember that online
refereed journals are now widely endorsed as respectable entries in your CV. (iii) If you have special interests or skills that
you’d like to be sure are represented in JMDE,
send us a note and a sample or two of your work. (iv) Everyone, please think
about other things we can do that aren’t already well done; and (v) suggest the
most interesting puzzles about evaluation you have or you encounter—they can
form the basis for a cutting edge discussion here. Other ways to help are
mentioned throughout the earlier sections.
Practical postscripts: (a) In the interests of
quality peer-reviewing, articles submitted to JMDE should be written without detectable authorship in the
manuscript itself, only in the covering letter—which won’t go out to the
referees. If you can, please use Microsoft Word with 1” margins all round, 1.5
line spacing, and Times 14 point font; e-mail if possible. We don’t insist on
APA style or any other; just intelligibility and consistency. Please don’t
submit an article that is under consideration elsewhere; it wastes referee and
editorial time. In return, we’ll get you a decision very quickly, within three
weeks from receipt.
(b) The JMDE
effort is a kind of safety-net counterpart—in the field of publishing brief
scholarly materials—to the AEA Monograph Series. The latter provides direct
cost-competition to the publishers of hardcopy books, by publishing books at
$15. That market is one in which one can’t compete without some cash flow to
cover authors’ time and printing costs, so free online access is not feasible,
and paid online access is still not secure. The big commercial publishers in
both domains—books and journals—are substantially similar, led by Elsevier and Kluwer, so the aim is to shake their increasingly
life-threatening grip on the distribution of scholarly knowledge, at least in
the field of evaluation.
(c) When writing to us, to ensure attention,
add “JMDE” to whatever else you put
in the subject line. These virus-ridden days, no one should open attachments
that cannot be identified prior to opening.
Editorial: The Fiefdom Problem
Michael Scriven
NOTE: Editorials
in JMDE represent the personal views of the editor who signs them, not of the
journal's editors or staff as a group. They are somewhat uncommon in scholarly
journals, but JMDE is a somewhat uncommon journal. Correspondingly, you will
not be surprised to hear that they are published with the thought of
stimulating a discussion, or at least reactions, so please send in your considered
reflections on them!
The emergence of dominant countries in world politics is
marked by a history of the amalgamation of fiefdoms—mini-empires usually ruled
despotically by a baron, prince, king, or maharajah. Usually the fiefdoms were
too small to defend against some of their neighbors, and they were often too
small for major economies of scale in production. Hence they formed alliances
through marriage, trade, or mere covenants. Of course, these are fragile links,
compared to complete unification, so the path to better defense, industrialization,
and further expansion—as well as riches for the conqueror—lay along the latter
path, which was often unilateral and of course also resulted in an entity
powerful enough to invade or dominate still larger but reluctant fiefdoms and
eventually countries. The great empires, from West to East, developed in this
way, and it is often said that this is the way that the present leadership in
the
These thoughts about fiefdoms and their fate are occasioned
by two recent events, and one persistent problem in the evaluation world. The
first of the recent events is the Causal Wars that began last year, which
remind us that the world of ideas is not immune to the bare-faced use of
political power, misrepresentation, and ad hominem
argumentation in the struggle for ideological and economic control. The other
is a request to all presenters at a major series of educational workshops and
seminars this past summer—not the Evaluators' Institute, by the way—that they
should adhere to the definitions and structuring of evaluation provided in some
online resources provided by the sponsors. This seems harmless enough—and was,
I am sure, merely an effort to avoid confusion amongst the attendees—until one
studies these definitions and structure. Then one discovers something that, one
recollects unhappily, has now become too frequent an occurrence: a multiple and
major failure to grasp the essential elements of many of the basic concepts of
our field. The definitions provided for terms shared with statistics, social
science methodology, or common English are quite adequate, but definitions of
terms unique to evaluation reflect a severe lack of clarity about these
concepts. And now one recollects that there are other foundations,
organizations, and educational institutions that are prominent in the
evaluation business, and deserve much credit for their support and work in that
field, where the same tendency to standardize on confused interpretations of
these concepts has become part of the—conscious or unconscious—efforts at
‘branding’, that is, the effort to leave a distinctive mark on some part of the
field that will demonstrate one’s own contribution.
The result of each fiefdom standardizing on its own
(significantly different) usage is of course just the kind of confusion at the
macro level that the standardizers are trying to
avoid in their own bailiwick: a person learning or using one set of definitions
will have trouble understanding and communicating with those trained to another
version. We've already seen this happening quite often on EVALTALK.
If this is combined with the kind of economic and political enforcement that has
occurred in the Causal Wars takeover of most of the federal funding for
educational research, where some $500 million per annum is now (de facto)
reserved for those with the ‘right views’ on the highly controversial issue of
establishing causation, we will seriously undercut the possibility of progress
towards an understanding of the nature of our field, and of our discoveries in
it, whether it's conceived as a discipline, a profession, or a set of
practices. In other words, the political cycle from fiefdom to empire is
playing out again in our domain, and we should be concerned that evaluation
funding from philanthropies will follow the federal precedent in
being totally restricted to those willing to share particular variants of
standard conceptual frameworks that lack adequate justification for the
variation.
This is a good moment to remind ourselves of the classic
disaster of this type, the stupid blunders of the statisticians who casually
redefined perfectly good words in the English language in such a way as to confuse
millions of students and citizens for most of a century. To redefine
‘reliability’ so as to exclude its common meaning which includes validity,
instead of using ‘consistency,’ was the first of a series of analogous
mistakes, where ‘significance’ was next to suffer, and then ‘explanation’ as
abused by factor analysts[1].
The current attempt to redefine ‘evidence-based practice’ in medicine, public
health, social services, education, etc., is at least one where more
sophisticated arguments are being used.
Back to the fiefdom problem. The third trigger for this
concern with the Balkanization of evaluation—that is, unnecessary
fragmentation, confusion, and attendant hostility, with the shadow of dictatorship
in the background—is of much greater importance to the world at large. In the
field of international development, it has become increasingly clear that the
situation with the evaluation of interventions is far from satisfactory. This
area has long been one of concern to thoughtful evaluators, because of the
combination of limited external oversight with the usual strong (though
tragically short-sighted) double-barreled motivation for doing superficial or
zero evaluation—namely, that serious evaluation might make you look bad, and it uses valuable resources. This
appeal to both risk-management and fiscal conservatism is always hard to beat[2].
More detailed analysis, especially by Paul Clements, one of the faculty for our
doctoral program in evaluation here, makes clear through on-the-ground
meta-evaluation studies in
Related to this example is the recurrent tendency for
agencies to issue RFPs for ‘external evaluations,’ in
which they overspecify the design all the way down to
overspecifying the requirements for bidders[4].
Doing this of course undercuts externality to the point where it loses most of
its contribution to credibility and seriously attacks validity. A tempting way
to extend the fiefdom, of course, and nearly as bad as sole-sourcing the
contract to a friendly consultant. In other words, how to make an external
evaluation into an internal one.
What else can be done to avoid both the linguistic
confusion and the Balkanization of research—and the funding of research—on
evaluation? We might be able to learn something from what happens in
philosophy, the field where nothing is taken for granted, all concepts are up
for reformulation, and very different interpretations of the key ones are
taught at different colleges, depending on which school of thought is dominant
amongst the resident faculty. Doesn’t this just show that one can’t hope to
prevent multiple interpretations of key concepts? I believe the main lesson to
be learnt is more fundamental: one must treat the definitions of key existing
concepts as an extremely serious matter, not a matter of casual linguistic
convenience (which is true only with neologisms). Conceptual schemes, and the
definitions that go with them, are powerful instruments of analysis and hence
persuasive support for particular interpretations, not minor precursors to it
(a point well made in Zen and the Art of
Motorcycle Maintenance, by the way).
Constructively speaking, I will also take two steps myself:
first, I will propose to a few leading organizations engaged in teaching,
supporting, and propagating evaluation, that we need to hold a small conference
of interested parties on a double topic, which we might call “Finding Common
Ground”. The agenda would cover: (i) standardizing terminology
where possible, the reasons for doing this, and the limits of such attempts; and (ii)
finding compromise positions on major conceptual issues,
such as the one about causation. This is a natural marriage of goals, since the
difference between common definitions and common analyses is only a gradual
one.
Second, I will take care, in the doctoral program that I
run, to stress the existence of, the case for, and the need to tolerate, alternative
conceptual schemes and definitions besides the ones for which I argue—although
not to treat this as a matter for arbitrary decision, but rather as something
that requires serious justification. That’s a tough distinction to make. I hope
others will join in this conscious effort, or write to JMDE explaining why they think this is an undesirable strategy—or
one in need of major extensions.
ENDNOTES
1. The most important potential relevance of this
editorial is to the problem of evaluation in
2. No good evaluator would read the above without noting
that it can also be seen as an attempt by someone who invented a fair number of
the terms in the evaluation vocabulary to extend his own fiefdom. While I do
think that people who invent terms have some obligation to argue against
careless shifts from their original meanings, they also have an obligation to
be open-minded about serious arguments
for modification or clarification of the original definitions. I make an effort
in the Evaluation Thesaurus not
to ‘brand’ the dozen or so terms I have introduced, like meta-evaluation, impactee, and the formative/summative distinction, with
any claim to authorship, hoping thereby to free others to suggest modifications
to the definitions. And I’m now inclined to think that the arguments, notably
by Michael Quinn Patton and Eleanor Chelimsky, for
adding a third category to formative and summative have merit, although I originally
took those two types to be exhaustive. In an essay in Alkin’s
Evaluation Roots (Sage, 2004) I
suggest one might use “ascriptive” to identify
certain evaluations (for example, an evaluation done by a military historian of
Napoleon’s use of cavalry) that are aimed at neither improvement of an evaluand nor macro-decisions about it[5],
but simply at determining/ascribing merit, worth, or significance ‘for its own
sake’.[6]
There, I’m not incorrigible; how about you?
Example: here’s one of the World Bank’s definitions:
Meta-evaluation—The term is used for evaluations designed
to aggregate findings from a series of evaluations. It can also be used to
denote the evaluation of an evaluation to judge its quality and/or assess the
performance of the evaluators.
Meta évaluation: Évaluation conçue comme une synthèse des constatations tirées de plusieurs évaluations. Le terme est également utilisé pour désigner l’évaluation d’une évaluation en vue de juger de sa qualité et/ou d’appré
Metaevaluación: Este término se utiliza para evaluaciones cuyo objeto es sintetizar constataciones de un conjunto de evaluaciones. También puede utilizarse para indicar la evaluación de otra evaluación a fin de juzgar su calidad
Comments by MS. The definition treated as primary—the one
in the first sentence—is a simple confusion of meta-evaluation with
meta-analysis. The second definition is correct and of course quite different.
Arguably, the former will not result in an evaluative conclusion, but in an
analytic conclusion of the following (non-evaluative) kind: “The evaluations
studied lead to the conclusion that on balance, the new meningitis vaccine is
not unduly risky for those with compromised immune systems.” A meta-evaluation
always leads to an evaluative conclusion, of the form “This evaluation is
sound/unsound/clear/unclear/credible/not credible.”
Unpacking the Participatory Process
Lynda Weaver & J. Bradley Cousins[7]
Introduction
Interest in
collaborative forms of inquiry has increased dramatically in recent years in
evaluation and social science research. One consequence of such interest has
been the emergence of many different forms or genres of collaborative inquiry,
such as stakeholder-based evaluation, deliberative democratic evaluation,
practical participatory evaluation, transformative participatory evaluation,
empowerment evaluation, and the like. In order to ensure clarity of purpose and
application, it is necessary to differentiate among such approaches. One such
framework—originally proposed by Cousins, Donohue and Bloom (1996) and later
developed by Cousins and Whitmore (1998)—applies not only to collaborative and
participatory forms of evaluation but to forms of applied social research in a
broader sense. Within the framework, consideration is given both to the goals
and interests of collaborative inquiry (i.e., pragmatic, political,
epistemological) and to dimensions of process (i.e., control of
technical decision making, stakeholder selection, depth of participation).
This paper questions the adequacy of the
process dimensions of the earlier version of our framework. Our ongoing
analysis of process dimensions reveals that one of the dimensions—stakeholder
selection—is problematic and requires reconsideration. In this paper we
re-present the framework and describe enhancements to the process dimension
component. By way of illustration, we then apply the framework to two separate
case examples of practical participatory evaluation. This work is relevant to
the study and practice of evaluation because it helps clarify differences among
versions of collaborative inquiry and thereby helps reduce confusion that may
arise in discussions about, or applications of, such approaches. The enhanced
process component of the framework allows interested parties to graphically
depict the continua for a given inquiry project in order to portray differences
in collaborative evaluation approaches. It also provides the basis for the
development of research tools that could be used for empirical inquiry into
participatory processes in social inquiry and their effects.
We identified three primary goals and interests
associated with collaborative social inquiry, derived in the first instance
from Levin (1993), but found them to resonate with other conceptions such as
Mark and Shotland (1985) and Garaway
(1995). Any given collaborative research project, we suggest, would be
characterized by a primary emphasis on one or some combination of the three
goals and interests. First is the pragmatic justification. Collaborative
inquiry is purported to lead to instrumental consequences and to increase the
usefulness of the knowledge that is created. In this sense, collaborative
inquiry takes on a problem-solving orientation. Members of the community of
practice engage with researchers or evaluators to produce knowledge that bears
upon identifiable practical problems. To the extent that the research is
grounded in the context for use and thereby rendered meaningful to those
responsible for problem solving, decision making or policy making, the
knowledge produced will be of greater use.
A second justification is political and
is ideologically rooted in normative conceptions of social justice and the
democratic process. The primary interest of collaborative inquiry that
subscribes to such political aims is to promote fairness through the
involvement of individuals associated with all groups with a stake in the
research (e.g., applied study, evaluation) or the focus for research (e.g.,
programme, policy). Through direct involvement and participation in the
research process, persons from oppressed groups or marginalized sectors that do
not normally have a voice in policy or programme decision making are now
provided with such opportunities. The focus for politically-oriented
collaborative inquiry is very much emancipatory or
concerned with the amelioration of social inequities inherent in the societal
structures of the status quo.
The third and final justification for collaborative
inquiry is epistemological, the primary aim being the production of
valid knowledge or representations of underlying social phenomena. Recent
challenges to the dominant paradigm for research in the social sciences—logical
empiricism—have been many and varied and stem from
fundamental distinctions made in conceptions of reality and of knowledge. In
his comprehensive review and integration of constructivist conceptions of research
in the social sciences, Schwandt (1997) epitomizes the
concept of the ‘localness’ of knowledge and the importance of context as the
essence of constructivism. While constructivist conceptions of research are
undeniably rooted in relativist epistemologies, others have argued from
different footing and similarly placed a premium on context. Huberman (1994), for example, proposes a perspective
regarding knowledge production, utilization and dissemination that might be
termed ‘revisionist-traditionalist.’ He argues that knowledge can indeed be
transported from one context or setting to another but that its reception,
interpretation and integration into the local context determine its impact and
sustainability. His construct ‘sustained interactivity’ suggests that
reciprocal effects on knowledge user and producer communities will arise from
enhanced contacts between the two. The argument is aligned with a justification
for collaborative inquiry that aims to enhance the validity of the produced
knowledge.
Process Dimensions of Collaborative Inquiry
Quite apart from considerations of the aims of
collaborative inquiry, we identified dimensions of form as being important and
suggested them to be fundamental in characterizing various collaborative
approaches to systematic inquiry (Cousins & Whitmore, 1998). Each may be
thought of as a Likert-type rating scale along which
any given application of collaborative inquiry may be described. Initially, we
identified three such dimensions—control of technical decision
making, stakeholder selection, depth of participation—but
through ongoing analysis came to the view that one of these dimensions was
confounded and therefore conceptually inadequate. We ultimately teased apart
the dimension ‘stakeholder selection’ into three distinct dimensions of form or
inquiry process. The resulting framework consists of five dimensions of form.

Taken together, these dimensions allow a given collaborative inquiry to be
represented diagrammatically in the form of a ‘radargram,’ as
shown in Figure 1. In the figure we represent hypothetical examples of three
distinct forms of collaborative evaluation. We now turn to a discussion of each
in terms of its justification and depiction according to our process
dimensions.
Figure 1: Five dimensions of form in collaborative inquiry
Practical-participatory evaluation (P-PE): Our prior work (Cousins & Whitmore, 1998)
differentiated between two streams of participatory evaluation on the basis of
the primary aims of the inquiry. The first we called Practical Participatory
Evaluation (P-PE), an approach that is very much concerned with practical
problem solving and providing support for ongoing programme and/or
organizational decision making (see, e.g., Cousins & Earl, 1995). In P-PE,
members of the evaluation community work in partnership with members of the
programme community to implement evaluations typically seeking to inform
programme improvement initiatives. Instrumental (support for discrete
decisions) and conceptual (educative function) uses of evaluation findings, as
well as process use, are likely to be observed as benefits of P-PE. Figure 1 shows
that technical decision making in P-PE is typically shared between the
evaluator and non-evaluator stakeholders. Diversity in participation is likely
to be limited as non-evaluator stakeholders are typically primary users, those
with vested interest in the programme who are in a position to enact change.
Power relations among non-evaluator stakeholders are likely to be neutral since
the interests of programme managers and implementers are those most
often represented. This, however, is not necessarily the case. Since only a
limited number of non-evaluator stakeholders participate in the inquiry, the
process would be logistically manageable and feasible. Finally, in P-PE
participants are normally involved extensively in a wide variety of the inquiry
tasks, including data analysis and reporting.
Transformative Participatory Evaluation (T-PE): Brunner and Guzman (1989) describe
an approach to participatory evaluation that has been implemented in
evaluations of programmes in developing countries for some considerable time.
The approach has decided links with other forms of collaborative inquiry such
as participatory action research (PAR) and participatory rural appraisal (PRA)
which are normative in intent and seek to ameliorate identified social
inequities. Through participation, non-evaluator stakeholders develop their
capacity for self-determination and develop rich understandings of the often
oppressive forces operating in the local context. This stream of inquiry, which
is ideologically grounded and political in intent, we labelled transformative
participatory evaluation (T-PE) (Cousins & Whitmore, 1998). In T-PE control
of technical decision making is also likely to be balanced between trained
evaluators and non-evaluator stakeholders. While evaluators wish to adopt the
role of facilitator, there is a need for them to teach participants inquiry
methods and the logic of evaluation. Participants would include programme
practitioners but in most cases would also involve intended programme
beneficiaries as members of the evaluation team. Other interested parties
including government officials, NGO personnel, and representatives of donor
agencies are equally likely to be involved. Participation, then, would be
highly diverse, and given the range of value perspectives having legitimate
input, a degree of conflict in interests is to be expected. The diverse nature
of participation would naturally lead to logistical challenges and call into
question the feasibility of the inquiry. Finally, as was the case with P-PE, non-evaluator
stakeholders would be involved in a wide range of technical inquiry tasks and
activities, this being an important element of the capacity building and
empowering force of T-PE.
Stakeholder-based evaluation (SBE): Many years ago the concept of stakeholder-based
evaluation was introduced through a collection of papers by such renowned
contributors as Weiss, Stake and Murray (Bryk, 1983).
It was portrayed as being a recommended evaluation strategy when values
conflict among stakeholder groups regarding programme purpose or goals was
evident. Evaluators would seek to understand evaluation issues from multiple
perspectives and the evaluation would be responsive to the exigencies of the
local context. In SBE, the evaluator would remain firmly in control of the
evaluation and its implementation. Normally a range of stakeholder perspectives
would be systematically taken into account and therefore a significant degree
of diversity in perspective was to be expected. Best suited to circumstances
where programme goals and means are contentious, SBE processes are normally
witness to significant differentials in power relations and conflicts of
interest. However, with the evaluator firmly in control of the evaluation
implementation, the project could be expected to be manageable. Finally,
evaluators would most often involve non-evaluator stakeholders in deliberations
about the evaluation issues to be addressed and then later, in helping to
interpret evaluation findings. Therefore, depth of participation would be limited
to a consultative role on the part of non-evaluator stakeholders.
With these three
hypothetical examples we can see that the approaches discussed differed
considerably in both goals and interests as well as the operational form taken.
The framework described above provides a useful means of capturing such
variation among the different collaborative approaches. We now turn from the
hypothetical to the actual case in order to demonstrate the utility of the
framework in more concrete terms.
Actual Case Applications
The case examples
we selected are independent projects on which we worked separately in the
capacity of evaluators. The first case (reported by Weaver) is in the domain of
hospice/palliative care in the Canadian context: a P-PE of the Volunteer
Resources to determine how to improve the programme and to prepare for
downsizing of the palliative care unit. The second case (reported by Cousins)
is a cross-cultural P-PE of an educational leadership training programme in
Evaluation of Canadian Hospice/Palliative Care
Unit: The sole
chronic care hospital in
The part-time Volunteer Coordinator has the
responsibility of training and supervising the complement of volunteers. At the
time of the evaluation, there were approximately 60 volunteers on the roster.
Each volunteer comes to the unit weekly for a four-hour shift at any time from
0700h to 2300h any day of the week. Usually, three volunteers are scheduled at
the same time to cover the entire unit.
The need to evaluate the volunteer resources
arose from the proposed restructuring of the unit 12 to 18 months in the
future. As part of the overall preparations to downsize the number of beds and
allocated resources, management made plans to obtain feedback from the team
members about the future unit. Attention was focused on the volunteer resources
because they had not been evaluated formally for many years, and they are an
integral, essential part of patient and family care. A commitment was therefore
made by senior and middle management to conduct a formative evaluation for two
purposes: (1) to evaluate the current volunteer resources and (2) to plan for
the restructured, downsized unit. Senior management made the decision to
conduct the evaluation. A working committee, of which Weaver was a member, was
created and a work plan was drawn up.
The major reason behind choosing to be
participatory in this evaluation was to be pragmatic. The working committee
could make decisions quickly if the stakeholders were sitting together at the
table, and the content of the questionnaires would be exhaustive with all
stakeholder groups’ input. The political rationale was an important consideration
because if management had not included volunteers and nurses in the process,
they would not be as likely to accept the recommendations for change to the
volunteers’ working conditions and policies. Lastly, the philosophy of
collaboration in the evaluation reflected the nature of the interdisciplinary
and holistic care rendered on the palliative unit. The collaborative evaluation
effort would inform management of volunteers’ issues, and the volunteers would
feel integral to decision making that affects their working conditions. In
summary, while the primary justification for the evaluation was practical,
political concerns most certainly factored in.
Having described the evaluation in terms of its
background and motivation, we now turn to an analysis of its implementation in
operational terms. Weaver rated the inquiry project in terms of its process
dimensions using the five dimensions described above. The results appear in
Figure 2. These we describe below.

Figure 2: Comparison of Canadian and Indian P-PE cases
1. Control of technical decision making (2.5): Control of technical decisions was
shared equally by all committee members. This dimension was actually one that
caused strain among group members. At first, questions about technical aspects
of the evaluation from the volunteers, Volunteer Manager and nurse were handled
quickly by the evaluator and/or the Director of Patient Care (DPC). Resentment
was expressed by one of the volunteer committee members. She stated she felt
like her purpose was to be a rubber stamp for decisions “already made”. The
conflict stemmed from trying to follow evaluation rigour without enough
explanation or without consideration for the non-evaluators’ ideas. By
consciously realizing the problem associated with this dimension of
participatory evaluation, the group overcame the friction.
2. Diversity among stakeholders selected for
participation (4.5): Diversity was achieved on the
working committee by recruiting representatives from four groups of
stakeholders. Management was represented by the DPC, the Palliative Care
Volunteer Coordinator (PCVC) and the hospital’s Director of Volunteer
Resources. Three volunteers were asked to participate, each one with a
different length of service on the unit (range from one year to over 10 years).
A nurse brought the care team’s perspective. Weaver, the evaluation consultant
from the
3. Power relations among participating
stakeholders (2): The intent of the group was to
ensure a balance of power among all committee members. In reality, this balance
took time to achieve since it was first necessary to overcome the more
customary hierarchy in the work setting where management has power over others.
Having three volunteers helped them feel more powerful as a group than as
individuals. The conflict mentioned above concerning ‘control of decision
making’ also skewed the power structure at first. In the end, the group was
cohesive and respectful of each other and conflict seemed to dissipate.
4. Manageability of evaluation implementation (3.5):
Resources and timing for the evaluation project impacted directly on this
dimension. The committee was capped at eight members to balance diversity with
functionality. Initially, the data collection was to be limited to a literature
search and a mailed volunteer survey. An outspoken nurse suggested that an evaluation
would not be complete without the nurses’ opinions since they work so closely
with volunteers. A brief survey was, therefore, also administered to the nurses
on the unit. Some logistical challenges were experienced, with the amount of
data collected in the two surveys being fairly voluminous.
5. Depth of participation (5): Each member of the working committee
participated extensively in the evaluation process. As a group they determined
the necessary information required to answer the evaluation goals, edited the
questionnaires drafted by Weaver, assisted with qualitative data content
analysis, and interpreted the findings. As a group, they will put forth
recommendations to the Programme Management Committee in terms of how
volunteers will function in the restructured unit. The only jobs that were
conducted by Weaver alone were the analysis of the quantitative data and the
creation of the presentation material. Participation in all aspects of the
evaluation was evident.
Evaluation of Indian Educational Leadership Programme:
The Educational Leadership Programme (ELP), centred in
The impetus for the evaluation came from ELP’s creators, developers and implementers, specifically
administration and staff of the Centre for Educational Management and
Development (CEMD) in
For the initial formative phase of the
evaluation, we adopted a participatory approach with external evaluation team
members from
On the first of two planned site visits, we
developed collaboratively a set of guiding evaluation questions and a programme
logic model and then proceeded to systematically examine programme
implementation and effects using a mix of quantitative and qualitative methods.
Methods employed were an extensive document review of archival information, a
questionnaire survey of ELP alumni and a comparison group of non-alumni
counterparts, focus groups of alumni and instructional staff, case studies in
schools at which ELP alumni were currently located, a cost-effectiveness
analysis of financial records, and a comparative analysis of structure and
content of the ELP against five other educational leadership programmes, mostly
situated in western cultural jurisdictions.
Once planning was complete, data collection,
analysis and reporting responsibilities were assigned, with members of the
Canadian and Indian teams both contributing. Reports were sent to Cousins
electronically by Indian team members and he subsequently developed a complete
draft of the report. This draft served as the basis for the second site visit,
where a series of meetings over a four-day period were used to develop the
draft report, correct inaccuracies, identify and fill omissions and most
importantly, to develop a draft set of recommendations for programme
improvement.
Following the site visit, Cousins revised the
report and presented a list of 25 recommendations for ongoing development of
the ELP. Through correspondence at a distance, the list was finalized and the
report completed, printed and bound. The plan was for CEMD to work with these recommendations for
approximately one year, at which time an external team from
1. Control of technical decision making (3): Control
was shared and balanced. The evaluation began with a site visit and three days
of planning. Cousins acted as facilitator in the analysis of stakeholder
groups, their interests, and the implications for evaluation issues and
questions to be addressed. He also provided input about the participatory model
and expectations for shared decision making. Throughout the project, Indian
evaluation team members relied on their knowledge of context and the programme
itself to inform evaluation decision making. The resulting evaluation was quite
sophisticated, involving several sources of data, methods of inquiry and bases
for comparison.
2. Diversity among stakeholders selected for
participation (3): Non-evaluator stakeholders participating directly
in the evaluation were predominantly members of the CEMD staff and included the
Director. The organization was very collaborative and the Director supportive
of her staff. The five or so staff members participating directly on the
evaluation had extensive professional backgrounds and skills in programme
development and implementation. They had prior training in business, education
and other applied social science fields. In addition, several members of the leadership
programme alumni were occasional participants in evaluation team meetings. They
served in an advisory capacity as did a few other individuals, including a
university professor and an American who had participated in the development of
the programme in the mid-1990s.
3. Power relations among participating
stakeholders (2.5): Among the Indian team members, occasional
differences of opinion surfaced but the process was, for the most part,
conflict-free and highly cooperative. Considerable support was provided to the
Canadian members of the team. Indian team members felt comfortable in voicing
their opinion and challenging proposals for planned action. They routinely
questioned assumptions and raised concerns. One such concern had to do with the
overarching goal of comparing the ELP with western educational leadership
programmes. The Director of the NGO, and original architect of the ELP, remained
intent on her resolve that the evaluation would yield such a comparison but not
without extended dialogue about the merits of this strategy. Why, for example,
could the programme not be considered more directly in terms of its relevance
to education in the South Asian context? Another related conflict emerged over
a recommendation concerning expected contact hours for the ELP participants. The
exchange was between Canadian and Indian team members, with Cousins successfully
arguing from the point of view of western standards, as had been agreed by the
entire team.
4. Manageability of evaluation implementation (3.5): The process, by and large, was
manageable although complications arose as a function of the scope of the
project relative to allocated resources and limits on communication due to
geographic separation between Canadian and Indian counterparts. Telephone
communications were highly impractical. Initial spotty use of e-mail exchanges
became more streamlined and useful as the project unfolded. One Indian team
member was identified as the project contact person and all communications went
through her. Ultimately, large quantities of data and draft reports were
transferred electronically in condensed format, a system that proved to be very
reliable and efficient. Other challenges to manageability were grounded in
competing demands especially on Cousins, but also on members of the Indian
evaluation team. At times, evaluation tasks were difficult to get to in the
face of more immediate and pressing demands. The preparation of the final
polished and formatted version of the report was delayed for several months,
for example.
5. Depth of participation (5): Without
question Indian team members participated in all phases of the evaluation
process. Planning was done collaboratively during the first site visit. The
programme practitioners drafted initial versions of questionnaires and interview
schedules and reacted to drafts of focus group questions. They implemented the
questionnaire survey of alumni and a comparison group of practising principals
and helped to interpret statistical summaries provided by Cousins. They carried
out several focus groups and case school data collection site visits. Through
exchanges with the Canadian counterparts, they acted on recommendations for
data analysis and reporting. Ultimately, the second site visit was a protracted
and intensive cross-method interpretation session. Once the final report was
compiled as a complete whole by Cousins, the Indian team members provided
extensive constructive feedback and suggestions for change.
Discussion
Figure 2 shows the distribution of process
dimension ratings for each of the two cases. Empirically, these ratings should
be treated with caution since we did not endeavour to establish inter-rater agreement, and therefore inter-subject differences
are likely to be inherent in the ratings. The point of the exercise was to test
the application of the process framework to concrete collaborative projects.
We were successful in applying the ratings and
showing similarities and differences between the two projects. Both projects
had similar rationales with the main emphasis being practical. Conceptual,
instrumental and symbolic consequences of the project were anticipated. The
projects looked quite similar in terms of the five process dimensions that we
identified. Control was balanced, a diverse group of participants were
involved, and power relations were not a defining issue. The projects tended to
be somewhat unwieldy and to involve non-evaluator stakeholders in a full range
of evaluation tasks.
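For readers who wish to tabulate such ratings, the comparison can be sketched in a few lines of Python. This is only an illustrative sketch: the ratings are those reported for the two cases above, while the variable names and the `rating_gaps` helper are our own assumptions and not part of the framework. The scale endpoints are not stated here, so the code makes no assumption about them.

```python
# Illustrative sketch only: names and helper are hypothetical, not part of the framework.
# The ratings are the values reported in the two case descriptions above.

DIMENSIONS = [
    "Control of technical decision making",
    "Diversity among stakeholders selected for participation",
    "Power relations among participating stakeholders",
    "Manageability of evaluation implementation",
    "Depth of participation",
]

# Ratings in the same order as DIMENSIONS.
canadian_case = [2.5, 4.5, 2.0, 3.5, 5.0]  # hospice/palliative care evaluation
indian_case = [3.0, 3.0, 2.5, 3.5, 5.0]    # educational leadership programme evaluation


def rating_gaps(first, second):
    """Return the per-dimension difference (first minus second)."""
    return {
        dim: round(a - b, 1)
        for dim, a, b in zip(DIMENSIONS, first, second)
    }


gaps = rating_gaps(canadian_case, indian_case)

# The dimension on which the two cases differ most, by absolute gap.
largest_gap = max(gaps, key=lambda dim: abs(gaps[dim]))

for dim in DIMENSIONS:
    print(f"{dim}: {gaps[dim]:+.1f}")
print("Largest difference:", largest_gap)
```

A tabulation of this kind only restates the single-rater judgements reported above; it does not address the absence of inter-rater agreement.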
If the projects were to be framed as P-PEs, it is interesting to note some differences from the
hypothetical example in Figure 1. The hypothetical example was developed by
Cousins based on his experience over time with P-PE (e.g., Cousins & Earl,
1995). In the present cases more diversity was observed than would be expected.
Also, probably for a related reason, the projects were somewhat difficult to
manage. Otherwise, the P-PE experiences were similar to previously reported
experiences.
One interesting observation regarding the use
of the process framework was that intra-project variability was in evidence. Ratings
according to some process dimensions could be observed to shift over time as
was the case with the ‘control of technical decision making’ dimension in the
palliative care case. Also the nature of conflict among participants was seen
to shift to a more neutral posture over time during the evaluation in that
case. In the ELP context, advisory structures were set up and informed the
evaluation in various ways. These committees revealed diversity in
participation at an aggregate level but such diversity was seen to be more
limited at the evaluation team level. These observations may be construed as
limitations in the current application because ratings were made on an
aggregate or holistic basis. However, they speak to the dynamic nature of the
participatory process. The implications of the aforementioned limitations for
ongoing research using the framework would be to invoke longitudinal designs
that capture varying units of analysis.
Despite the limitations of the present test of
the framework, the reconceptualized version of process dimensions for
collaborative inquiry shows promise for being a helpful way to think about
collaboration. Potentially the framework could be used to guide research on
collaborative, participatory and empowerment processes, conditions affecting
them and their consequences and effects, preferably using longitudinal,
multilevel designs as mentioned above. We have argued elsewhere that such
research is badly needed (Cousins, 2003). Despite a good deal of reflective
anecdotal reporting of practice (not unlike that reported in the present paper)
more intensive empirical efforts such as in-depth case
study research and longitudinal qualitative and quantitative designs are few and
far between. Yet interest in participatory inquiry is on the rise. Further,
some studies have shown that implementation can be extraordinarily challenging
and may lead to blatantly unsuccessful outcomes. The present tool will help
researchers to clarify important implementation issues perhaps as a way of
linking these to antecedent conditions or even consequences, intended and
unintended.
The tool can also be of use to evaluation
practitioners, donor agencies and others interested in collaborative modes of
inquiry. Much is written about such processes but evidence suggests that
projects touted to be participatory are anything but. This was the clear
conclusion of a recent study of alleged participatory studies in the education
sector in sub-Saharan
Author Biographies
Lynda Weaver
For more than 2 decades, Lynda
Weaver has worked in the area of health care services research, with a focus on
program planning and evaluation. She completed a Masters of Health
Administration from the
J.
Bradley Cousins
Brad Cousins, Ph.D. (
References
Brunner, I. & Guzman, A. (1989).
Participatory evaluation: A tool to assess projects and empower people. In R.
F. Conner & M. Hendricks (Eds.), International
innovations in evaluation (pp. 9-18).
Bryk, A. (1983). Stakeholder-based evaluation. New
Directions in Program Evaluation, No. 17.
Cousins, J. B.
Cousins, J. B., Donohue, J. J.,
& Bloom G. A.
Cousins, J. B., & Earl, L. M.
Cousins, J. B., & Whitmore, E.
Garaway, G. B. (1995). Participatory
evaluation. Studies in Educational
Evaluation, 21, 85‑102.
Gregory, A. (2000). Problematizing participation: A critical review of
approaches to participation in evaluation theory. Evaluation, 6(2), 179‑199.
Huberman, M. (1994). Research utilization:
The state of the art. Knowledge and
Policy, 7(4), 13‑33.
Levin, B. (1993). Collaborative
research in and with organizations. Qualitative
Studies in Education, 6(4), 331‑340.
Mark, M., & Shotland,
R. L. (1985). Stakeholder‑based evaluation and value judgments. Evaluation Review, 9, 605-626.
Meier, W. (1999). In search of indigenous participation in education sector studies in Sub-Saharan Africa.
Unpublished Master's thesis,
Schwandt, T. A. (1997). Reading the
"problem of evaluation" in social inquiry. Qualitative Inquiry, 3(1), 4-25.
The First JMDE Essay Competition
Michael
Scriven
Zen
Buddhism was often said, by those subscribing to other varieties of Buddhism,
to be “the last guest at the table.” This was a kindly reference to its nouveau
status, since it was not part of the centuries-long Indian and then Chinese
phases in the history of Buddhism, only emerging in the last historic phase of
development, in
Despite
this skepticism about rational theology, some highly intelligent, although
disparate, efforts have been made by Western intellectual writers such as Aldous Huxley and Arthur Koestler
to express the essence of Zen. The disagreement amongst them is considerable:
it would not be hard to find sources amongst them that would deny every comment
made so far in this note, although these are remarks based on august sources in
Western history of philosophy. Robert Pirsig’s Zen
and the Art of Motorcycle Maintenance is one of the most interesting of these
high literary efforts and is perhaps unique in a respect that is not commonly
remarked: it is about evaluation. This is quite overt, since the subtitle of
the book is An Inquiry into Value, and the Platonic inscription is:
And what is good, Phaedrus,
And what is not good––
Need we ask anyone to tell us these things?
And
here is what Pirsig says about his own effort:
“I
would like not to cut new channels of consciousness but simply dig deeper into
old ones that have become silted in with the debris of thoughts grown stale and
platitudes too often repeated. “What’s new?” is an interesting and broadening
eternal question, but one which, if pursued exclusively, results only in an endless
parade of trivia and fashion, the silt of tomorrow. I would like, instead, to
be concerned with the question “What is best?” a question that cuts deeply
rather than broadly, a question whose answers tend to move the silt downstream.
There are eras in human history in which the channels of thought have been too
deeply cut, and no change was possible, and nothing new ever happened, and
“best” was a matter of dogma, but that is not the situation now.” (p. 16). The
quotes are all from the importantly altered 25th anniversary edition
(Morrow, 1999).
That
can be an inspiring thought for us in evaluation, at the start of a new
century. Evaluation is a discipline built by constructing a science out of
extensions of everyday evaluation, just as probability theory is an extension
into mathematics of everyday reasoning about games of chance. From time to time
in these pages we will revisit Pirsig’s theme and
thoughts. On this occasion, we’ll simply pose an anti-koan,
a puzzle about what is “best,” a puzzle that is intended to produce, not
laughter at the flounderings of reason but rather
reason’s best exercise in the pursuit of value. The puzzle is this:
“In the evaluation of
revolutions—political or intellectual, in medicine, in warfare, or in
education—should one use, as a basis, the values of the victor; or those of the
vanquished; or both; or neither?”
We
read all the time about revolutionary new technology, revolutionary new ways to
prevent or cure diseases, or to deal with crime or terrorism; or revolutions in
the government of countries or the treatment of the oppressed. Each of these
presents an evaluation problem for the historian or other evaluator and it’s a
notoriously difficult problem to deal with. Is there any common thread that
should run through the best approaches, whatever disciplines are involved?
Perhaps a good problem for those interested in multidisciplinary evaluation!
There
will be a small prize for the best short essay on this topic; perhaps
appropriately, a good book, suitably inscribed. The winning submission, and
perhaps others deemed amongst the best entries, will be published here. Five
hundred words should suffice; a thousand will be considered excessive in the
present context. The prize will be awarded by mid-summer, 2005; but the topic
will not be closed to further discussion in these pages thereafter. Suggestions
for the Second JMDE Essay Competition will also be welcomed and considered
carefully.
Evaluation Humor
Michael Scriven
A section of JMDE
whose importance is much greater than its size . . . maybe.
Here's an opening entry . . . could this be the
most important reference in your professional life?
The Journal
of Nondestructive Evaluation. Convince your clients that evaluation is a
positive force! Give them a copy of a good article on Appreciative Inquiry and
a copy of the Table of Contents of, for example, the current issue of JNDE (vol. 23), which follows:
1. Influence of Wall Thickness on the
Ultrasonic Evaluation of Small Closed Surface Cracks and Quantitative NDE
2. Rayleigh Wave Propagation for the
Detection of Near Surface Discontinuities: Finite Element Modeling
3. Residual Magnetic Flux Leakage: A Possible
Tool for Studying Pipeline Defects
4. Review of Advances in Quantitative Eddy
Current Nondestructive Evaluation
5. Using a Single Transducer Ultrasonic
Imaging Method to Eliminate the Effect of Thickness Variation in the Images of
Ceramic and Composite Plates
A gift subscription is probably not a good idea,
especially at $650!
Paul A.
Lamphear
The major purpose
of the association is to build individual and institutional capacity in policy,
program and project evaluation in
The first
conference of the African Evaluation Association, held in
It appears that
these lectures, the training seminars, and 80 papers that were submitted on
evaluation, helped to ‘jumpstart’ the development of the national associations.
Although there do not appear to be any evaluation journals specific to or
published in Africa, AfrEA has worked on several
projects to increase evaluation capacity and foster a consistent professional
approach for evaluators. In 2002, the association completed the "African
Evaluation Guidelines", a cultural adaptation of the
At the 2nd AfrEA conference in 2002, a variety of international
evaluators were invited, including keynote speakers Prof. Anna Madison of
Cornell University, USA, Ada Ocampo,
Leader of the Latin American Evaluation Network, and Penny Hawkins, President
of the Australasian Evaluation Association. After 5 days of trainings and paper
submissions, the association recommended that the African Evaluation Guidelines
(AEG) be adopted by all the National Networks, by Government and Public bodies,
and by UN Agencies and other Multinational Organizations performing evaluation
in
The Niger Network of Monitoring and
Evaluation (ReNSE) has over 140 documents (in
English or French) and in 2004 published its first newsletter (in
French), available from its website, to promote expertise among Nigerien
evaluators.
AfrEA
supports dissemination of monitoring and evaluation resource materials focused
on
AfrEA
has significant connections with international organizations and has activities
currently sponsored by:
African Development Bank (AfDB)
Agence
Intergouvernementale de la Francophonie
Catholic Relief Services
Canadian Institutes for Health
Canadian International Development Agency
(CIDA)
CARE International
Danish Agency for Development Assistance
(DANIDA)
Family Health International (FHI)
International Development Research Centre
(IDRC)
World Conservation
Norwegian Ministry of Foreign Affairs
UNAIDS
UNCHS
UNDP
UNICEF
UNIFEM
World Bank
The third national Conference
of the African Evaluation Association will be held in
John
S. Risley
General
Summary of Activities
The Australasian
Evaluation Society (AES) produces, and posts on their website (www.aes.asn.au), an e-newsletter
approximately twice per year. The AES also holds an annual conference, usually
in September or October. The 2004 conference is in October near
From reading the
editorials and other non-refereed articles in the Evaluation Journal of Australasia (EJA) it appears that the evaluation profession in
Evaluation Journal of
A recent editorial
in Evaluation Journal of Australasia (EJA) noted the history of AES
publications. The society launched the EJA
in 1989. Then from 1993 through 2000 AES published both EJA and Evaluation News &
Comment. In 2001 these publications merged to form the new series of EJA. The journal is published by the AES
bi-annually (though recently there have been delays in publishing new
editions). AES posts the two copies preceding their most recent issue on their
web site. The journal includes refereed and non-refereed articles, editorials,
interviews with evaluators from both within and without the region, book
reviews, research reports, and information about the annual AES conference.
Issues addressed in
EJA included much information
concerning cultural appropriateness, indigenous peoples, and diversity in
evaluation. This may be a reflection of the recent AES conference themes. There
is some material drawing distinctions about evaluation aspects specific to
Subjects of
refereed articles in recent issues of EJA
include: evaluation of options for changing port ownership in
Some of the
refereed articles had very little to say about evaluation. For example, one of
these articles (Burton & Rajan, 2002) concerned a
case study evaluation of 15 people seriously injured in workplace accidents.
The authors described the project’s goal as exploring the social and economic
consequences to society from these workplace injuries. The article discussed the
methodology of the study, the experiences of the researchers, and the lessons
learned from their research experiences. The methodology was basically a
semi-structured interview of injured workers, their family members, employers,
etc. The lessons learned by the researchers were: 1) interviewing can be
exhausting, 2) diversity of the project team was essential, and 3) it was
difficult to remain objective after seeing the suffering of the injured
workers.
One interesting
article (Sigsgaard, 2002) addressed an unusual
methodology (in evaluation research), the Most Significant Change (MSC)
methodology. The author, Peter Sigsgaard, works at a
Danish NGO called “MS” on measurement and evaluation issues. He gave examples
of his experience using MSC in evaluating partnership-based economic
development programs in
Sigsgaard
(2002) contrasts this approach with one previously used by MS in evaluating
these programs, in which they would conceive of indicators to measure and then
cast about looking for these indicators. This led to lots of time spent looking
for, and not finding, specific data.
It makes intuitive
sense to ask program consumers what changes are occurring due to the program.
It also highlights the need to be careful about how one measures program changes.
References
Sigsgaard,
P. (2002). Monitoring without indicators: an ongoing testing of the MSC
approach. Evaluation Journal of
Chris
L. S. Coryn[11]
Background and General Context of Organized
Evaluation in
The Canadian
Evaluation Society (CES)—
· Ontario
· Manitoba
· Alberta
· Quebec
Since 1991 the CES membership has grown to over 1,750 individual Canadian and student members, as well as over 100 international members (CES, 2004).
The CES offers a
wide range of resources and services for practicing evaluators and students of
the discipline including: a comprehensive Web site (available in English and
French); an evaluation report bank (academic, government, and private sector
reports); a fully-searchable database—the Grey
Literature Bank (unpublished documents of interest to evaluators); a professional
development series of workshops; an annual
conference (including the upcoming 2005 joint conference with the American
Evaluation Association); and the Canadian
Journal of Program Evaluation. The CES's efforts are strongly supported by
the Government of Canada, which has its own specialized evaluation unit: Evaluation and Data
Development (EDD). EDD is one of the largest evaluation shops in the
Federal Government of Canada, and focuses primarily on governmental initiatives,
including analysis of government policy and evaluation of government programs,
for example, Human Resources Development Canada (HRDC) programs. Other
government bodies that influence the Canadian evaluation field include
the National Science and Engineering
Research Council, the Social Science
Research Council, Transport Canada, Industry Canada, Health Canada, the Treasury Board Secretariat, and the Canadian
International Development Agency, each of which is also a sponsor of the CES.
Informed decision making is further facilitated by Statistics Canada, a
federally legislated provider of statistical data for the whole of Canada and
each of its provinces. These data are intended to inform Canadian citizens and other
key stakeholders about Canada's population, resources, economy, culture,
and society.
In the summer of
2001 the CES announced their new vision, mission, and goals for the future
(Canadian Evaluation Society Newsletter, Summer 2001):
Vision:
The Canadian Evaluation Society will be the leader for evaluation in
Goals:
1.
Leadership—To
provide leadership to individuals and organizations in support of evaluation
theory and practice in
2.
Knowledge—To
improve the state of evaluation theory and practice.
3.
Advocacy—To
promote the importance of an evaluation culture.
4.
Professional
Development—To promote and facilitate the enhancement of evaluation
capacity for members and non-members.
The CES also supports various student initiatives including
the CES
Student Case Competition and student
paper contest (for undergraduate and graduate students in the field of
evaluation). The CES Student Case Competition (initiated in 1996) is an annual
event in which teams of three to five students from
Canadian colleges and academic institutions compete in the analysis of an
evaluation case file. In a preliminary competition, all teams receive on the
same day the key to an evaluation case file that has been hidden on the Web.
They have five hours to prepare an analysis and then submit it by e-mail for
judging by an expert panel. The three best teams are invited to participate in
a final round, held at CES's annual conference, in
which they must analyze a new case and present findings and recommendations
before a live audience. The team that makes the best presentation takes
possession of the Case Competition Trophy for a year, receives prizes, and is
given visibility in various publications.
Evaluation Education Programs in
As of 2000 (CES), over 25 Canadian
institutions/colleges/universities offered more than 100 evaluation-related
courses across a wide array of academic disciplines (e.g., psychology,
political science, public administration, economics)—a complete institution,
department, and course list is available at http://www.evaluationcanada.ca/txt/outline200106.pdf.
Professional Development of Canadian
Evaluators
The CES plans to
focus on two key areas in the upcoming years: (1) professional development of
its members, and (2) advocacy on behalf of the evaluation function. The
articulation of a Core Body of
Knowledge (CBK) will guide the Society's professional development and
advocacy activities (Canadian Evaluation Society, 2004). The CBK comprises
theories, skills, and best practices that people must possess to plan, carry
out, and report on valid and reliable evaluations of programs or policies in
governments, not-for-profit organizations, and businesses.
Essential Skills. Much of the emphasis on professional
development is funneled through the CES Essential Skills Series. Regional
chapters offer this series as well as any other form of training they consider
adequate for their members. These essential skills include:
1.
Understanding
Program Evaluation
§
Key terms and
concepts
§
Benefits of program
evaluation
§
Basic steps in the
evaluation process
§
Major approaches to
program evaluation
§
Formatting
evaluation questions
§
Designing an
evaluation
§
Evaluating with
limited resources
§
Analyzing and
reporting evaluation results
§
Reducing resistance
to evaluation
§
Involving staff and
clients in the evaluation process
§
Increasing
evaluation utilization
§
Making evaluations
ethical and fair
2.
Building an Evaluation
Framework
§
Identifying who the
client is and what the client needs
§
Basic concepts of
needs assessment
§
Major approaches to
assessing client needs
§
Evaluation methods
for "getting close to the client"
§
Building an
evaluation framework through logic models
§
Involving managers
and staff in building an evaluation framework
§
Relating program
design to client needs
§
Defining program
components
§
Formulating
indicators for program success
§
Using the evaluation
framework for linking program performance to client needs
3.
Improving Program
Performance
§
Using evaluation as
a management tool for improving program performance and enhancing internal
accountability
§
Basic concepts of
monitoring and process evaluation
§
Monitoring program
performance with existing administrative data and information systems
§
Developing ongoing
data collection instruments and procedures
§
Linking process
evaluation to program decision-making
§
Assessing client
satisfaction
§
Understanding
continuous quality improvement
§
Using program evaluation
for building a "learning organization"
4.
Evaluating for
Results
§
Defining program
results
§
Major approaches to
evaluating results
§
Developing results
measures
§
Designing outcome
evaluations
§
Validity and
reliability
§
Appropriate use of
quantitative and qualitative techniques
§
Relating program
results to program costs
§
Understanding
program benefits
§
Measuring program
equity and responsiveness to community needs
§
Communicating
evaluation findings
§
Using evaluations to
improve program effectiveness and accountability
(Canadian Evaluation Society, 2004)
Certification of Evaluators in
This latter issue — developing a form of
certification for members — would be a major step for the CES. Therefore,
it was the subject of an in-depth study
of the experience of several other organizations with certification (Long
& Kishchuk, 1997). A second study, carried out in
1999, reports on a pilot
survey of clients and employers (Stierhoff, 1999) on their views
regarding certification of evaluators.
Canadian
Journal of Program Evaluation
The Canadian Journal of Program Evaluation (CJPE) was launched in 1986 and is published
twice a year (available at www.cjpe.ca). CJPE is sponsored by the CES and the
§
Articles
on all aspects of the theory and practice of evaluation, including methodology,
evaluation standards, implementation of evaluations, reporting and use of
studies, and the audit or meta-evaluation of evaluation.
§
Research
and Practice Notes that provide practical examples of the applications
of particular methodologies or procedures within the context of a particular
study or group of studies.
§
Book
Reviews of relevance to the practice in
(Canadian Journal of Program Evaluation,
2004)
Review of the past
eight issues (from Spring 2001 to Spring 2004) of CJPE revealed a number of insights into the journal's thematic
trends. The journal does, in fact, publish articles on theory,
practice, implementation, and standards. Notable examples include
Christie & Rose (2003), "The language
of evaluation theory: Insights gained from an empirical study of theory and
practice"; Levin-Rozalis (2003), "Evaluation and research: Differences and
similarities"; and Morris (2002), "The
inclusion of stakeholders in evaluation: Benefits and drawbacks."
In 2001, the CJPE devoted a special issue to
provincial evaluation policy and practice in
§
Has not acquired an identity of its own
§
Tends to neglect key issues
§
Loses emphasis on rigor
§
Is dominated by program monitoring
§
Is insufficiently connected with management
needs
Perspectives across
The Western
Canadian Perspective. (Malatest, 2004)
Strength: Development of evaluation
methodologies—in recent years the provincial and federal agencies have
recognized the requirement of good evaluation.
Weakness: Inadequate planning of
program evaluations—awareness and use of evaluation tools are often an
afterthought.
Threat: Reduced program evaluation
capacity—the ability to design and manage complex evaluation activities has
been compromised (e.g., lack of resources).
Program
Evaluation in
Strength: Growing
sophistication—evaluators are more skilled and better qualified.
Weakness: Dependence on performance
measurement—to the exclusion of more relevant, complex outcomes.
Threat: Devaluation—avoidance of
serious evaluation (e.g., focus on accountability rather than improvement).
Program
Evaluation in
Strength: Commitment—
Weakness: The paradigm—the current
approach is to assist the government in determining redirection of funding.
Threat: Capacity—Public and non-profit
organizations need to demonstrate effectiveness, yet they are limited in their
capacity to meet this demand.
Teaching
and Learning Evaluation in
Strength: Self-definitional
capacity—the time for evaluation to define itself and establish itself as a
distinct discipline is "now."
Weakness: Lack of disciplinary
focus—disciplines view evaluation differently rather than having a common
ground.
Threat: Disconnection—evaluation as
part of management is under threat (e.g., lack of common ground).
This paper is an outsider's perspective of evaluation in
References
Bradley,
S. E. (2001). Evaluation in the government of
Cabatoff, K. (2001). The long march from evaluation to
accreditation: Québec's new government management framework. The Canadian Journal of Program Evaluation,
16(special issue), 73-88.
Canadian
Evaluation Society (2004). Canadian
evaluation society website. Available at http://www.evaluationcanada.ca/site.cgi?s=1&ss=0&_lang=an
Canadian
Evaluation Society (2001-2004). Canadian
evaluation society quarterly newsletter. Available at http://www.evaluationcanada.ca/site.cgi?s=4&ss=3&lang=an
Canadian
Evaluation Society (2004). CES guidelines
for ethical conduct. Available at http://www.evaluationcanada.ca/site.cgi?section=5&ssection=4&_lang=an
Canadian
International Development Agency (2004). Canadian
International Development Agency website. Available at http://www.acdi-cida.gc.ca/index.htm
Canadian
Journal of Program Evaluation (2004). Canadian
journal of program evaluation website. Available at http://www.evaluationcanada.ca/site.cgi?s=4&ss=2&_lang=an
Cousins, J. B. (2004). Personal communication.
Evaluation
Data and Development (2004). Evaluation
Data and Development website. Available at http://www11.hrdc-drhc.gc.ca/pls/edd/hrdc.main
Gauthier,
B.,
Health
Industry
Hicks, K. (2001). Program evaluation in the government of
the Northwest Territory, 1967-2000. The
Canadian Journal of Program Evaluation, 16(special issue), 107-114.
Long, B., & Kishchuk, N. (1997). Professional certification: A
report to the national council of the Canadian evaluation society on the
experience of other organizations. Canadian Evaluation Society.
McDavid, J. C. (2001). Program evaluation in
Mowry, S., Clough, K., MacDonald, B., Pranger,
T., & Griner, D. (2001). Evaluation policy and
practice in provincial governments province of
Ross,
A. (2001). Evaluation in
Segsworth, B. (2001). Evaluation policy and practice in
Social
Sciences Research Council (2004). Social
Science Research Council website. Available at http://www.sshrc.ca/
Stierhoff, K. A. (1999). The certification of program
evaluators: A pilot survey of clients and employers. Canadian Evaluation
Society.
Statistics
Transport
Treasury
Board of
Warrack, B. (2001). Program evaluation in the
Daniela Schröter[12]
The
Landscape of European Evaluation
The umbrella organization of evaluation in
The web site of the EES provides a good overview
of the evaluation community, including lists of European and international
evaluation associations and networks, evaluation journals, events, and other
online resources. Currently,
the EES provides links to 13 national or multinational European organizations
as well as 5 regional networks within the
European Evaluation Society, Danish Evaluation Society, Finnish Evaluation Society, French Evaluation Society,
German Evaluation Society, International Program Evaluation Network, Irish Evaluation Network, Italian Evaluation
Association, Polish Evaluation Society, Spanish Public Policy Evaluation Society, Swedish Evaluation
Society, Swiss Evaluation Society, UK Evaluation Society (the following are regional UK networks: Cymru Evaluation Network, Scottish
Evaluation Network, London Evaluation Network, Midlands Evaluation Network, North West Evaluation Network),
Walloon Evaluation Society
Figure 1.
National and Multinational Evaluation Societies in
Evaluation in
The EES holds conferences
biennially. From September 30 to
The Development of
Evaluation in
Leeuw (2004) asks whether European evaluation
is still an “infant industry” and illuminates the European type of “evaluation
industry”. His book chapter serves as the foundation for the following
sections. Rist, Furubo, and
Sandahl (2002)[14]
assessed countries worldwide on eight dimensions to determine levels of
development in evaluation. The dimensions included:
·
Evaluation
activity
·
Supply
of evaluators
·
Training
capacity
·
National
discourse
·
Organized
evaluation meetings
·
Evaluation
infrastructure within the public sector
·
Evaluation
infrastructure within parliament
·
Evaluations
carried out by Supreme Audit Offices (see Leeuw 2004,
63).
While not all European countries were assessed in this
study, results indicated the most intense evaluation efforts in North and West
European countries. However, data were either insufficient or indicated only
moderate training capacity for evaluators in Europe, which, as Leeuw argues, is plausible given that
evaluation has not been well established at the university level in the form of
evaluation studies. On the other hand, national discourse and organized
meetings were available and, as indicated by other contributions in this issue
of JMDE, not only stimulate debate and
discussion but also provide platforms for training. Additionally, Rist et al. found that evaluation in the public sector was
more widely available than evaluation within parliament. Last but not least,
evaluations carried out by Supreme Audit Offices were most developed in
Historically,
The European Evaluation Market
Based on a study conducted in 1999, Leeuw
describes the European evaluation market as growing. While the
response rate in the study was rather limited, findings indicated that the
evaluation market was growing faster on the European and national levels than
in regions. Most evaluations conducted were related to policy, and respondents
indicated that the methods used usually derived from the evaluators’ specific
subject areas. Moreover, the regional evaluation market was perceived as rather
fragmented, and it was thought that international competition on the European
evaluation market would be constrained by cultural factors. For instance, Leeuw pointed out that one respondent said that it was
hard even to hire a British evaluator for an Irish setting, because of language
constraints and the need to understand different organizational cultures. On
the European level, this leads to evaluations conducted by teams of
evaluators from multiple nations. Leeuw refers to
such arrangements as “(quasi)professions” (p. 68). Moreover, Leeuw
argues that top-down processes thwart good evaluation practice. While
evaluation in
Evaluation on the
European Union Level[15]
On the European level, initial forms of program evaluation
began in the 1980s, focused on research and technology development programs,
and were based on practices prevalent in first wave evaluations. A shift
occurred in 1995, when a new evaluation scheme was introduced that demanded
evaluation of research and framework programs in the form of annual monitoring and
five-year periodic assessments. Leeuw states:
The assessments
can be understood as a combination of an ex post evaluation of the previous
program, an intermediate evaluation of the current program and an ex ante
appraisal of future activities (2004, 69).
However, while
evaluation on the Union level always focused on regulatory policy, formal
evaluation systems or databases for the Directorates General are insufficient
and “the Council and Parliament pass[ed] a small number of ‘sunset’ regulations
which include a formal evaluation clause given a deadline (especially in the
field of Competition Policy)” (Leeuw, 2004, 69). The
results of reporting, however, are neither called evaluation nor could they be classified as
such. Other foci, especially cost-effectiveness and cost-benefit
evaluations, emerged from management reforms in the 1990s and are “supervised
by the Directorate General for Budgets and Financial Control” (Leeuw, 2004, 71).
In 1996, steps for
more systematic evaluations of policies were undertaken and a “decentralized
model in which the operational Directorates General are responsible for
establishing systematic evaluation procedures for the programs they are
executing” was developed to improve evaluation practice (Leeuw,
2004, 71). As a result, each Directorate had to designate one evaluation
official who is responsible for establishing an annual evaluation plan and for
determining the programs to be evaluated. The Directorates’ evaluation plans are
assembled into the “Commission’s Annual Evaluation Program”. The Directorate
General for Budget “coordinates evaluation activities and maintains an
overview of the evaluation findings across the Commission services. It also
provides methodological guidance and support, helps with procurement
of evaluation expertise and maintains evaluation
networks within and outside the Commission” (see website).
Unique features of the Evaluation Commission include a broad definition of the
concept of evaluation and its direct link to budget:
Not only does it
[evaluation] encompass ex post and midterm evaluation, but it also cover ex
ante exercises… evaluation projects are to be framed so that they correspond to
identifiable entities in the Community budget and to be timed so that results
are available when they are relevant for budgetary decisions (Leeuw, 2004, 72).
Current Issues in
European Evaluation[16]
Leeuw
refers to different elements of current developments in
Most central topics
for evaluation within
·
The
increasing importance of evaluation for civil society
·
Evaluation
for Parliaments (Do parliament decisions have effects?)
·
Evaluation
for public policy partnerships
·
Decentralization
of evaluation
·
Potentials
for evaluation of social programs from a non-managerial standpoint
·
Evaluation
of information and communication technology products, processes, and outcomes
(web-based communication, training, the internet as knowledgebase)
·
Auditing
versus evaluation
·
Evidence-based
evaluation
·
Learning
from evaluation
·
Effective
implementation and utilization of performance management systems in public
management.
Overall, evaluation appears to be a fast-growing market in
Leeuw, F.L. (2004). Evaluation
in
News
from the community (2004). In: Evaluation:
The International Journal of Evaluation Theory, Research and Practice, 10(3):
380-381.
Stern,
Elliot (Ed.). Evaluation: The
International Journal of Evaluation Theory, Research and Practice, 9(4)-10(3).
Stern, Elliot (2004). What shapes European evaluation: A personal reflection. In: Evaluation: The International Journal of Evaluation Theory, Research and Practice, 10(1): 7-15.
The European Evaluation Society (2004). The European Evaluation Society website. Available at: http://www.europeanevaluation.org/
John
S. Risley
General Summary of Activities
The UK Evaluation
Society (UKES; www.evaluation.org.uk)
was founded in 1994 and is composed of over 150 individual and corporate
members. Most of these are individual members. UKES hosts a conference
each December and jointly conducts seminars and conferences with other
professional organizations. The society also sponsors an e-mail discussion
list, Eval Chat, publishes a thrice yearly
newsletter, The Evaluator, and
produces Evaluation: The International
Journal of Theory, Research and Practice.
UKES has five
regional networks. Three of these networks, the Scottish Evaluation Network,
the London Evaluation Network, and the North West Evaluation Network are
established. The other two, the Cymru Evaluation
Network (
The UKES website
offers a host of information and links on evaluation topics, including:
·
evaluation guidelines for good practice from
different national evaluation associations,
·
a list of postgraduate courses on evaluation
taught throughout the
·
links to 21 national/regional evaluation society
websites,
·
an evaluation glossary (including an entry on
“chatty bias”)
·
a short but wide-ranging bibliography of
evaluation books
Evaluation:
The International Journal of Theory, Research and Practice
The journal Evaluation is published quarterly by
Sage. Through the end of October it is available free online at
evi.sagepub.com. I reviewed the last two years of Evaluation (the January 2003 issue through the July 2004 issue) and
categorized each article according to Lori Wingate’s adaptation of Michael Scriven’s analogy for understanding disciplines. Wingate
identified four categories of focus for journal articles—practice, methods,
theory, metatheory—that I used below and one
category—history—that I eliminated because no articles fit the description.
Practice issues
dominated the 37 articles from the last two years (48.6 percent). The practice
articles mainly dealt with the related issues of evaluation use and stakeholder
participation. An article by Taut & Brauns (2003)
examines social and psychological explanations for resistance to evaluation and
offers strategies for overcoming evaluation resistance.
Many articles I
categorized in the practice area concerned evaluation in different
fields—healthcare, bidding for public services, welfare policy. These articles
did not discuss different evaluation approaches or models, so I did not
categorize them under theory.
Over one-fifth of
the articles (21.6 percent) concerned theory. Three of these eight articles
concerned theory-based evaluations—with two generally favorable and one
generally unfavorable toward the approach—while other evaluation approaches
addressed included qualitative, desk screening and implementation evaluation.
Hearn, Lawler and Dowswell (2003) addressed the
dominance of the positivist approach to most healthcare evaluation and argued
that an inclusion of “nonpositivist, qualitative, and
process-oriented evaluation” would improve our understanding of health programs
and policies.
I categorized six
articles (16.2 percent) as methods articles. Interestingly, all of these
articles focused on quantitative methods of data collection and analysis. Sverdrup (2003) discussed the use of time-series databases
of complaints data to evaluate laws and regulations.
The metatheory category included five articles (13.5 percent)
across 2003-2004. Virtanen and Uusikylä
(2004) address the “paradigm crisis” in evaluation that stems from evaluators’
different assumptions about causality. These authors describe four alternative
models (which they term ideal models) for evaluation considering: 1) how
explicitly causality has been taken into account, and 2) how well the model
enhances public-sector accountability.
The model
reflecting both a strong link between causality and the evaluation design and
an emphasis on public accountability is termed “transparent democracy”.
“Scientific inquiry” signifies a strong link between the evaluation design and
causality without an emphasis on accountability. The “explorative inquiry”
model is characterized by a high degree of emphasis on accountability and a
difficulty in distinguishing causal effects. Finally, an evaluation using the
“symbolic evaluation” model serves a symbolic purpose rather than a “true
pursuit of learning” (89).
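The category shares reported above can be re-derived from the article counts. The sketch below is a minimal check, not part of the original review. It assumes 18 practice articles, a count inferred from the reported 48.6 percent of 37 (the review states only the percentage), alongside the eight theory, six methods, and five metatheory articles given in the text. The function and variable names are illustrative.

```python
# Article counts by category for Evaluation, January 2003 through July 2004.
# The practice count (18) is an assumption inferred from the reported
# 48.6 percent of 37 articles; the other counts are stated in the review.
counts = {"practice": 18, "theory": 8, "methods": 6, "metatheory": 5}


def category_shares(category_counts):
    """Return each category's share of all articles, in percent, rounded to one decimal."""
    total = sum(category_counts.values())
    return {name: round(100 * n / total, 1) for name, n in category_counts.items()}


shares = category_shares(counts)
total_articles = sum(counts.values())
for name, pct in shares.items():
    print(f"{name}: {counts[name]} of {total_articles} articles ({pct} percent)")
```

Under these assumptions the four counts add up to the 37 articles reviewed, and the rounded shares match the percentages quoted in the text. Because of rounding, the shares sum to 99.9 rather than 100 percent.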
References
Hearn, J., Lawler,
J., & Dowswell, G. (2003). Qualitative
evaluations, combined methods and key challenges: General lessons from the
qualitative evaluation of community intervention in stroke rehabilitation. Evaluation. 9: 30-54.
Sverdrup,
S. (2003). Towards an evaluation of the effects of laws: Utilizing time-series
data of complaints. Evaluation. 9:
325-339.
Taut, S., & Brauns, D. (2003). Resistance to evaluation: A
psychological perspective. Evaluation.
9: 247-264.
Virtanen,
P., & Uusikylä, P. (2004) Exploring the missing
links between cause and effect: A conceptual framework for understanding
micro–macro conversions in programme evaluation. Evaluation. 10: 77-91.
P. Cristian Gugiu
The state of evaluation in
Compared to
Evaluation Journals and Newsletters
East European Journals
Several representatives of the European
Evaluation Society (EES) report that no one knows of any journal or newsletter
publications in
According to Barbara Rosenstein, Ph.D., Chairperson of the
Israeli Association for Program Evaluation (IAPE), the IAPE has published, to
date, eight newsletters on evaluation, in both Hebrew and English.
Israeli Journal: Studies in Educational Evaluation
Studies in Educational Evaluation (SEE) is published in English. The majority of articles were not
written by Israelis. Authors were dispersed throughout the world, including
the
A great many of the articles were purely research articles;
a few described an evaluation case study, and a fair number
discussed a specific methodology that could be used in evaluation.
Evaluation Societies
European Evaluation Society (http://www.europeanevaluation.org/)
The primary goal of the European Evaluation
Society (EES) is to promote theory, practice and utilization of high quality
evaluation especially, but not exclusively, within the European countries. This
goal is obtained by bringing together academics and practitioners from all over
EES held its sixth conference on September 30 to
Over three-quarters of the
presenters came from West European countries including Belgium (2.4 percent),
Denmark (2.7 percent), Finland (4.2 percent), France (4.8 percent), Germany
(9.3 percent), Greece (0.3 percent), Iceland (0.3 percent), Ireland (2.1
percent), Italy (15.9 percent), Netherlands (5.4 percent), Norway (1.5
percent), Portugal (2.7 percent), Spain (5.7 percent), Sweden (5.7 percent),
Switzerland (4.5 percent), and the United Kingdom (9.0 percent). The remaining
presenters came from countries in Asia (Japan, 0.6 percent; Korea, 0.9
percent), Australasia (Australia, 2.4 percent; New Zealand, 0.6 percent),
Africa (Angola, 0.3 percent; Guinea Bissau, 0.3 percent; Kenya, 0.3 percent;
Nigeria, 0.9 percent), East Europe (Austria, 2.4 percent; Bosnia and
Herzegovina, 0.3 percent; Czech Republic, 0.3 percent; Poland, 1.2 percent), the
Middle East (Egypt, 0.3 percent; Israel, 0.3 percent; Palestine, 0.3 percent),
North America (Canada, 0.9 percent; United States, 5.1 percent), and Latin
America (Colombia, 0.6 percent; Mexico, 1.8 percent).
There were slightly more male presenters than
female presenters.[17]
However, this statistic was primarily influenced by the large number of West
European presenters. Five of the seven other regions had an equal or greater
number of female presenters than male presenters.
An examination of the type of jobs presenters
worked in revealed that the majority of them worked for a university or college
in their native country. The two next largest groups included people who worked
in private industry or for the government.[18]
It was interesting to note the differences in distribution of job type among
the eight regions. For seven of the eight regions, presenters typically worked
at a university. However, for
Polish Evaluation Society (http://www.pte.org.pl/)
The Polish Evaluation Society (PES) began in 2001
and set out to build an evaluation culture and popularize evaluation as a
social and democratic process. To this end, it sought to (a) organize studies,
courses and training; (b) conduct evaluation research; (c) exchange
experiences with other societies, institutions and organizations; (d) organize
meetings, seminars and conferences; (e) publish in the area of evaluation; and
(f) provide consulting and advising services.
The Polish Evaluation Society has very strict
rules as to the educational qualifications of its members. Most members are
still strongly connected with the academic environment, either via didactic
activity or scientific research (
Members of PES are professional evaluators who
also conduct marketing research and other social research. They
have wide experience in the field of evaluation, gained by
conducting a variety of research for Polish and international
organizations such as the Polish Children and Youth Foundation and the Public Interest
Institute; government organizations such as the European Integration Committee and the
Ministry of Education; service sector companies such as Daewoo; and EU
institutions such as the European Parliament and the European Commission. Members of
PES use different paradigms and research perspectives. The rich variety of
activities and approaches is an advantage of this organization.
Romanian National Assessment and Examination Service (http://www.edu.ro/snee.htm)
The National Assessment and Examination Service
(NAES) was established in 1998 by the Romanian Government as the first
national, independent body providing professional expertise in educational
assessment and examinations in
NAES is actively involved in national and
international projects (e.g. the British Council, QUATRO Fontys—PTH Eindhoven) and maintains professional contacts with
universities, research institutes, governmental and nongovernmental
institutions and organizations in the field (e.g. CITO—The Netherlands, EDC—USA
etc.). Their headquarters in
Israeli Association for Program Evaluation (http://www.iape.org.il/)
The Israeli Association for Program Evaluation
(IAPE) is a non-profit, professional organization comprised of academics,
practitioners and users of program and project evaluation in a variety of
fields, including psychology, education, social services, health, business,
and others. The goals of the organization include (a) increasing the use of
program evaluation and its findings; (b) encouraging the development of the
theory of program evaluation; (c) advancing the essential recognition of
program evaluation as a means of improving the effectiveness of social and
educational interventions; (d) promoting the recognition of program evaluation
as a profession; (e) serving the communities and the populations involved in
program evaluation; (f) contributing to the influence of program evaluation on
decision making; (g) supporting and influencing evaluation practice in Israel;
and (h) creating and developing professional ties among evaluators and users of
evaluation in Israel. To this end, the IAPE has sought to (a) organize
conferences focusing on issues of concern to the evaluation community; (b)
create an electronic and regular mail network that provides information about
issues concerning evaluation in Israel and abroad; (c) establish connections
with evaluation organizations throughout the world; (d) participate in the
worldwide forum of evaluation associations; (e) circulate a list of members
to evaluation consumers in Israel; and (f) publish a newsletter containing
articles, discussions, and events of interest to the evaluation community in
Israel.
Thomaz Chianca and Brandon Youker[19]
In the past ten
years, evaluation, as a professional field, has undergone significant
development in several countries in
The first
professional evaluation organization formed in the region was the
Central American Evaluation Association (ACE), established in 1989, which has since had its
headquarters in
Only eight years
later, in 2002 new evaluation professional organizations were established in
LAC. Given their specific contexts,
PREVAL (Program for Strengthening the
Regional Capacity for Evaluation of Rural Poverty Alleviation Projects in Latin
America and the Caribbean) is a joint effort between IFAD and, from 1995 to 2000, the
Inter-American Institute for Cooperation on Agriculture (IICA), and, from
2000 to 2007, the Centro de Estudios para la Promoción del Desarrollo (Center of Studies for Development Promotion). It has
played a strategic role in the region since 1995, contributing directly to the
creation of the national evaluation networks in Peru and Colombia. In its first
two phases (1995-2000 and 2000-2004), PREVAL focused its work on strengthening
the evaluation capacity of IFAD projects to reduce rural poverty in the region.
In its third phase (2004-2007), PREVAL will broaden its objectives to work more
closely with governments, organizations offering technical assistance in
monitoring and evaluation, as well as national evaluation and monitoring
networks/associations in the region. PREVAL has established an important network
of evaluators working with projects aimed at alleviating rural poverty, and has
produced an important body of knowledge in this area published in Spanish. It
is also important to recognize the key role played by the International
Organization for Cooperation in Evaluation (IOCE)—comprising all national
and regional evaluation organizations around the globe—in fostering the
establishment of evaluation organizations in the region. IOCE held an important
planning meeting in
In September 2003,
representatives from the four existing evaluation organizations in the region
got together in
There are at least four
electronic discussion lists on evaluation in the LAC region: RELAC, PREVAL, the
Brazilian Evaluation Network, and ILPES/CEPAL.
It is not
over-optimistic to assume that very soon we will witness a significant increase
in the number of professional evaluation organizations in LAC.
Use of Professional Evaluation in Key Societal Sectors in LAC
There has been
significant growth in the use of professional evaluations by the government,
the nonprofit sector, and, at least in the field of personnel evaluation,
large private businesses. In the government arena, initiatives related to
national educational evaluation/assessment systems, innovations in government
administration systems, and social development programs supported by
international cooperation agencies are major factors influencing such growth.
In education, the
establishment of evaluation mechanisms has been extensive from basic (K-12) to
higher education in many countries within the region. In
The idea of
reducing the size of the state and making it more effective and efficient
(state reform) has strongly influenced virtually all countries in the region.
Such an idea brings along a strong push for the establishment of control
systems on expenditures as well as for implementation of planned activities
that usually involve monitoring and, to some extent, evaluation. Several
countries have created structures, usually subordinated to the ministry of
planning, that are in charge of dealing with monitoring and internal evaluation
of governmental efforts. Examples of such structures are the System of
Information, Evaluation, and Monitoring of Social Programs (
In the area of
social development, virtually all programs supported by international
cooperation agencies such as the World Bank, Inter-American Development Bank,
World Health Organization, and United States Agency for International
Development (USAID), are required to be evaluated both internally and by
external evaluators. These organizations have played a major role in
introducing innovations in evaluation as well as advocating for the use of
quality professional evaluations within government funded initiatives. Several
examples of such evaluations are already publicly available from the agencies’
websites (e.g., USAID and the World
Bank Operations Evaluation Department [OED]). The Latin American Institute
for Social and Economic Planning (ILPES),
subordinated to the United Nations Economic Commission for Latin America and
the Caribbean (CEPAL), has been an
important reference in providing evaluation support to country-level government
evaluators by offering supporting materials (publications); evaluation
training; and networking opportunities for professionals working in evaluations
of governmental social-development programs in the region.
Initially
influenced by international foundations investing in the region, the fast-growing
nonprofit sector in
The W.K. Kellogg Foundation is one of the
international foundations that have significantly invested in the development
of evaluation in LAC. In 1995 and 1997, the foundation sponsored two groups of
LAC evaluators (a total of approximately 40 professionals) in in-depth training
programs in evaluation at The
Evaluation Center, Western Michigan University. Some of the participants
in these training programs are assuming leadership roles in evaluation and in
the creation of evaluation organizations in their home countries.
Several foundations
and institutes are commissioning and/or developing evaluations throughout the
region. In Brazil, for instance, some of the nonprofit organizations that are
very active in evaluation include: Fundação Carlos Chagas, Fundação Cesgranrio, Instituto Ayrton Senna, Fundação ABRINQ, Instituto FONTE,
Fundação Roberto Marinho, and Fundação IOCHPE.
Another interesting
movement influencing the growth of evaluations in the third sector is the
increasing number of private businesses investing in social initiatives, based
on the idea of social responsibility. Such organizations have a different
culture (focus on control and efficiency) from the nonprofit organizations investing
in the sector, and have made an important push to support the establishment of
monitoring and evaluation systems in the initiatives sponsored by them. In
The extent of
evaluation use in the private sector is not well publicized. It is evident that
several corporations and other private businesses have made serious efforts to
evaluate their products, projects and personnel. Reports on such efforts,
however, are not easily accessible, and the evaluators working in this area have
almost no contact with evaluators working in the public and nonprofit
sectors. A more extensive exchange of experiences between these
professionals would likely benefit all, but important
barriers, such as prejudices on both sides (e.g., ‘the private sector only looks at
profits’; ‘public and nonprofit organizations are always inefficient’), need to be overcome
before such a rapprochement has any chance of succeeding.
Body of Original Publications in Evaluation
Though there are
virtually no evaluation-specific journals in LAC, there are several journals
related to education, health, and social sciences with strong evaluation
content. Some examples include:
· La Revista de Ciencias Sociales (Journal of Social Sciences—Costa Rica)
· Revista Ensaio – Avaliação e Políticas Públicas em Educação (Evaluation and Public Policy in Education—Brazil)
· Estudos em Avaliação Educacional (Educational Evaluation Studies—Brazil)
· Cadernos de Saúde Pública (Journal of Public Health—
· Revista Avaliação Psicológica (Journal of Psychological Evaluation—
· Revista da Rede de Avaliação Institucional (Journal of the Institutional Evaluation of Higher Education Network—
· Cuadernos de Investigación de la Escuela de Gerencia Social (Journal of Inquiry of the School of Social Management—Venezuela)
· Revista del Instituto de Investigaciones en Ciencias de la Educación (Journal of the Education Science Investigation Institute—Argentina)
· Acción y Reflexión Educativa (Educative Action and Reflection—Panama)
· Planejamento e Políticas Públicas (Planning and Public Policy—
· Revista de Administração Pública (Journal of Public Administration—Brazil)
The footnoted
social science journals[21]
have regularly published the intellectual products of LAC evaluators.
It is critical to
acknowledge the substantial collection of accessible evaluation publications
such as books, manuals, newsletters, technical reports, etc. that are available
in most Latin American countries. Several
websites, such as those of the Latin American Institute for Social and
Economic Planning (ILPES) and the Programme for
Strengthening the Regional Capacity for Evaluation of Rural Poverty Alleviation
Projects in Latin America and the Caribbean (PREVAL), provide extensive
collections of evaluation publications from throughout the
region.
There are two
excellent annotated bibliographies that provide published reference materials
that address several aspects of evaluation in LAC. The first publication, The Annotated Bibliography of International Programme Evaluation, edited by Russon
& Russon[22]
has a chapter by Antoinette B. Brown and Ada Ocampo, on
There are a few LAC
universities and training institutions that offer programs
specifically in evaluation. At the master’s level, at least five
universities offer such programs:
· Professional Masters in Evaluation of Social Programs and Projects. Universidad de Costa Rica.
· Masters in Socio-Economic Evaluation of Investment Projects. Universidad Panamericana.
· Masters of Science in Project Management and Evaluation. University of the
· Masters in Project Evaluation. Universidad del CEMA,
· Masters in Social Projects Evaluation. Universidad Autónoma de Guadalajara. Guadalajara, Jalisco, México.
At the
certification level there are quite a few programs offered in different
countries including:
· Course on Evaluation of Social Programs and Projects. Centro de Empreendedorismo Social e Administração em Terceiro Setor—CEATS (
· Diploma in Evaluation of Projects. Universidad de Concepción.
· Diploma in Evaluation of Social Projects. Pontificia Universidad Católica de Chile.
· Diploma in Planning and Evaluation of Projects. Universidad de Chile.
· Diploma in Planning and Evaluation of Socioeconomic Projects. Centro de Análisis y Evaluación de Política Pública—CAEP—Monterrey, Mexico. (
· International Certificate of Project Planning, Evaluation, and Management—Inter-American Development Bank. Centro de Investigaciones Territoriales
· Post Graduate in Formulation and Evaluation of Projects. Universidad Americana.
There are also several
short-term evaluation training courses facilitated by different organizations
within the region. Some of the best sources to identify such training
opportunities include: (a) Nota Informativa
del ILPES sobre Evaluación
de Proyectos y Programas (ILPES
Informative Note on Program and Project Evaluation); (b) PREVAL; and (c) FONTE Institute. The following is a
sample of recently offered short-term courses in some LAC countries:
· X International Course on Planning and Evaluating Public Investment Projects. Offered by CEPAL/ILPES. Sept 27 to Oct 22,
· Internet-based course on Planning and Evaluation of Agricultural and Agri-Industrial Projects. Offered by REDCAPA and Austral University of Chile. Sep 1 to
· International Course on Logic Model, Monitoring and Evaluation. Offered by ILPES/CEPAL and the Spanish Cooperation Agency (AECI). Jun 21 to
· International Course on Use of Socio-Economic Indicators for the Evaluation of Impact of Poverty Reduction Programs. Offered by ILPES/CEPAL and the Spanish Cooperation Agency (AECI). May 3-14.
· Utilization-Focused Evaluation by Michael Quinn Patton. Sponsored by the Brazilian Evaluation Network, UNICEF-Brazil, and FONTE Institute.
· Collaborative Evaluation by Rita O’Sullivan. Sponsored by the Brazilian Evaluation Network, UNICEF-Brazil, and FONTE Institute.
The report makes no
claim to be comprehensive and lacks significant information, mainly about
the state of the art of the evaluation field in the
It does, however,
provide unquestionable evidence of the impressive advances the whole region has
made in the evaluation field in the recent past. Although the degree
varies from country to country, it is reasonable to say that the basic conditions
have been established to make such advances even more comprehensive and
effective in the future.
The current efforts
to establish national and regional evaluation organizations, the growing number
of quality publications in both Spanish and Portuguese on evaluation, the
increasing use of professional evaluation by different organizations in all
societal sectors and the broad recognition of evaluation as important for
improving society are some of the factors influencing such advances. One major
challenge still to be faced in order to have evaluation in a better position as
a recognized professional field is the creation of more formal graduate-level
training for evaluators in a wider range of countries.
This paper is a
work in progress that will be modified and/or improved as we gain new
information. If you would like to provide additional information or point out
any errors or misunderstandings in the text, please do not hesitate to contact Thomaz Chianca (thomaz.chianca@wmich.edu) or Brandon
Youker (brandon.w.youker@wmich.edu).
Lori A. Wingate
AJE Web site: http://www.sciencedirect.com/science/journal/10982140
The American Journal of Evaluation (AJE) is the flagship publication of the
American Evaluation Association, the world’s largest organization for
professional evaluators. As such, AJE
plays an important role in defining the relatively young discipline of
evaluation and influencing the work and thought of many practicing evaluators,
many of whom have never had any formal training in evaluation.
In the Evaluation Thesaurus, Scriven
(1991) provides an analogy for understanding how various disciplines, and the
levels of activities within those disciplines, relate to one another. In this
analogy, he suggests we think of disciplines as estates in the “country of the
mind.” He explains, “The houses on an estate have a ground floor representing
applied work; a floor above that which is devoted to developing instruments,
methods, and techniques, and a top floor where the theoretical work is done. Up
in the attic, out of sight for most of the time, is the den of metatheory” (pp. 13-14).
I used this framework to analyze the contents of AJE articles from Spring 2003 through the present issue (Autumn 2004). I categorized the 65 articles according to whether they focused on practice, methods, theory, or metatheory, plus one additional category, history. The breakdown is shown in Figure 1. Below I describe these categories and summarize the associated articles, highlighting those I believe to be the most important.
Figure 1. Focus of 2003-2004 AJE articles
Practice
“Practice” articles
deal with ways of working with stakeholders and clients, ethical challenges,
evaluation contexts, managerial aspects of evaluation, and evaluation use.
Almost half (46.2 percent) of the articles published in AJE since 2003 focus primarily on such practical aspects of the
evaluation profession.
Eight of the 30
articles in the Practice category are part of AJE’s “Ethical Challenges”
series, in which the section editor, Michael Morris, presents a brief scenario
in which an evaluator faces an ethical challenge. In response, two
commentators, in two separate articles, analyze the nature of the ethical
problem and describe what they believe to be the appropriate response by the
evaluator in the scenario, especially in light of the American Evaluation
Association’s Guiding Principles for Evaluators and The Program Evaluation Standards by the Joint Committee on Standards for Educational Evaluation
(1994).
Seven articles in
the Practice category focus on evaluation use, with five of these appearing as
a series in a single issue. These use-oriented articles explore the many facets
of evaluation utilization. They provide exemplars of useful evaluation,
identify factors that promote and impede evaluation use, and weigh the
sometimes conflicting values of evaluation utility and scientific rigor.
Evaluation is an inherently applied discipline—intended to be used—but it is
something that many people shy away from, or downright fear. Given these
conflicting conditions, it is no surprise that many evaluators are interested
in improving evaluation utilization. I categorized two other use-oriented
articles (by Henry [2003] and Henry and Mark [2003]) in the “Metatheory” category, because they go beyond the practical
issues related to use and venture into a theory about evaluation influence,
which I discuss in greater detail in that section.
The remaining articles
that I included in the Practice category address a variety of issues that have
emerged out of the experience of real people engaged in the practice of
evaluation—for example, how certain evaluation contexts present particular
challenges or opportunities, the managerial aspects of evaluation (e.g.,
contracts, resource constraints), and how to communicate effectively with
stakeholders. One article that stands out as particularly useful is by Bamberger, Rugh, Church, and Fort (2004). They offer several practical
solutions for common problems that evaluators face when working under severe
constraints. Their recommendations are most relevant for impact evaluations in
which the use of control groups, baseline data, and random sampling would be
ideal but not feasible due to timing, resources, and/or availability of data.
Articles focusing
on practice offer readers insights into the real world of evaluation, where
textbook methods and theory meet politics, red tape, ethical dilemmas, and
stakeholders and clients who may or may not be interested in participating in
evaluation or using its results. These types of articles provide readers with
opportunities to learn from others’ mistakes and successes in the uncertain
world of evaluation practice. They offer students and established evaluators
insights into how evaluation happens in the real world—lessons often not
provided in textbook expositions on theory and methods.
Methods
“Methods” articles
focus on a particular approach to data gathering and/or analysis. Seventeen of
the 65 AJE articles (26.2 percent) deal primarily with methods. Such articles
typically describe an innovative method or a modification of an existing
method. These articles were equally divided between qualitative (8) and
quantitative methods (8), with one article featuring a blend of both.
The qualitative
methods covered by the articles include concept mapping, site visits,
qualitative phone interviewing, the “most significant change” technique,
methods for reconstructing and analyzing program theories, the Delphi
technique, methods of values inquiry, and methods for formatively evaluating
educational technology.
Four of the eight
articles on quantitative methods discussed methods used to overcome problems
associated with randomized controlled trials, including the use of longitudinal
data on program outcomes to estimate program effects, two different methods for
analyzing impacts on beneficiary subgroups, and an approach for blending
experimental and quasi-experimental methods. Other articles focused on the
development of intervention-specific measures, techniques for assessing the
quality of program implementation, and the use of post-plus retrospective
pretests for measuring change.
The one article
that focused on a method that incorporates the use of both qualitative and
quantitative data described the development and use of a rubric for evaluating
collaboration.
Methods articles
highlight innovative and cutting-edge approaches to evaluation data gathering
and analysis. Journal articles and professional conferences are probably the
most important ways practicing evaluators learn about new and useful methods.
The methods are typically described in the context of a particular evaluation,
which may help readers to discern the method’s applicability to the areas in
which they work.
Theory
“Theory” articles
center on the use of a particular evaluation approach or model. Evaluation
theory was the focus of just two articles (3.1 percent) published in AJE since 2003. One provides an in-depth
look at an evaluation that blended two approaches to evaluation: theory-driven
and utilization-focused. The other theory-focused article offers an adaptation
of David Fetterman’s empowerment evaluation model
(by Carolyn Sullins, a Senior Research Associate at
The Evaluation Center). Both deal with practical applications of theory, but
the emphasis is on the applied theory, rather than the specific methods or
findings. (There are other AJE
articles that feature the use of a particular theory, but the thrust of these
articles is on practice, not theory.)
No articles in the
timeframe examined (2003-2004) focused exclusively on an evaluation
theory/model/approach in its pure form. As Christie and Alkin
(2003) remark in their article about using a theory-driven approach in a
user-oriented evaluation, “theories are rarely, if ever, flawlessly translated
into practice” (p. 381). Given this, “in order to develop a deeper
understanding of how evaluation theories are best applied in practice, it is
important to describe cases where evaluation theories have been used in
practice” (p. 381). That, indeed, is the nature of these two Theory articles.
It was somewhat
surprising to me that only 2 articles out of 65 focused purely on evaluation
theory. It is an important area of inquiry that would seem to warrant more space in AJE.
Metatheory
Scriven
(1991) defines metatheory as a “‘theory’ about the
nature of a field of inquiry, engineering, or craft. It deals with matters such
as the definition of the field’s boundaries, its differences from neighboring
fields or disciplines, the reason why certain methods work well for it and
others are inappropriate…..it is the self-concept of the discipline” (p. 232).
Seven (10.8 percent) articles in AJE
directly discuss or contribute to the evaluation discipline’s self-concept, or metatheory.
Two of the Metatheory articles focus on use. Both articles address
the issue of evaluation use not simply as a practical matter, but as a sort of
lens through which we can view the role of the evaluation discipline. Henry and Mark
(2003) address the shortcomings in the existing literature on evaluation use,
particularly the “inattention to the intrapersonal, interpersonal, and society
change processes through which evaluation findings and process may translate
into steps toward social betterment” (p. 294). They urge evaluators to look
beyond immediate use of findings as the primary utilitarian purpose of
evaluation, and instead focus on social betterment as the ultimate desired
outcome. They outline a general theory of evaluation influence. Similarly,
Henry (2003) offers several examples of evaluations that have been influential
and offers a “clearer picture of what evaluation should look like in the
future” (p. 515).
Two articles that I placed in the Metatheory category have to do with evaluation education. These articles do not directly contribute to the metatheory of evaluation in terms of content, but the way in which and what students and others learn about evaluation (its practice, methods, theory, and history) is probably the primary vehicle by which evaluation metatheory develops. One article provides an overview of a one-year evaluation course that employs a mentoring approach. The other, by Christie and Rose (2003), provides an account of an informal discussion group. This group, facilitated by Marvin Alkin at UCLA, includes both students and faculty members who meet every other week to discuss an article in a recent issue of the American Journal of Evaluation. In addition to providing a venue in which members can share and test ideas, relate theory to practice, refine thinking, and hypothesize (among other things), the group also promotes socialization into the field. Such groups, write Christie and Rose, “are an alternative mechanism for encouraging the kinds of dynamic dialogue that facilitates the advancement of both theoretical and practical notions of a field, such as evaluation, that is so dependent upon the interchange of ideas” (p. 238).
In his article on
the Joint Committee evaluation standards, Stufflebeam
(2004) addresses the applicability of the Program, Personnel, and Student
Evaluation Standards to other cultural contexts. These are essentially
standards for evaluation practice, but they have played an important role in
shaping the field’s self-concept. At issue is whether the Standards can or
should be transferred to other cultural contexts, and Stufflebeam
argues they should not. The widespread interest in doing so is a testament to
the Standards’ relevance to the discipline’s self-concept.
Stake (2004) addresses
the role of advocacy in evaluation. He outlines six types of advocacies found
to some extent in most evaluations. Roughly, they are advocacy for (1) a
program’s success, (2) the evaluation discipline, (3) rationality, (4)
evaluation use, (5) the alleviation of underprivilege,
and (6) democracy. He argues that these advocacies shape evaluators’
interpretations of findings, which “are enriched by personal experience”
(p. 107). He concludes the article by stating, “Comprehensive, idiosyncratic interpretations
are small steps toward saving the world” (p. 107).
The final article dealing with metatheory
views evaluation itself as an important object of inquiry and provides a
framework for researching the processes, contexts, obstacles, and knowledge claims
in public sector evaluations. In this article, Segerholm
(2003) reviews existing research on evaluation and concludes that it is “fairly
scarce” and tends to focus on particular aspects of the evaluation cycle (i.e.,
initiation, implementation, results, and utilization) (p. 356). Likewise, she
notes, metaevaluations (evaluations of evaluations)
usually focus on a single evaluation. Segerholm
argues that we need more research on evaluation to “gain knowledge and a more
thorough understanding of the phenomenon and practice of evaluation in
general” (p. 357).
History
In addition to Scriven’s disciplinary categories of practice, methods,
theory, and metatheory, I added History as a fifth
category. I found this to be necessary because articles that focus on the
development of the evaluation field cut across all the other categories,
dealing with evaluation practice, methods, and theory, as well as influential
personalities in the field; groundbreaking evaluations; important books; and
key agencies, organizations, and educational institutions. These articles also contribute to the
development and refinement of evaluation’s metatheory,
since they help interpret and shape the field’s self-concept. Nine (13.8
percent) of the AJE articles since
2003 delve into the history of evaluation.
Most of the
articles included in this category (6 out of 9) are oral history accounts of
evaluation leaders collected for The Oral History Project—an effort by Robin
Miller, Jean King, Melvin Mark, and Stacey Stockdill
to document the “genealogy” of program evaluation. These oral history articles
have featured interviews with Lois-ellin Datta and William Shadish, as
well as brief articles by Laura Leviton, Roger Straw,
Charles Reichardt, and Melvin Mark, who reflect on
their experience in the Methodology and Program Evaluation program in the
Psychology Department at Northwestern. Additional evaluation leaders will be
featured in future issues, leading to the compilation of a rich and detailed
history of the development of the evaluation field.
Margaret Mead’s
evaluation of the 1947 Salzburg Seminar on American Civilization is the focus
of the three other History articles.
References
Bamberger, M., Rugh, J., Church, M., & Fort, L. (2004). Shoestring evaluation: Designing impact evaluations
under budget, time and data constraints. American Journal of Evaluation, 25(1), 5-37.
Christie, C. A.,
& Alkin, M. C. (2003). The
user-oriented evaluator’s role in formulating a program theory: Using a
theory-driven approach. American
Journal of Evaluation, 24(3), 373-385.
Christie, C. A.,
& Rose, M. (2003). Learning about evaluation through dialogue: Lessons from
an informal discussion group. American
Journal of Evaluation, 24(2), 235-243.
Henry, G. T. (2003).
Influential evaluations. American Journal
of Evaluation, 24(4), 515-524.
Henry, G. T., &
Mark, M. M. (2003). Beyond use: Understanding evaluation’s influence on
attitudes and actions. American Journal
of Evaluation, 24(3), 293-314.
Joint Committee on
Standards for Educational Evaluation. (1994). The program evaluation standards (2nd ed.).
Scriven,
M. (1991). Evaluation thesaurus.
Segerholm,
C. (2003). Researching evaluation in
national (state) politics and administration: A critical approach. American Journal of Evaluation, 24(3),
353-372.
Stake, B. (2004).
How far dare an evaluator go toward saving the world? American Journal of Evaluation, 25(1), 103-107.
Stufflebeam,
D. L. (2004). A note on the purposes, development, and applicability of the
Joint Committee Evaluation Standards. American
Journal of Evaluation, 25(1), 99-102.
Daniela C. Schröter
Evaluation
is a quarterly, European-based journal that, in addition to interdisciplinary
and multidisciplinary peer-reviewed articles, occasionally provides Special Issues, Visits to the World of
Practice, News from the Community, Book
Reviews, Speeches and Addresses, and Debates,
Notes, and Queries.