Journal Article
Research Support, Non-U.S. Gov't

Selecting and Simplifying: Rater Performance and Behavior When Considering Multiple Competencies.

THEORY: Assessment of clinical competence is a complex cognitive task with many mental demands often imposed on raters unintentionally. We were interested in whether this burden might contribute to well-described limitations in assessment judgments. In this study we examine the effect on indicators of rating quality of asking raters to (a) consider multiple competencies and (b) attend to multiple issues. In addition, we explored the cognitive strategies raters engage when asked to consider multiple competencies simultaneously.

HYPOTHESES: We hypothesized that indications of rating quality (e.g., interrater reliability) would decline as the number of dimensions raters are expected to consider increases.

METHOD: Experienced faculty examiners rated prerecorded clinical performances within a 2 (number of dimensions) × 2 (presence of distracting task) × 3 (number of videos) factorial design. Half of the participants were asked to rate 7 dimensions of performance (7D), and half were asked to rate only 2 (2D). The second factor involved the requirement (or lack thereof) to rate the performance of actors participating in the simulation. We calculated the interrater reliability of the scores assigned and counted the number of relevant behaviors participants identified as informing their ratings. We also analyzed data from semistructured posttask interviews to explore the rater strategies associated with rating under conditions designed to broaden raters' focus.

RESULTS: Generalizability analyses revealed that the 2D group achieved higher interrater reliability relative to the 7D group (G = .56 and .42, respectively, when the average of 10 raters is calculated). The requirement to complete an additional rating task did not have an effect. Using the 2 dimensions common to both groups, an analysis of variance revealed that participants who were asked to rate only 2 dimensions identified more behaviors of relevance to the focal dimensions than those asked to rate 7 dimensions: procedural skill = 36.2%, 95% confidence interval (CI) [32.5, 40.0] versus 23.5%, 95% CI [20.8, 26.3], respectively; history gathering = 38.6%, 95% CI [33.5, 42.9] versus 24.0%, 95% CI [21.1, 26.9], respectively; ps < .05. During posttask interviews, raters identified many sources of cognitive load and idiosyncratic cognitive strategies used to reduce cognitive load during the rating task.
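The G coefficients above are reported for the average of 10 raters. As a rough illustration of what that implies for a single rater, the sketch below inverts the Spearman-Brown relation. It assumes a simple single-facet (performance × rater) design in which only the rater error variance shrinks with the number of raters; the study's actual generalizability model may include other facets, so the numbers it produces are illustrative rather than results from the paper. The function names are hypothetical.

```python
def single_rater_g(g_avg: float, n_raters: int) -> float:
    """Back out the single-rater coefficient from one reported for the
    mean of n_raters raters (inverse Spearman-Brown).

    Assumption: G_n = n * G_1 / (1 + (n - 1) * G_1), i.e. a one-facet design.
    """
    return g_avg / (n_raters - (n_raters - 1) * g_avg)


def projected_g(g_single: float, n_raters: int) -> float:
    """Project the coefficient for the mean of n_raters raters
    from a single-rater coefficient (Spearman-Brown)."""
    return n_raters * g_single / (1 + (n_raters - 1) * g_single)


if __name__ == "__main__":
    # Values reported in the abstract for the average of 10 raters.
    for label, g10 in (("2D", 0.56), ("7D", 0.42)):
        g1 = single_rater_g(g10, 10)
        print(f"{label}: G(10 raters) = {g10:.2f}, implied G(1 rater) = {g1:.3f}")
```

Under this assumption, the gap between the 2D and 7D groups is present at the single-rater level as well, not only after averaging across raters.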

CONCLUSIONS: As intrinsic rating demands increase, indicators of rating quality decline. The strategies that raters engage when asked to rate many dimensions simultaneously are varied and appear to yield idiosyncratic efforts to reduce cognitive effort, which may affect the degree to which raters make judgments based on comparable information.
