ORIGINAL ARTICLE  
Year : 2014  |  Volume : 4  |  Issue : 1  |  Page : 12-17
A study of examiner variability in assessment of preclinical class II amalgam preparation


1 Department of Conservative Dentistry, CSI College of Dental Sciences, Madurai, Tamil Nadu, India
2 Department of Public Health, University of Leeds, United Kingdom


Date of Web Publication: 17-Oct-2014
 

   Abstract 

Introduction: Preclinical restorative exercises on typodonts are one of the ways in which students can develop the required manual dexterity. The assessment of these exercises has been associated with inter- and intra-examiner variability. This study was undertaken to examine whether the introduction of objective scoring criteria can reduce examiner variability, and to evaluate inter- and intra-examiner variability in assessing class II amalgam tooth preparation using two different methods, i.e., glance and grade scoring and objective checklist criteria scoring.
Materials and Methods: The study evaluated 41 undergraduate students, each performing two class II disto-occlusal amalgam preparations on plastic typodont left lower and left upper first molars. The preparations were evaluated by four blinded independent examiners using two methods, viz., glance and grade scoring and objective checklist scoring. Inter- and intra-examiner variability were tested using the Friedman test and the Wilcoxon signed rank test, respectively.
Results: Intra-examiner variability was significantly reduced with objective checklist criteria scoring. Inter-examiner variability was present with both the glance and grade method and the objective checklist scoring method.
Conclusion: This study recommends that the preclinical operative work of students be assessed by objective checklist criteria scoring, and that this method be introduced after sufficient training and calibration sessions to reduce examiner variability.

Keywords: Assessment, competency, dental education, pre-clinical teaching

How to cite this article:
Sherwood IA, Douglas GV. A study of examiner variability in assessment of preclinical class II amalgam preparation. J Educ Ethics Dent 2014;4:12-7

How to cite this URL:
Sherwood IA, Douglas GV. A study of examiner variability in assessment of preclinical class II amalgam preparation. J Educ Ethics Dent [serial online] 2014 [cited 2024 Mar 28];4:12-7. Available from: https://www.jeed.in/text.asp?2014/4/1/12/143150



   Introduction


According to Cowpe et al., one of the competencies a dental graduate needs to achieve in restorative dentistry is "restoring the diseased and damaged teeth, and the management of dental caries by direct and indirect means using materials and techniques that maintain pulp vitality and restore the teeth to form, function and appearance acceptable to the patient in ways which prevent further disease and damage, and help to promote the health of adjacent soft tissues." [1] Acquiring this level of competency in restorative dentistry requires manual dexterity, and one of the important ways a student can achieve this dexterity is through preclinical typodont exercises. Preclinical typodont training needs to be designed so that it helps the student acquire the competency level necessary for performing clinical restorative dental work. At the end of the preclinical academic session, students are assessed on whether they have acquired sufficient dexterity, skills and knowledge to enter the clinics. This assessment of preclinical skills has long been associated with both inter- and intra-examiner variability, reported as early as the 1930s by Brown. [2] This examiner variability has been extensively documented and has been attributed to various factors such as the experience of examiners, the calibration of examiners and the assessment tools employed. [3],[4],[5],[6],[7],[8]

Assessment tools employed in preclinical restorative training vary widely, the two most common being the glance and grading method and the objective checklist criteria. Any assessment procedure employed should be transparent, with both staff and students informed of its purpose and the process adopted. [9],[10] The goal of any assessment procedure should be to promote the drive to learn among students and to be a springboard for them to adopt and develop effective restorative skills. [11] Traditional "signing-off" assessment, such as the glance and grading method with little feedback, does not encourage learning among students. Assessment methods that encourage deeper learning and feedback, such as self-assessment and reflective learning, discussion groups and peer assessment, are therefore encouraged as part of the learning course. [12] These methods, however, cannot be used in university examinations, where faculty assessment is necessary, so detailed study of faculty-based assessment becomes necessary. Assessment criteria employed by faculty or teachers, whether in university examinations or in student training, need to encourage learning in students and allow feedback for both teachers and students, in addition to improving examiner consistency.

Although no difference has been found between the glance and grading scoring method and the objective checklist scoring method in assessing operative procedures performed on plastic teeth, Scott et al. reported that introducing a criteria list into the scoring method does help to achieve a more objective result. [13],[14] Traditionally, the university preclinical operative dentistry examination in India involves faculty-based assessment of class II amalgam typodont tooth preparation. Though amalgam has been phased out in certain developed countries, it is still widely used in many developing countries. [15]

This study was undertaken to examine whether the introduction of objective scoring criteria can reduce inter- and intra-examiner variability, irrespective of calibration of the examiners, in the assessment of preclinical class II amalgam tooth preparations done by students.


   Materials and Methods


The study evaluated 41 undergraduate students in their second year of study, each performing two class II disto-occlusal amalgam preparations on plastic typodont left lower first molar and left upper first molar teeth (Ashoo and Sons, New Delhi). The students performed these procedures after a 1 h didactic theory lecture and a 1 h demonstration session on class II disto-occlusal preparation. They performed the prescribed tooth preparations as they would be required to in a preclinical operative dentistry university examination.

Phantom heads were used to simulate the clinical situation; the tooth preparations were performed by the students with micromotor handpieces (NSK Co., Japan) and straight fissure diamond abrasives (Mani Co., Japan). Both class II preparation designs were prepared using the same size of bur by second year BDS students who had undergone the same level of training, to reduce bias in the quality of the tooth preparations. The typodont teeth models were collected after the prescribed time of 1 h for each preparation, as in the university examination schedule, and allotted number codes.

The preparations were evaluated and allotted marks by four blinded independent examiners, without magnification, using two methods of scoring, viz., the glance and grade method [Figure 1] and the objective checklist scoring method [Figure 2]. The examiners used an explorer and mouth mirror under illumination, with the typodont models held in hand. One week after the first evaluation, the preparations were evaluated a second time by the same examiners using the same two methods of scoring, to assess intra-examiner variation. These examiners were not involved in designing the checklist and criteria scores. The examiners were not subjected to any specific calibration except for a briefing on the distribution of scores and the scoring criteria to be employed.
Figure 1: Glance and grading scores

Figure 2: Objective criteria checklist


The distribution of scores in the objective checklist criteria used in this study was based on earlier studies by Roma et al., Haj-Ali et al. and Park et al. [16],[19] The objective scoring method was designed and developed by the authors so that the marks awarded under each criterion would better convey to a student why a preparation had been accepted or rejected, and would allow better feedback to support their learning. The objective checklist criteria were divided into two sections, viz., external outline form and internal outline form of the preparations. No specific criteria were specified for the glance and grading method, except that scores in this system were totaled to a maximum of 60 in accordance with the scoring system of the Tamil Nadu Dr MGR Medical University, Chennai. The same distribution of marks was assigned to the objective checklist criteria to reduce variability due to the distribution of scores.

Four examiners with different levels of experience were selected to understand the influence of experience on the use of the two scoring methods. The examiners were university teachers and faculty members with 4-7 years and up to 12 years of experience in handling preclinical restorative dentistry students. No calibration or training sessions were held for either scoring method. To reduce the bias from evaluating the preparations with one scoring method before the other, the examiners were divided so that two examiners first evaluated the preparations by the glance and grading method, followed by the objective checklist criteria scoring method. The other two examiners first evaluated the preparations by the objective checklist criteria, followed by the glance and grading method. The interval allowed between assessments by the two scoring methods was 1 week. In the glance and grading method, the examiners accepted or rejected the preparations on their own judgment, without any preset criterion, except that scores were allotted on the same five-category scale as in objective checklist scoring. This was done to reduce bias due to the distribution of the scoring scale. In the objective scoring system, a preparation was accepted or rejected by averaging the scores allotted for each individual criterion, with the marks totaled to 60.

Inter- and intra-examiner variability were tested using the Friedman test and the Wilcoxon signed rank test, respectively, at a significance level of 5%, using SPSS statistical software.
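
As an illustration of the inter-examiner analysis, the following minimal Python sketch computes the Friedman chi-square statistic for marks given by four examiners to the same preparations. It is not the authors' SPSS procedure: the marks are hypothetical, the function and variable names are introduced here for illustration, and the statistic is computed without a tie correction, so values may differ slightly from SPSS output when an examiner gives tied marks within a row.

```python
# Friedman test statistic for k examiners scoring the same n preparations.
# Illustrative sketch only: hypothetical data, no tie correction.

def average_ranks(values):
    """Rank values ascending (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in range(pos, end + 1):
            ranks[order[idx]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(scores):
    """scores: one row per preparation, one mark per examiner in each row."""
    n = len(scores)        # number of preparations (blocks)
    k = len(scores[0])     # number of examiners (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_sq = sum(rs ** 2 for rs in rank_sums)
    return 12.0 * sum_sq / (n * k * (k + 1)) - 3 * n * (k + 1)


# Hypothetical marks out of 60: five preparations (rows), four examiners (columns).
marks = [
    [40, 45, 50, 55],
    [38, 42, 47, 52],
    [41, 44, 49, 58],
    [35, 43, 46, 51],
    [39, 40, 48, 53],
]

# Chi-square critical value for 3 degrees of freedom at the 5% level.
CRITICAL_5PCT_DF3 = 7.815

statistic = friedman_statistic(marks)
print("Friedman chi-square:", statistic)
print("Significant at 5%:", statistic > CRITICAL_5PCT_DF3)
```

With four examiners the statistic has 3 degrees of freedom, so a value above about 7.815 corresponds to P < 0.05.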


   Results


[Table 1] and [Table 2] show a statistically significant (P < 0.05) difference between the examiners with both methods of scoring, indicating that the scores allotted for the tooth preparations varied greatly between the examiners.
Table 1: Friedman test: inter-examiner variation objective checklist scoring

Table 2: Friedman test: inter-examiner variation glance and grading scoring


[Table 3] shows a statistically significant difference (P < 0.05) within the examiners when using the glance and grading scoring method to evaluate the two preparations, indicating that the scores allotted to the same tooth preparations by the same examiner varied greatly between the two occasions. [Table 3] also shows that intra-examiner variation (Z-value) was higher for the glance and grading method than for objective checklist criteria scoring. [Table 4] shows that there was no significant difference within the examiners when using the objective checklist criteria scoring method, except for one examiner evaluating the upper first molar disto-occlusal preparation (P = 0.046), indicating that the same examiners allotted marks to the same preparations in a more predictable manner on the two occasions.
Table 3: Wilcoxon signed rank test: intra-examiner variation glance and grading scoring
Table 4: Wilcoxon signed rank test: intra-examiner variation objective checklist criteria scoring
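
The intra-examiner comparison and the Z-values referred to in the Results can be sketched as follows. This minimal Python example applies the Wilcoxon signed rank test with a normal approximation to one examiner's marks for the same preparations on two occasions. The marks and all function and variable names are hypothetical, and the approximation omits tie and continuity corrections, so it is an illustration under stated assumptions rather than a reproduction of the SPSS analysis.

```python
# Wilcoxon signed rank test (normal approximation) for paired marks.
# Illustrative sketch only: hypothetical data, no tie or continuity correction.
import math


def mean_ranks(values):
    """Rank values ascending (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for idx in range(pos, end + 1):
            ranks[order[idx]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def wilcoxon_signed_rank(first, second):
    """Return (W+, W-, Z, two-sided P) for paired marks from two occasions."""
    # Pairs with no change carry no information and are dropped.
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired marks are identical")
    ranks = mean_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean_w) / sd_w
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, z, p_two_sided


# Hypothetical marks out of 60 from one examiner, one week apart.
first_marks = [40, 45, 38, 50, 42, 47]
second_marks = [41, 43, 41, 54, 47, 53]

w_plus, w_minus, z, p = wilcoxon_signed_rank(first_marks, second_marks)
print("W+ =", w_plus, "W- =", w_minus)
print("Z =", round(z, 3), "two-sided P =", round(p, 3))
```

A Z-value closer to zero, with P above 0.05, indicates that the examiner's marks on the two occasions did not differ systematically, which is the pattern the study reports for most examiners under checklist scoring.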




   Discussion


Even though there are other methods of assessment apart from the ones used in this study, such as peer assessment and self-assessment, which have been shown to greatly improve learning, faculty-based assessment is widely followed and needed in a university examination setting. [12] Therefore, faculty-based assessment of students' performance and the variation in these assessments need to be studied. The results of this study support the notion that significant intra- and inter-examiner variation occurs in evaluating students' work. [7],[8],[19],[20] This study shows that objective checklist scoring significantly reduced intra-examiner variation and greatly improved intra-examiner reliability, in that the same examiner was able to assign scores in a more predictable manner. This increase in reliability was achieved without any prior calibration of examiners. The inter-examiner variation reported in this study [Table 1] might be reduced further if examiners were given earlier experience of the scoring method and were calibrated in its use before assessing the students' work, but the literature on the effectiveness of calibration contains mixed results, ranging from effective to not effective. [18],[21],[22],[23],[24],[25],[26] Haj-Ali and Feil recommended that, to overcome the variation among and within dental educators, an appropriate evaluation system be developed that includes valid criteria, an appropriate rating scale and a training program that calibrates the raters. [18] They also suggested that calibration and training sessions be prepared with a valid gold standard; raters calibrated against such a standard have been shown to improve greatly in inter-examiner agreement, and the improvement was reasonably resistant to deterioration over a period of 10 weeks. [18] The study by Garland and Newell indicated that calibration did not improve intra- and inter-rater reliability, but showed that calibration sessions had benefits in allowing new faculty members to calibrate with more experienced faculty members and in making examiners more aware of their evaluating skills and more attuned to the students' perceptions of a lack of consistency among faculty. [27] The distribution of scores in the objective checklist criteria used in this study was based on earlier studies by Roma et al., Haj-Ali et al. and Park et al., as these scoring distributions more closely met the criteria of allowing feedback and reflecting the stage or stages of a preparation at which the students were performing poorly. [17],[19]

The results of this study differ from those of Sharaf et al. in showing a reduction in intra-examiner variation when the objective checklist criteria scoring system was used, while intra-examiner variation was present with the glance and grading method. [14] The intra-examiner variation observed with the glance and grading method [Table 3] in our study is contrary to the findings of Sharaf et al. This difference could be because only two class II preparation designs were examined in this study, unlike the eight different preparation designs in the study by Sharaf et al., which could have made their examiners better acclimatized to assessing students' preparations and so improved their reliability in scoring. In that study the preparations were also analyzed by the examiners who authored it, which could have led to some bias. [14] The reduction in intra-examiner variation [Table 4] in our study when the objective scoring method was used could be because the objective scoring had scores of only 0-5 [Figure 2], and not up to 10 as in the study by Sharaf et al., which could have confused the examiners when allotting scores. Also, the objective scoring system used here had no "very good" and "good" scores, which could have given the examiners greater clarity when allotting scores. Instead there was a score for "excellent", a score for "acceptable" and two not-acceptable scores for two different reasons, to help the students learn the difference between excellent and marginally acceptable work and why a particular preparation was not acceptable. [14] This design of the scoring criteria fulfilled one of the objectives of an assessment tool, namely promoting the drive to learn among students and allowing reflective learning, unlike traditional glance and grading scoring, while also having a limited number of values in the rating scale. [11],[18] Another reason for the reduction in intra-examiner variation on introduction of objective checklist scoring might be that the examiners became more aware of the required tooth preparation standards and had fixed standard criteria against which to judge the students' performance, leading to an eventual increase in reliability. In the study by Sharaf et al., the objective evaluation criteria were developed by consensus and implemented without any prior training or calibration sessions, similar to this study. [14] This could have accounted for the higher levels of inter-examiner variability in the objective criteria scoring method which they reported. One examiner in this study did not show intra-examiner reliability in evaluating the upper molar disto-occlusal preparation even when using the objective checklist criteria scoring [Table 4]. The reason may be that he was the newest faculty member of the institution and had limited experience in evaluating students' work, indicating a greater need for calibration and training sessions for the newest members of the teaching faculty, as recommended by Garland and Newell. [27]

The results of this study show that the glance and grading method was less reliable than objective checklist criteria scoring, with both inter- and intra-examiner consistency being poor. This is in accordance with the study by Houpt and Kress, which suggested that written criteria are necessary to improve examiner reliability, because a dentist or examiner evaluating a clinical procedure brings in a bias based on his own clinical experience, leading to reduced reliability. [23] In an article by Plasschaert et al., one of the recommendations on student assessment is the introduction of well-defined criteria for the acquisition of competence in a skill. [28] Well-defined criteria such as those used in this study might therefore be devised for students' evaluation, allowing students to reflect on their performance while also reducing variability in examiners' assessments. [29] Future scoring criteria could be formulated with active student participation, as a study by Redwood et al. showed that active student participation in the assessment procedure significantly improved students' clinical assessment ability, and that this ability was retained over the course. [30]

This study shows that on introduction of objective checklist criteria scoring there was significant improvement in intra-examiner reliability for experienced faculty members, but inter-examiner variability persisted for all examiners. Therefore, calibration sessions to improve inter-examiner reliability for experienced faculty members, and to improve both inter- and intra-examiner reliability for newer faculty members, might be necessary along with the introduction of objective checklist criteria in evaluating students' performance. One limitation of the objective scoring method is that a scoring scale has to be developed and customized for each procedure.


   Conclusion


This study calls for a change in the evaluation of students' preclinical operative work from the traditional glance and grading method to the objective checklist criteria scoring method, which reduces examiner variability. This method of scoring should be introduced after sufficient training and calibration sessions to improve examiner reliability, especially for less experienced faculty members.


   Acknowledgement


I want to thank Dr. Prabhakaran, PhD., of Department of Agricultural Economics, Tamil Nadu Agriculture University, Madurai for his valuable support in statistical work.



 
   References

1. Cowpe J, Plasschaert A, Harzer W, Vinkka-Puhakka H, Walmsley AD. Profile and competences for the graduating European dentist - update 2009. Eur J Dent Educ 2010;14:193-202.

2. Brown RK. Research in the use of a rating scale as a means of evaluating the personalities of senior dental students. J Dent Res 1930;10:271-9.

3. Mackenzie RS. Defining clinical competence in terms of quality, quantity, and need for performance criteria. J Dent Educ 1973;37:37-44.

4. Myers B. Beliefs of dental faculty and students about effective teaching behaviors. J Dent Educ 1977;41:68-76.

5. Lilley JD, ten Bruggen Cate HJ, Holloway PJ, Holt JK, Start KB. Reliability of practical tests in operative dentistry. Br Dent J 1968;125:194-7.

6. Fuller JL. The effects of training and criterion models on interjudge reliability. J Dent Educ 1972;36:19-22.

7. Salvendy G, Hinton WM, Ferguson GW, Cunningham PR. Pilot study on criteria in cavity preparation. J Dent Educ 1973;37:27-31.

8. Jenkins SM, Drummer PM, Gilmore AS, Edmunds DH, Hicks R, Ash P. Evaluating undergraduate preclinical operative skill: Use of glance and grade marking system. J Dent 1998;26:679-84.

9. Manogue M, Brown G, Foster H. Clinical assessment of dental students: Values and practices of teachers in restorative dentistry. Med Educ 2001;35:36470.

10. Vann WF, Machen JB, Hounshell PB. Effects of criteria and checklists on reliability in preclinical evaluation. J Dent Educ 1983;47:6715.

11. Plasschaert AJ, Manogue M, Lindh C, McLoughlin J, Murtomaa H, Nattestad A, et al. Curriculum content, structure and ECTS for European dental schools. Part II: Methods of learning and teaching, assessment procedures and performance criteria. Eur J Dent Educ 2007;11:12536.

12. Satterthwaite JD, Grey NJ. Peer group assessment of pre-clinical operative skills in restorative dentistry and comparison with experienced assessors. Eur J Dent Educ 2008;12:99102.

13. Future use of materials for dental restoration. Report of the meeting convened at WHO HQ, Geneva, Switzerland. Geneva, Switzerland: WHO document production services; 2009. p. 278.

14. Sharaf AA, AbdelAziz AM, El Meligy OA. Intra and Inter examiner variability in evaluating preclinical pediatric dentistry operative procedures. J Dent Educ 2007;71:5404.

15. Scott BJ, Evans DJ, Drummond JR, Mossey PA, Stirrups DR. An investigation into the use of a structured clinical operative test for the assessment of a clinical skill. Eur J Dent Educ 2001;5:3137.

16. Nick DR, Clark M, Miler J, Ordelheide C, Goodacre C, Kim J. The ability of dental students and faculty to estimate the total occlusal convergence of prepared teeth. J Prosthet Dent 2009;101:712.

17. Jasinevicius TR, Landers M, Nelson S, Urbankova A. An evaluation of two dental simulation systems: Virtual reality versus contemporary non-computer-assisted. J Dent Educ 2004;68:115162.

18. Haj-Ali R, Feil P. Rater reliability: Short- and long-term effects of calibration training. J Dent Educ 2006;70:42833.

19. Park RD, Susarla SM, Howell TH, Karimbux NY. Differences in clinical grading associated with instructor status. Eur J Dent Educ 2009;13:318.

20. Gansky SA, Pritchard H, Kahl E, Mendoza D, Bird W, Miller AJ, et al. Reliability and Validity of a manual dexterity test to predict preclinical grades. J Dent Educ 2004;68:98594.

21. Natkin E, Guild RE. Evaluation of preclinical laboratory performance: A systematic study. J Dent Educ 1967;31:15261.

22. Abou-Rass M. A clinical evaluation instrument in endodontics. J Dent Educ 1973;37:2236.

23. Houpt MI, Kress G. Accuracy of measurement of clinical performance in dentistry. J Dent Educ 1973;37:3446.

24. Hinkelman KW, Long NK. Method for decreasing subjective evaluation in preclinical restorative dentistry. J Dent Educ 1973;37:138.

25. Edwards WS, Morse PK, Mitchell RJ. A practical evaluation system for preclinical restorative dentistry. J Dent Educ 1982;46:6936.

26. Scruggs RR, Daniel SJ, Larkin A, Stoltz RF. Effects of specific criteria and calibration on examiner reliability. J Dent Hyg 1989;63:1259.

27. Garland KV, Newell KJ. Dental hygiene faculty calibration in the evaluation of calculus detection. J Dent Educ 2009;73:3839.

28. Plasschaert AJ, Manogue M, Lindh C, McLoughlin J, Murtomaa H, Nattestad A, et al. Curriculum content, structure and ECTS for European dental schools. Part II: Methods of learning and teaching, assessment procedures and performance criteria. Eur J Dent Educ 2007;11:12536.

29. Moore U, Durham J. Invited commentary: Issues with assessing competence in undergraduate dental education. Eur J Dent Educ 2011;15:537.

30. Redwood C, Winning T, Lekkas D, Townsend G. Improving clinical assessment: Evaluating students' ability to identify and apply clinical criteria. Eur J Dent Educ 2010;14:13644.

Correspondence Address:
I Anand Sherwood
Anand Dental Clinic, No, 1, Meenakshi Towers, P.T. Rajan Road, Bibikulam, Madurai - 625 002, Tamil Nadu
India

Source of Support: None, Conflict of Interest: None


DOI: 10.4103/0974-7761.143150
