
Reproducibility of Malignant Pleural Mesothelioma Histopathologic Subtyping: Methodologic Issue.

To the Editor.--We were interested to read the paper by Brcic and colleagues (1) in the Archives of Pathology & Laboratory Medicine. The authors aimed to determine the interobserver and intraobserver reproducibility of the histologic differentiation among the main types of malignant pleural mesothelioma (MPM), and of further subtyping of epithelioid MPM. (1) One representative hematoxylin-eosin-stained slide was selected from the archive for each of 200 patients with MPM. The slides were reviewed independently by 3 pathologists and classified according to the current World Health Organization classification of pleural tumors. Interobserver and intraobserver agreement was assessed using the Cohen κ statistic. According to their results, the overall interobserver agreement for histologic subtyping of mesothelioma was fair (κ = 0.36) and increased to substantial (κ = 0.63) in the second round; interobserver agreement therefore improved for all types of MPM and for most epithelioid subtypes. (1)

It is important to recognize that no value of κ is universally accepted as a sign of good agreement. As a measure of agreement for a qualitative variable, κ has 2 weaknesses. First, it depends on the prevalence in each category. In other words, different κ values can arise from the same percentages of concordant and discordant cells. In the Table, in both situations (a) and (b), the concordant (agreement) and discordant (disagreement) cells account for 90% and 10% of ratings, respectively; however, the κ values differ (0.44, moderate, and 0.80, very good). Second, κ depends on the number of categories. (2-6) In such a situation, especially with more than 2 observers, our suggestion is to apply weighted or Fleiss κ, because these estimates provide unbiased results. (2-8)
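The two κ values quoted for the Table can be checked directly from the cell counts. The following is a minimal sketch, assuming the standard definition of Cohen κ as (observed agreement − chance agreement) / (1 − chance agreement); the function name `cohen_kappa` and the variable names are illustrative and do not come from the letter.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square contingency table of counts.

    Rows index one rater's categories, columns the other rater's.
    """
    total = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of ratings on the main diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Marginal proportions of each rater.
    row_marg = [sum(table[i]) / total for i in range(k)]
    col_marg = [sum(row[j] for row in table) / total for j in range(k)]
    # Chance agreement: product of the two raters' marginals, summed over categories.
    expected = sum(r * c for r, c in zip(row_marg, col_marg))
    return (observed - expected) / (1 - expected)


# Cell counts from the Table; categories ordered (positive, negative).
situation_a = [[85, 5], [5, 5]]
situation_b = [[45, 5], [5, 45]]

kappa_a = cohen_kappa(situation_a)
kappa_b = cohen_kappa(situation_b)
print(f"Situation (a): kappa = {kappa_a:.2f}")
print(f"Situation (b): kappa = {kappa_b:.2f}")
```

In both situations the observed agreement is 0.90; only the chance agreement differs (0.82 versus 0.50), which is what drives κ from about 0.44 to 0.80.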

Brcic et al (1) concluded that moderate to substantial agreement in histologic typing and subtyping of MPM can be achieved. Such a conclusion should be weighed against the statistical and methodologic issues mentioned above. (2-8)

In this letter, the limitations of the κ value for assessing reliability are mentioned.

doi: 10.5858/arpa.2018-0154-LE

Mehdi Naderi, MSc [1]; Siamak Sabour, MD, MSc, DSc, PhD [2]

[1] School of Paramedical, Kermanshah University of Medical Sciences, Kermanshah, Iran; [2] Department of Clinical Epidemiology, School of Health, Safety Promotion and Injury Prevention Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran.

(1.) Brcic L, Vlacic G, Quehenberger F, et al. Reproducibility of malignant pleural mesothelioma histopathologic subtyping. Arch Pathol Lab Med. 2018;142(6):747-752.

(2.) Szklo M, Nieto FJ. Epidemiology Beyond the Basics. 2nd ed. Sudbury, MA: Jones and Bartlett Publishers; 2007.

(3.) Sabour S. Reproducibility of endometrial cytology by the Osaki Study Group Method: methodological issues. Cytopathology. 2017;28(5):441-442.

(4.) Sabour S. Reliability of immunocytochemistry and fluorescence in situ hybridization on fine-needle aspiration cytology samples of breast cancers: methodological issues. Diagn Cytopathol. 2016;44(12):1128-1129.

(5.) Sabour S. Reliability of smartphone-based teleradiology for evaluating thoracolumbar spine fractures: statistical issue to avoid misinterpretation. Spine J. 2017;17(8):1200.

(6.) Sabour S. Spinal instability neoplastic scale: methodologic issues to avoid misinterpretation. AJR Am J Roentgenol. 2015;204(4):W493.

(7.) Sabour S. Reproducibility of semi-automatic coronary plaque quantification in coronary CT angiography with sub-mSv radiation dose; common mistakes. J Cardiovasc Comput Tomogr. 2016;10(5):e21-e22.

(8.) Sabour S, Ghassemi F. Accuracy and reproducibility of the ETDRS visual acuity chart: methodological issues. Graefes Arch Clin Exp Ophthalmol. 2016;254(10):2073-2074.

Accepted for publication May 7, 2018.

The authors have no relevant financial interest in the products or companies described in this article.

In Reply.--We are grateful for the opportunity to respond to the issues raised in a letter concerning the methodology of our article analyzing interobserver and intraobserver reproducibility in histologic subtyping of malignant pleural mesothelioma. (1) First of all, we have to correct an error in the Materials and Methods section of our article: instead of the Cohen κ (as was stated), we actually used the Fleiss κ as a measure of agreement. (2) This error occurred during our own editing of the article and was later overlooked. We deeply regret the error and appreciate this letter. Furthermore, we would also like to bring up some points about κ values.

Since its publication in 1960 as a chance-corrected measure of agreement, Cohen κ has prompted a vast number of articles about its properties. The dependence of κ on prevalence arises from the chance correction, which has to be greater if one category is more prevalent. (3) Alternatively, one can state that it is more difficult to achieve high agreement in a very homogeneous population, and therefore κ is lower, although the proportion of discordant ratings is the same. (4)
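The size of the chance correction can be written out explicitly. As an illustrative sketch only, using the standard definition of Cohen κ and notation introduced here rather than taken from the correspondence (observed agreement p_o, chance agreement p_e, and a common fraction p of positive ratings for both raters in a 2 × 2 table):

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad p_e = p^2 + (1 - p)^2
% With the same observed agreement p_o = 0.90:
%   p = 0.9:  p_e = 0.81 + 0.01 = 0.82,  \kappa = 0.08 / 0.18 \approx 0.44
%   p = 0.5:  p_e = 0.25 + 0.25 = 0.50,  \kappa = 0.40 / 0.50 = 0.80
```

Under these assumptions p_e is smallest at p = 0.5 and grows as one category becomes more prevalent, so the same 10% of discordant ratings yields a lower κ in the more homogeneous population.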

In the letter, two combined 2 × 2 tables of ratings with the same fraction of discordant ratings, but different prevalences of positive ratings, are given for which Cohen κ is different. However, for symmetrical tables, the Fleiss κ is the same as Cohen κ; thus, it is affected by the same alleged weakness.

If all 10 discordances in the above-mentioned combined table were of the type in which pathologist 1 rates positively and pathologist 2 rates negatively, and all concordant ratings were left unchanged, Cohen κ would be increased, whereas Fleiss κ would remain unchanged. The effect can be quite substantial, raising doubts about the adequacy of Cohen κ as a measure of rater agreement. Fleiss κ does not have that flaw; however, its shortcoming is that Fleiss κ is not necessarily zero under rater independence. (For example, the table with 2 pathologists rating negative/negative and positive/positive 4 times each, negative/positive 2 times, and positive/negative 8 times has independent ratings, thus Cohen κ is zero. Fleiss κ is −0.111, however.)
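These statements can be checked numerically. The sketch below assumes the usual two-rater forms of both statistics: Cohen κ takes chance agreement from the product of each rater's own marginals, whereas Fleiss κ takes it from category proportions pooled over both raters. The function names and the layout of the tables (rows for pathologist 1, columns for pathologist 2) are illustrative choices, not taken from the correspondence.

```python
def _agreement_and_marginals(table):
    """Observed agreement and each rater's marginal proportions."""
    total = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / total
    rater1 = [sum(table[i]) / total for i in range(k)]                  # row totals
    rater2 = [sum(row[j] for row in table) / total for j in range(k)]  # column totals
    return observed, rater1, rater2


def cohen_kappa(table):
    """Chance agreement from the product of the two raters' marginals."""
    observed, rater1, rater2 = _agreement_and_marginals(table)
    expected = sum(a * b for a, b in zip(rater1, rater2))
    return (observed - expected) / (1 - expected)


def fleiss_kappa_two_raters(table):
    """Chance agreement from category proportions pooled over both raters."""
    observed, rater1, rater2 = _agreement_and_marginals(table)
    pooled = [(a + b) / 2 for a, b in zip(rater1, rater2)]
    expected = sum(p * p for p in pooled)
    return (observed - expected) / (1 - expected)


# Situation (a) of the Table; categories ordered (positive, negative).
symmetric = [[85, 5], [5, 5]]
# Same concordant cells, all 10 discordances as "rater 1 positive, rater 2 negative".
one_sided = [[85, 10], [0, 5]]
# Independence example; categories ordered (negative, positive).
independent = [[4, 2], [8, 4]]

for name, table in [("symmetric", symmetric),
                    ("one-sided", one_sided),
                    ("independent", independent)]:
    print(f"{name}: Cohen = {cohen_kappa(table):.3f}, "
          f"Fleiss = {fleiss_kappa_two_raters(table):.3f}")
```

With these formulas the symmetric table gives the same value for both statistics (about 0.444), the one-sided table raises Cohen κ to about 0.459 while Fleiss κ stays at about 0.444, and the independence example gives a Cohen κ of zero (up to rounding) and a Fleiss κ of −1/9, that is, magnitude 0.111 with a negative sign.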

We conclude that κ-type measures of agreement depend on prevalence by their very nature, and that Fleiss κ is unaffected by an asymmetric distribution of discordances.

doi: 10.5858/arpa.2018-0196-LE

Luka Brcic, MD, PhD (1); Gregor Vlacic, MD (2); Franz Quehenberger, PhD (3); Izidor Kern, MD (2)

(1) Institute of Pathology and (3) Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria; (2) Cytology and Pathology Laboratory, University Clinic of Respiratory and Allergic Diseases, Golnik, Slovenia.

(1.) Brcic L, Vlacic G, Quehenberger F, Kern I. Reproducibility of malignant pleural mesothelioma histopathologic subtyping. Arch Pathol Lab Med. 2018;142(6):747-752.

(2.) Fleiss JL. Statistical Methods for Rates and Proportions. 2nd ed. New York: John Wiley & Sons; 1981.

(3.) Vach W. The dependence of Cohen's kappa on the prevalence does not matter. J Clin Epidemiol. 2005;58(7):655-661.

(4.) Kraemer HC, Periyakoil VS, Noda A. Kappa coefficients in medical research. Stat Med. 2002;21(14):2109-2129.

Accepted for publication June 21, 2018.

Editor's Note: An erratum for the error cited in this letter appeared in the August 2018 issue of the Archives.

Limitation of κ for Comparison of 2 Pathologists' Diagnoses With Different Prevalence in the 2 Categories (a)

                              Pathologist 1
Pathologist 2        Positive result   Negative result   Total, %   κ

Situation (a)                                                       0.44 (moderate)
  Positive result          85                 5              90
  Negative result           5                 5              10
  Total, %                 90                10             100
Situation (b)                                                       0.80 (very good)
  Positive result          45                 5              50
  Negative result           5                45              50
  Total, %                 50                50             100

(a) Authors' own hypothetical data to show limitation of κ value to assess reliability.

COPYRIGHT 2018 College of American Pathologists
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2018 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:Letters to the Editor
Author:Naderi, Mehdi; Sabour, Siamak
Publication:Archives of Pathology & Laboratory Medicine
Article Type:Letter to the editor
Date:Nov 1, 2018