Presentation

PS6 - Inter-Rater Reliability of Two Error Taxonomies
Description

Background

Advocate Health’s Midwest Region began its high reliability journey in 2012, which included continuous improvement of the quality of the patient safety event causal analysis process. In 2020-2021, the patient safety directors began to compare their coding of individual failure modes (IFM) and system failure modes (SFM). These codes are applied to failure mode statements, which describe each variation from generally accepted performance standards, by the cause analyst who conducted the patient safety investigation. The codes were based on a framework developed by Press Ganey Associates, LLC (2021). One safety leader hypothesized that ambiguity and overlap between categories resulted in poor inter-rater reliability (IRR). Directors, and later all safety team members, began to participate in group coding exercises to improve IRR, but the effect of this work was not measured.

In 2022, Advocate Aurora Health (now Advocate Midwest Region, MWR) merged with Atrium Health (now Advocate Southeast Region, SER), forming Advocate Health. Through both a presentation at the HFES International Symposium on Human Factors and Ergonomics in Health Care (Anderson-Montoya & McCarthy, 2023) and integration meetings, MWR learned that SER had previously used the same taxonomy as MWR but had since developed a taxonomy based on theoretical models. The new SER taxonomy classified proximate (human) failures using the Model of Human Information Processing (Wickens et al., 2021) and system failures using the Systems Engineering Initiative for Patient Safety (SEIPS) model (Carayon et al., 2006). The organizational merger of MWR and SER provided an opportunity to compare the IRR of the two taxonomies.

Methods

Once the team had determined that a comparison would be valuable and had obtained IRB approval, it began creating a set of diverse safety event examples to test coding with each taxonomy. First, all failure mode (FM) statements submitted in January 2023 were downloaded from the MWR documentation system. The results were sorted first by IFM and then by SFM, and any statements not coded for both failure modes were excluded. The primary researcher then reviewed the remaining statements and deleted any that could not be coded or were out of scope due to incomplete information. A total of 95 items remained after this exercise. A group of 4 researchers then reviewed the FM statements to determine whether the level of analysis was sufficient and whether the causal factors were correctly identified, and corrected any grammatical errors.
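The first, mechanical part of this screening (dropping statements that lack either code, then sorting by IFM and SFM) can be sketched as follows. This is only an illustration: the `FailureModeStatement` record, its field names, and the sample statements are hypothetical, and the sketch assumes an empty or missing code means "not coded". The later manual review steps are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FailureModeStatement:
    """Hypothetical record for one exported failure mode statement."""
    text: str
    ifm: Optional[str]  # individual failure mode code, None if not coded
    sfm: Optional[str]  # system failure mode code, None if not coded


def screen_statements(statements: List[FailureModeStatement]) -> List[FailureModeStatement]:
    """Keep statements coded for both IFM and SFM, sorted by IFM then SFM."""
    kept = [s for s in statements if s.ifm and s.sfm]
    return sorted(kept, key=lambda s: (s.ifm, s.sfm))


if __name__ == "__main__":
    # Hypothetical sample rows, not data from the study.
    sample = [
        FailureModeStatement("Statement 1", "Compliance", "Process"),
        FailureModeStatement("Statement 2", None, "Culture"),
        FailureModeStatement("Statement 3", "Competency", "Structure"),
    ]
    for s in screen_statements(sample):
        print(s.ifm, "|", s.sfm, "|", s.text)
```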

These researchers then began coding FM statements in both the MWR and SER taxonomies, with a goal of identifying approximately 20 statements that represented each of the 5 major categories of IFM (MWR) or PFM (SER) and each of the 5 major categories in the two SFM models (see the list below). When 16 items had been identified and coded, the research team reviewed the coded categories to check whether every category was represented at least twice, then searched the remaining items for categories not yet represented twice. Five additional items were identified and developed in the same way as the previous items, for a total of 21 items in the final item set. Coding was verified by colleagues in the SER.

Taxonomy Major Categories:
A) MWR Individual Failure Modes (IFM)
1. Competency
2. Consciousness
3. Communication
4. Critical Thinking
5. Compliance
6. NA

B) MWR System Failure Modes (SFM)
1. Structure
2. Culture
3. Process
4. Policy and Protocol
5. Technology and Environment

C) SER Proximate Failure Modes (PFM)
1. Attention
2. Critical Thinking and Interpretation
3. Physical Skills
4. Conformity
5. External Factors

D) SER System Failure Modes (SFM)
1. Tasks
2. Tools and Technology
3. Organization
4. Internal Environment
5. External Environment
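
The coverage check described above (every major category represented at least twice in the item set) can be sketched as a simple count. The category lists are taken from the list above; the function name, the `minimum` parameter, and the sample codes are assumptions made for illustration, and the "NA" option is left out because it is not one of the 5 major categories.

```python
from collections import Counter

# Major categories as listed above ("NA" omitted).
TAXONOMIES = {
    "MWR IFM": ["Competency", "Consciousness", "Communication",
                "Critical Thinking", "Compliance"],
    "MWR SFM": ["Structure", "Culture", "Process",
                "Policy and Protocol", "Technology and Environment"],
    "SER PFM": ["Attention", "Critical Thinking and Interpretation",
                "Physical Skills", "Conformity", "External Factors"],
    "SER SFM": ["Tasks", "Tools and Technology", "Organization",
                "Internal Environment", "External Environment"],
}


def underrepresented(assigned_codes, categories, minimum=2):
    """Return the categories that appear fewer than `minimum` times."""
    counts = Counter(assigned_codes)
    return [c for c in categories if counts[c] < minimum]


if __name__ == "__main__":
    # Hypothetical SER SFM codes for a handful of items, not study data.
    codes = ["Tasks", "Tasks", "Organization", "Tools and Technology",
             "Tools and Technology", "Organization"]
    print(underrepresented(codes, TAXONOMIES["SER SFM"]))
```

Any category returned by the check would prompt a search of the remaining statements for further examples.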


All test items were entered in an online testing platform. Based on the time required to complete the coding exercise and the limitations of the platform, the item set was split into 2 separate exercises of 10 or 11 items. Each exercise was paired with coding responses for each of the taxonomies, resulting in a total of 4 coding exercises. Three of the original members of the research team then completed the exercises several weeks after the initial coding. A larger group of volunteers who served as a cause analysis advisory council for MWR was randomized into two groups, one of which began with the MWR taxonomy and the other with the SER taxonomy. The group received a 1-hour introduction to the SER taxonomy and a reference guide with definitions, similar to the MWR guide. A total of 11 safety leaders were invited to participate, and 8 of the 11 completed all coding exercises.
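
The exercise layout and the counterbalanced starting order can be sketched as below. The abstract does not say how items were allocated to the two exercises or how the randomization was performed, so the alternating split, the seeded shuffle, and the participant labels are all assumptions for illustration.

```python
import random


def split_items(item_ids, n_sets=2):
    """Deal items alternately into n_sets exercises of near-equal size."""
    return [item_ids[i::n_sets] for i in range(n_sets)]


def assign_starting_taxonomy(participants, seed=None):
    """Randomly split participants into an MWR-first and an SER-first group."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = (len(shuffled) + 1) // 2
    return {"MWR first": shuffled[:half], "SER first": shuffled[half:]}


if __name__ == "__main__":
    items = list(range(1, 22))                       # 21 items in the final set
    exercises = split_items(items)                   # one set of 11, one of 10
    invited = [f"P{i}" for i in range(1, 12)]        # hypothetical labels for 11 invitees
    groups = assign_starting_taxonomy(invited, seed=2023)
    print([len(e) for e in exercises])
    print({name: len(members) for name, members in groups.items()})
```

Pairing each of the 2 item sets with each of the 2 taxonomies gives the 4 coding exercises described above.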

Results

As of the submission date, statistical analysis for the three initial research team members had been completed; data collection from the larger team is complete, with statistical analysis to follow. For the three research team members, Fleiss’ kappa, a measure of the reliability of agreement between a fixed number of raters, was calculated for each section of testing: A) MWR IFM, B) MWR SFM, C) SER PFM, and D) SER SFM. Two of the three reviewers had extensive experience with the MWR coding system prior to this work, and the third reviewer participated in coding all items in both taxonomies. Kappas for MWR IFM and SFM were 0.459 and 0.489, respectively, in the range considered to indicate moderate reliability of agreement (0.41 to 0.60). Kappas for SER PFM and SFM were higher at 0.614 and 0.622, respectively, in the range considered to indicate good reliability of agreement (0.61 to 0.80). The final presentation will include full results for the 8-member review team, including analysis of differences based on tenure in the safety department and which taxonomy test was presented first.
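
For readers unfamiliar with the statistic, the following is a minimal sketch of the standard Fleiss' kappa calculation for a fixed number of raters. It is not the authors' analysis code; the function names and the sample ratings are hypothetical, and `agreement_band` encodes only the two ranges quoted above.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for `ratings`, a list of items, each a list with one
    category label per rater (same number of raters for every item)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    p_items = [
        (sum(n * n for n in c.values()) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    totals = Counter()
    for c in counts:
        totals.update(c)
    p_e = sum((n / (n_items * n_raters)) ** 2 for n in totals.values())

    if p_e >= 1.0:
        return float("nan")  # every rating is the same category; kappa undefined
    return (p_bar - p_e) / (1 - p_e)


def agreement_band(kappa):
    """Label kappa using only the two ranges quoted in the abstract."""
    if 0.41 <= kappa <= 0.60:
        return "moderate"
    if 0.61 <= kappa <= 0.80:
        return "good"
    return None


if __name__ == "__main__":
    # Hypothetical codes from 3 raters on 4 items, not study data.
    sample = [
        ["Compliance", "Compliance", "Compliance"],
        ["Competency", "Critical Thinking", "Competency"],
        ["Communication", "Communication", "Communication"],
        ["Consciousness", "Compliance", "Consciousness"],
    ]
    print(round(fleiss_kappa(sample), 3))
    print(agreement_band(0.459), agreement_band(0.614))
```

Because kappa corrects for chance agreement, the same raw percentage of agreement can yield different kappas depending on how evenly the categories are used.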

References

Anderson-Montoya, B., & McCarthy, V. (2023, March 27-29). A Redesigned Root Cause Analysis Approach Driven by Human Factors Systems Engineering [Conference Presentation]. 2023 Human Factors and Ergonomics Society Health Care Symposium, Orlando, FL, United States. https://www.hcs-2023.org/_files/ugd/3b7267_9783abe93079496c8418263ada2af96b.pdf

Carayon, P., Schoofs Hundt, A., Karsh, B. T., Gurses, A. P., Alvarado, C. J., Smith, M., & Flatley Brennan, P. (2006). Work system design for patient safety: the SEIPS model. Quality & Safety in Health Care, 15(Suppl 1), i50–i58. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2464868/

Press Ganey Associates, LLC (2021). The HPI SEC & SSER Patient Safety Measurement System for Healthcare [White Paper]. https://info.pressganey.com/e-books-research/the-hpi-sec-sser-patient-safety-measurement-system-for-healthcare

Wickens, C. D., Helton, W. S., Hollands, J. G., & Banbury, S. (2021). Engineering psychology and human performance. Routledge.

Event Type
Poster Presentation
Time
Tuesday, March 26, 4:45pm - 6:15pm CDT
Location
Salon C
Tracks
Digital Health
Simulation and Education
Hospital Environments
Medical and Drug Delivery Devices
Patient Safety Research and Initiatives