Nelson: Systematic Reviews to Answer Health Care Questions, 2e






Copyright © 2024 Wolters Kluwer, Inc. Unauthorized reproduction of the content is prohibited.

Systematic Reviews to Answer Health Care Questions Second Edition


Heidi D. Nelson, MD, MPH, MACP, FRCP Professor Department of Health Systems Science Kaiser Permanente Bernard J. Tyson School of Medicine Pasadena, California


Acquisitions Editor: Joe Cho Development Editor: Cindy Yoo Editorial Coordinator: Janet Jayne Editorial Assistant: Kristen Kardoley Marketing Manager: Kirsten Watrud Production Project Manager: Justin Wright Manager, Graphic Arts & Design: Stephen Druding Manufacturing Coordinator: Bernard Tomboc Prepress Vendor: Lumina Datamatics

Second Edition

Copyright © 2025 Wolters Kluwer.

Copyright © 2014 by LIPPINCOTT WILLIAMS & WILKINS, a WOLTERS KLUWER business. All rights reserved. This book is protected by copyright. No part of this book may be reproduced or transmitted in any form or by any means, including as photocopies or scanned-in or other electronic copies, or utilized by any information storage and retrieval system without written permission from the copyright owner, except for brief quotations embodied in critical articles and reviews. Materials appearing in this book prepared by individuals as part of their official duties as U.S. government employees are not covered by the above-mentioned copyright. To request permission, please contact Wolters Kluwer at Two Commerce Square, 2001 Market Street, Philadelphia, PA 19103, via email at, or via our website at (products and services).

9 8 7 6 5 4 3 2 1

Printed in the United States of America

Library of Congress Cataloging-in-Publication Data

ISBN-13: 978-1-9752-1109-7

Cataloging in Publication data available on request from publisher.

This work is provided “as is,” and the publisher disclaims any and all warranties, express or implied, including any warranties as to accuracy, comprehensiveness, or currency of the content of this work.

This work is no substitute for individual patient assessment based upon healthcare professionals’ examination of each patient and consideration of, among other things, age, weight, gender, current or prior medical conditions, medication history, laboratory data and other factors unique to the patient. The publisher does not provide medical advice or guidance and this work is merely a reference tool. Healthcare professionals, and not the publisher, are solely responsible for the use of this work including all medical judgments and for any resulting diagnosis and treatments. Given continuous, rapid advances in medical science and health information, independent professional verification of medical diagnoses, indications, appropriate pharmaceutical selections and dosages, and treatment options should be made and healthcare professionals should consult a variety of sources. When prescribing medication, healthcare professionals are advised to consult the product information sheet (the manufacturer’s package insert) accompanying each drug to verify, among other things, conditions of use, warnings and side effects and identify any changes in dosage schedule or contraindications, particularly if the medication to be administered is new, infrequently used or has a narrow therapeutic range. To the maximum extent permitted under applicable law, no responsibility is assumed by the publisher for any injury and/or damage to persons or property, as a matter of products liability, negligence law or otherwise, or from any reference to or use by any person of this work.

To Don, Norris, and Amelia Comer and Don and Marian Nelson


Contributing Authors

Amy G. Cantor, MD, MPH, FAAFP
Associate Professor
Departments of Medical Informatics and Clinical Epidemiology, Family Medicine, and Obstetrics and Gynecology
Core Investigator, Pacific Northwest Evidence Based Practice Center
Oregon Health and Science University
Portland, Oregon

Rongwei Fu, PhD
Professor of Biostatistics
School of Public Health
Oregon Health and Science University
Portland, Oregon

Rebecca M. Jungbauer, DrPH, MPH, MA
Researcher
Department of Medical Informatics and Clinical Epidemiology
Pacific Northwest Evidence-based Practice Center
Portland, Oregon

Robin Paynter, MLIS
Information Specialist
Fertility Regulation Review Group
Cochrane Collaboration
Portland, Oregon




Preface

Systematic reviews use scientific methods to identify, select, assess, and summarize the findings of studies to answer health care questions. They provide the evidence for evidence-based medicine and are essential in determining health care guidelines and policies. As such, a systematic review can have a huge impact on how health care is practiced and funded. To meet this challenge, systematic reviews must adhere to methodological standards. Reviews that fall short of these standards may include only some or the wrong kinds of studies or provide incorrect conclusions. The selection of studies could be biased or the statistical analysis inappropriate. The studies included in a systematic review could be so flawed that their results are unreliable. A systematic review that simply collects and catalogs studies will miss these possibilities, whereas one that accurately evaluates and synthesizes the evidence will reveal them.

This book is a guide to conducting comprehensive systematic reviews to answer health care questions based on currently accepted methods and standards in the field. It is most relevant to health care practices and populations in the United States but can be applied more broadly. Although intended primarily for researchers, its concise format and practical approach make it suitable for multiple types of users. It emphasizes main concepts, incorporates examples and case studies, and provides references for additional sources. Most examples are based on recent real-world projects conducted by the authors.

The second edition is an updated resource that describes essential components in designing and conducting a systematic review. These include defining its purpose, topic, and scope; developing research questions; building the team and managing the project; identifying and selecting studies; extracting relevant data; assessing studies for quality and applicability; synthesizing the evidence using qualitative and quantitative analysis; assessing the strength of evidence; and preparing and disseminating the report. New chapters include how to assess the quality of diagnostic accuracy studies, qualitative studies, and systematic reviews; a guide to electronic tools for systematic reviews; and answers to case studies. Each component provides the necessary underpinnings for a comprehensive systematic review that accurately reflects a body of evidence that could ultimately lead to improvements in health care.

Heidi D. Nelson, MD, MPH, MACP, FRCP




Acknowledgments

At a time when facts and evidence are often maligned, ignored, or misrepresented, the rigorous pursuit of truth continues to be a principle and driving force in science and medicine. While the COVID-19 pandemic raged across the world, scientists rapidly mobilized efforts to understand the virus, its epidemiology and health effects, and how to prevent and treat infections and their complications. Among them emerged several international collaborations creating living systematic reviews that required continual updating and ongoing surveillance of emerging research evidence. This commitment to finding truth in the midst of confusion is a hallmark of systematic review science.

This book draws from the collective knowledge of systematic review scientists internationally and first-hand experiences of the contributing authors of the first and second editions. We have had tremendous opportunities to contribute to the emerging field of systematic review and actively participate in the historic shift to evidence-based health care. I acknowledge all the truth finders in the field, particularly those who have journeyed with me and contributed to this book.



Contents

Preface
Acknowledgments
1 Systematic Reviews
2 Defining the Topic and Scope and Developing Research Questions, Analytic Frameworks, and Protocols
3 Building the Systematic Review Team, Engaging Stakeholders, and Managing the Project
4 Determining Inclusion and Exclusion Criteria for Studies
5 Conducting Searches for Relevant Studies
6 Selecting Studies for Inclusion
7 Extracting Data from Studies and Constructing Evidence Tables
8 Assessing Quality and Applicability of Controlled Clinical Trials, Cohort Studies, and Case-Control Studies
9 Assessing Quality and Applicability of Diagnostic Accuracy Studies, Qualitative Studies, and Systematic Reviews
10 Qualitative Analysis
11 Quantitative Analysis
12 Assessing and Rating the Strength of the Body of Evidence
13 Preparing and Disseminating the Report
14 Guide to Electronic Tools for Systematic Reviews
15 Answers to Case Studies
Index




Selecting Studies for Inclusion

Amy G. Cantor

■ ■ INTRODUCTION

As described in the previous chapter, a sufficiently comprehensive and objective literature search ensures the identification of as many potentially relevant studies as possible. However, only a fraction of the studies identified by the literature search will ultimately be included in a systematic review. The process for selecting studies is critical to the systematic review, as it shapes the overall body of evidence upon which all conclusions will be drawn. Decisions about selecting studies are based on whether they meet prespecified inclusion and exclusion criteria that represent the specific populations, interventions, comparators, outcomes, timing, settings, and study designs of interest (PICOTS). However, these decisions involve reviewer judgment that is inherently subject to variation and random error depending on interpretation, experience, and expertise. Care must be taken throughout the study selection process to minimize bias. Several standards provide guidance for the screening and selection of studies (Table 6.1).¹ This chapter describes accepted procedures for study selection, how to document the selection process, and important issues to consider when applying inclusion and exclusion criteria.
TABLE 6.1 STANDARDS FOR SCREENING AND SELECTING STUDIES

SCREEN AND SELECT STUDIES
• Include or exclude studies based on the protocol’s prespecified criteria
• Use observational studies in addition to randomized clinical trials to evaluate harms of interventions
• Use two or more members of the review team, working independently, to screen and select studies
• Train screeners using written documentation; test and retest screeners to improve accuracy and consistency
• Use one of two strategies to select studies: read all full-text articles identified in the search or screen titles and abstracts of all articles and then read the full-text of articles identified in initial screening
• Taking account of the risk of bias, consider using observational studies to address gaps in the evidence from randomized clinical trials on the benefits of interventions

DOCUMENT THE SEARCH
• Provide a line-by-line description of the search strategy, including the date of search for each database, web browser, etc.
• Document the disposition of each report identified including reasons for their exclusion if appropriate

Source: Institute of Medicine. Finding What Works in Health Care: Standards for Systematic Reviews. Washington, DC: The National Academies Press; 2011. Reprinted with permission from the National Academies Press, Copyright 2011, National Academy of Sciences.
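The documentation standard in Table 6.1 (record the disposition of each report, with reasons for exclusion) can be illustrated with a small record-keeping sketch. This is only one possible way to organize such a log, not a method prescribed by the standards; the class name `Disposition`, the function `summarize`, the report identifiers, and the exclusion reasons are all hypothetical.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Disposition:
    """One report identified by the search and what happened to it."""
    report_id: str                          # hypothetical identifier
    included: bool
    exclusion_reason: Optional[str] = None  # required whenever included is False

    def __post_init__(self) -> None:
        # Enforce the rule that every exclusion carries a documented reason.
        if not self.included and not self.exclusion_reason:
            raise ValueError(f"{self.report_id}: excluded reports need a reason")


def summarize(log: list[Disposition]) -> dict:
    """Tally included reports and exclusions by reason."""
    reasons = Counter(d.exclusion_reason for d in log if not d.included)
    return {
        "identified": len(log),
        "included": sum(d.included for d in log),
        "excluded_by_reason": dict(reasons),
    }


# Illustrative entries only; these are not data from any real review.
log = [
    Disposition("report-001", included=True),
    Disposition("report-002", included=False, exclusion_reason="wrong population"),
    Disposition("report-003", included=False, exclusion_reason="wrong study design"),
    Disposition("report-004", included=False, exclusion_reason="wrong population"),
]
summary = summarize(log)
```

Requiring a reason at the moment an exclusion is recorded means the counts needed for the methods section can be produced directly from the log rather than reconstructed afterward.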




Chapter 6 • Selecting Studies for Inclusion


Chapter 4 describes considerations for developing inclusion and exclusion criteria for studies, such as how criteria can narrow or broaden the scope of the review and how to use experts to refine criteria to optimize clinical relevance. Although the eligibility criteria are designed to define the specific boundaries of the research questions in terms of the PICOTS elements, they usually cannot encompass all possible sources of variation. When systematic reviewers encounter unanticipated sources of ambiguity in the eligibility criteria, the potential for subjectivity and bias in the study selection process increases.

Both inter- and intra-reviewer variation can occur. Systematic reviewers may differ in their interpretations of eligibility criteria, leading to variation in judgments about the relevancy of studies (inter-reviewer variation). The κ-statistic is a measure of the level of agreement between reviewers beyond that expected by chance. Although inter-reviewer variation has been widely acknowledged, little research has been done to quantify its effect.²⁻⁵ A prospective study evaluated variation between two independent systematic review groups using the same research questions and eligibility criteria.⁵ When the numbers of overall included studies were considered, the groups had only fair agreement (κ = 0.55). However, when the numbers of important included studies were compared, agreement was excellent (κ = 0.88). Reasons for variation included differences in interpretations of the relevance of the oldest papers, multiple publications using the same or similar data, inclusion of publications of case series studies, and availability of data in publications. Additional inconsistencies reported in other evaluations include use of non-English language publications and nonrandomized studies, and different definitions of outcomes.²⁻⁴

Variability is also affected by application of the best evidence approach during study selection. A core principle of systematic reviews is that they are based on the highest levels of evidence available, and reviewers select and prioritize the highest quality and most applicable studies to address their research questions.⁶ This approach generally excludes or minimizes inclusion of lower-level studies if enough higher-level studies are available. However, limitations in the evidence are often first discovered during the study selection process. Variability is introduced when reviewers make different decisions about when and how to lower the evidence threshold, such as including studies that do not meet the prespecified criteria. Eligibility criteria may need to be refined to reduce this type of variability (Box 6.1).

BOX 6.1 Modifying Eligibility Criteria during the Study Selection Process

For a systematic review of the effectiveness of patient navigation to increase cancer screening in populations adversely affected by health disparities, systematic reviewers worked with a technical expert panel to develop and refine the inclusion criteria (Table 6.2).⁷

TABLE 6.2 INCLUSION CRITERIA

Population: Patients from populations adversely affected by health care disparities in the United States (ie, racial and ethnic minorities, socioeconomically disadvantaged, underserved rural populations, sexual and gender minorities, others subject to discrimination)
Intervention: Patient navigation services to increase routine colorectal, breast, and cervical cancer screening
Comparator: Usual or alternate care for colorectal, breast, and cervical cancer screening

While reviewing and selecting studies for inclusion, the systematic reviewers identified recurring methodological questions regarding whether a study should be included. For example, whereas one study may describe the intervention as “patient navigation” that


includes scheduling, reminders, and referral services, another study may include an intervention with similar types of services but not use patient navigation terminology. Rather than exclude questionable studies, the reviewers considered how to refine their inclusion criteria without compromising their ability to address the research questions of the systematic review (Table 6.3). They resolved these issues through team discussions and added their changes to the inclusion criteria in the revised protocol and methods section of the report. Navigation services were defined to include outreach activities involving letters or calls, educational materials and sessions, assessment and reduction of barriers to screening, language translation, appointment scheduling and reminders, bowel prep assistance for colorectal cancer screening, mailed supplies and kits, transportation and appointment attendance as needed, and point-of-care prompts, among others.

TABLE 6.3 STUDY SELECTION ISSUES AND RESOLUTION

Population

Observation: Studies enrolled participants from multiple settings
Issue: Participants may not represent populations adversely affected by disparities as defined in the inclusion criteria
Implication: Reduces clinical relevancy of the systematic review
Resolution: Only include studies with the majority of participants from populations adversely affected by health disparities

Observation: Studies did not describe or differentiate characteristics of participants
Issue: Participants from general or mixed populations (eg, low income, rural) may differ from those from specific groups (racial and ethnic minorities) even when they all meet the definition of populations adversely affected by health disparities
Implication: Potential confounding of results
Resolution: Include all eligible studies and consider subgroup analysis based on specific participant characteristics as available

Intervention and Comparator

Observation: The intervention services were often not identified as patient navigation by the study
Issue: Patient navigation services are not consistently recognized as such
Implication: Possible exclusion of studies not explicitly providing patient navigation
Resolution: Include studies of interventions providing services broadly considered types of patient navigation regardless of whether the term is used

Observation: Navigation services varied across studies
Issue: Combining all types of services in the intervention group assumes that they result in similar outcomes
Implication: Cannot compare effectiveness and harms of specific components of patient navigation services
Resolution: Use a broad definition of patient navigation to include the wide range of services and consider subgroup analysis based on specific components or intensity

Population

The systematic review protocol specifies the populations of interest as clearly as possible. These generally include the condition or indication and demographic criteria (eg, age and sex). However, studies may enroll mixed or poorly defined populations, or populations that appear to meet the intent of the criteria, but not the prespecified specific criteria. For example, a trial may enroll participants diagnosed with a particular condition, but not use the diagnostic criteria specified in the inclusion criteria of the systematic review protocol. In this situation, the reviewers


must decide whether to include the trial and, if so, refine their eligibility criteria. One solution is adopting a best evidence approach where studies with less clearly defined populations are used only if there is insufficient evidence from studies with more precisely defined populations.

Intervention

Some studies combine participants with different treatments into a single intervention group. Combining treatments is acceptable for systematic reviews examining a class effect, but treatments must be similar enough that they can be considered a related group for major outcomes, such as bisphosphonates for treatment of osteoporotic fractures. However, studies with combined treatment groups may be less relevant when interventions are dissimilar or comparisons are intended. Whether these studies are included depends on the purpose of the systematic review. For example, a review focusing on the effects of individual drugs could include studies examining a class effect when there are gaps in evidence.

Comparator

Comparators are often described broadly or imprecisely in eligibility criteria. Difficulties occur when comparators include nonspecific treatments, such as usual care or no treatment. Systematic reviewers may select studies based on their interpretations of imprecise criteria, introducing bias when there is variation across studies. Usual care may differ by study setting or when the study was done, for example. Understanding the key elements of usual care that differ is essential to interpreting their effects on outcomes. For example, in a systematic review of hyperbaric oxygen to treat traumatic brain injury, furosemide and mannitol were given to all participants as standard therapy to reduce presumed elevated intracranial pressure in older studies.⁸,⁹ Newer studies used invasive intracranial pressure monitoring to guide treatment more selectively and provided phenytoin to all participants for prophylaxis of seizures.
Outcome

When outcomes are precisely defined in the eligibility criteria and consistently reported in study publications, variability among systematic reviewers is less likely. However, important details are often missing or inadequate. For example, a systematic review of screening for developmental delay in young children would include an overwhelming number of studies with a broad range of outcomes requiring further definition, whereas a review of the effectiveness of screening for speech and language delay in preschool age children would focus on very specific outcomes.

Study Design

Systematic reviewers’ decisions about the inclusion of observational studies commonly introduce variability.²⁻⁵ Although there is broad agreement about the hierarchy of study designs,¹⁰⁻¹² thoughts vary about the role of observational studies¹,¹³,¹⁴ and classification of observational study designs.¹⁵ The specific study designs eligible for inclusion for each research question need to be clearly defined and refined as necessary, particularly if higher-level evidence is not available.

Other Selection Criteria

Additional selection criteria include how multiple publications stemming from an individual, unique study will be handled. Care should be taken to identify the primary and related publications to prevent double counting and possible inclusion of duplicate data in analyses. Systematic reviewers need to establish rules and processes for handling multiple publications that may vary by topic.

The selection of studies based on the availability of data for meta-analysis also contributes to variability. Decisions include whether data derived from graphs should be included, whether statistical imputation methods will be used when precise variance data are not reported, and whether missing data will be sought by contacting study authors. These decisions will affect study selection and must


be made during the course of the review, but often are not anticipated in advance. Rules created to address these issues must be documented in the report and the revised protocol.

Decisions to include non-English language publications and study abstracts or conference proceedings also contribute to variability across systematic reviews on the same topic.²⁻⁴ Although these may be important issues in determining the validity of a given review, they generally do not contribute to variability in eligibility decisions within a review. Decisions made about the selection of studies based on these characteristics should also be clearly documented.

■ ■ PROCESS FOR STUDY SELECTION

The goal of the study selection process is to ensure that the final body of evidence encompasses all available studies relevant to the research questions of the systematic review. The way to accomplish this is to consistently apply the prespecified eligibility criteria for including and excluding studies. However, even when the eligibility criteria are prespecified and detailed, a certain degree of subjectivity in their interpretation is unavoidable. For this reason, established standards for screening and selecting studies for a systematic review can be used to protect against reviewer bias and random error.¹,¹⁰⁻¹² These include using dual review, conducting the assessments in two stages, and training reviewers. As with all steps in the systematic review process, the approach used for study selection should be prespecified and explicitly described in the methods sections of both the protocol and the review report.

Dual Review

Dual review refers to the process of involving at least two systematic reviewers in decisions about the selection of individual studies. This approach varies from review to review, ranging from using the second reviewer to check the decisions of the first reviewer, to the second reviewer undertaking a completely independent assessment of all studies. Using two or more reviewers to independently screen and select studies is recommended,¹ and it is widely accepted that the reliability of the study selection process is maximized when multiple reviewers independently assess potentially relevant studies.
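When two reviewers screen the same studies independently, their agreement can be summarized with the κ-statistic mentioned earlier in this chapter. The sketch below computes Cohen's κ for include/exclude decisions; the function name `cohen_kappa` and the ten decisions are hypothetical and serve only to show the arithmetic, not to represent any real review.

```python
from __future__ import annotations


def cohen_kappa(reviewer_a: list[bool], reviewer_b: list[bool]) -> float:
    """Cohen's kappa for two reviewers' include (True) / exclude (False) decisions."""
    if len(reviewer_a) != len(reviewer_b) or not reviewer_a:
        raise ValueError("both reviewers must rate the same non-empty set of studies")
    n = len(reviewer_a)
    # Observed agreement: proportion of studies given the same decision.
    observed = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / n
    # Agreement expected by chance, from each reviewer's overall inclusion rate.
    p_a = sum(reviewer_a) / n
    p_b = sum(reviewer_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both reviewers gave the same single decision for every study.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical decisions on ten studies (True = include).
a = [True, True, True, False, False, False, False, False, True, False]
b = [True, True, False, False, False, False, False, True, True, False]
kappa = cohen_kappa(a, b)  # observed 0.80, expected 0.52, kappa about 0.58
```

In this example the reviewers agree on 8 of 10 studies, yet κ is only about 0.58 because roughly half of that agreement would be expected by chance alone.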

The benefits of dual, independent review were demonstrated in a study specifically designed to estimate the accuracy and reliability of multiple reviewers when assessing the eligibility of approximately 11,286 randomized controlled trials (RCTs) of methods to improve postal questionnaires.¹⁶ This study found that single reviewers missed an average of 8% of eligible reports (range, 0% to 24%) and adding a second, independent reviewer increased the number of RCTs correctly identified by an average of 9% (range, 0% to 32%).¹⁶

Although this study demonstrated an advantage of using two independent reviewers, it did so under relatively narrow circumstances and may have actually underestimated the broader benefits of this approach. This study was limited to the selection of RCTs that may provide more explicit descriptions of their PICOTS characteristics¹⁷ than observational studies. Use of two reviewers may be even more critical in reviews involving a wider variety of study designs that are traditionally more difficult to interpret. Also, among the four reviewers evaluated in this study,¹⁶ three had previous experience with systematic review methods and no information was provided about their expertise in the relevant clinical areas. As described in Chapter 3, in building the systematic review team, it is recommended that investigators and staff offer a broad range of expertise in the relevant clinical areas and in systematic review methods. It is reasonable to expect that wider variation in background and experience level may lead to wider variation in the probability of reliable decisions about study eligibility than was observed in the study. Additionally, previous research has shown that reviewers with relevant clinical content expertise may have strong viewpoints that can influence their interpretation and application of the eligibility criteria.¹⁰ Therefore, as may be the case for assessing more complex study designs, use of two independent reviewers for study selection may also be important when the review team includes reviewers with different types of expertise (Box 6.2).



BOX 6.2 Reducing Selection Variability with Dual Review When Eligibility Criteria Are Broad

Some systematic reviews are exploratory and their intention is to provide a comprehensive evaluation of evidence. To achieve this goal, the study eligibility criteria need to be framed in broad terms. However, broad criteria can increase selection variability, as illustrated in a review of the effectiveness of intensive primary care programs in reducing hospital admissions and/or death in high-risk patients with multiple chronic conditions and frequent hospital admissions.¹⁸ The purpose of the review was to assist a health delivery committee in developing a primary care intensivist model for pilot testing. The eligibility criteria are described in Table 6.4.


TABLE 6.4

Population: Patients identified as high risk for hospital admission and/or death, regardless of whether they have a specific disease, such as heart failure
Intervention: Multicomponent, interdisciplinary intensive primary care programs
Comparator: Usual care (without the utilization of an intensive primary care program)
Outcomes: All-cause mortality, hospitalization, emergency department use, hospital days
Timing: Studies that include a follow-up period of more than 30 days
Setting: Ambulatory setting
Study design: Systematic reviews, controlled clinical trials, observational studies

Several factors complicated the reviewers’ approach to selecting studies, including the lack of objective criteria for determining level of risk for hospital admission or death; complexity of studies of healthcare delivery systems; and lack of standard taxonomy for describing intensive primary care programs. Accordingly, the reviewers anticipated encountering a high level of variation in the literature in the types of programs and patient populations and in the adequacy of reporting relevant PICOTS elements. The team performed dual review, specifically pairing reviewers with and without clinical expertise in primary care. However, this approach resulted in more than typical levels of inter-reviewer variation in judgments about the relevancy of evidence, as outlined in Table 6.5. These discrepancies were discussed and resolved among the review team, ultimately reducing variability while improving the clinical relevance of the report.




No widely accepted disease-nonspecific, objective criteria for determining level of risk for hospital admis sion and/or death exist

Include studies that used indi cators of risk that had high face validity, such as history of persistently high hospital utilization and/or presence of multiple chronic illnesses

Reviewers with clinical exper tise in primary care were more likely to include studies that used less obvious indi cators of risk including need for intensive assistance (eg, impairments in multiple activ ities of daily living, poverty, and age ≥65 years)



Chapter 6 • Selecting Studies for Inclusion




Intervention and comparator

• Studies of healthcare delivery systems are complex and multifaceted
• Standard taxonomy for describing intensive primary care programs has only recently been published

Include studies that explicitly described multiple components of interest (eg, small panel sizes; enhanced support by interdisciplinary teams; integration with affiliated pharmacy, mental health, home health, and community-based and inpatient geriatric care service; and 24-hour access)

Reviewers with clinical expertise in primary care were more likely to regard studies of singular enhancements as an “intense” level of service based on personal clinical experiences

Regardless of the approach used to involve a second reviewer, the intent is to identify discrepancies in judgments about whether a study meets criteria for inclusion. In many cases, the cause of a discrepancy is simply a random error on the part of a single reviewer. In other cases, it is because of differences in the interpretation of the eligibility criteria and relevant study characteristics. For example, in a review of pharmacologic treatments for adults with fibromyalgia, a reviewer who is a practicing rheumatologist may feel strongly about excluding studies that do not use the most recent American College of Rheumatology diagnostic criteria. A reviewer who is not a content expert may not make this distinction and may include studies using other criteria. Discrepancies based on differences in the interpretation of the eligibility criteria can be useful in identifying important ambiguities that may require the development of supplemental criteria or lead to valuable refinements to the protocol. All discrepancies should be discussed and resolved by a consensus process, sometimes requiring additional reviewers or experts. As with the approach to involving a second reviewer, the approach to handling discrepancies should be explicitly described in both the study protocol and the review report.

Two-Stage Strategy to Selecting Studies

The study selection process is typically conducted in two stages. The first stage involves the assessment of only the titles and abstracts from the database searches. The purpose of the first stage is to efficiently eliminate all obviously ineligible publications. For example, if the systematic review evaluates the effects of using antiepileptic drugs for treatment of bipolar disorder, a reviewer could confidently exclude a study entitled, “Drug Therapy for Epilepsy” after reviewing only the title.
The second stage involves a detailed assessment of the subset of full-text publications that were determined to be likely or possibly eligible based on the review of their titles and abstracts. An alternative to using a two-stage process for study selection is to use a single-stage process in which the full-text papers are reviewed for all studies identified from the database searches. Systematic review standards endorse use of both the two- and single-stage approaches and acknowledge that the single-stage approach is more expensive and time-consuming. 1

When using the two-stage strategy for selecting studies, the systematic reviewers may consider using different approaches to involving the second reviewer within each stage. The importance of involving an independent second reviewer at the full-text stage has been emphasized in systematic review methods guidelines. 10–12 Decisions about using more than one reviewer and how independent they will be may depend on the volume of citations, number of reviewers, and reviewer experience level, among other factors. For example, AHRQ guidance on avoiding bias in study selection involves the use of a second reviewer to assess only the citations and abstracts that the first reviewer deemed ineligible. 12 However, AHRQ acknowledges a number of disadvantages to this approach, including the lack of empiric evidence to support its reliability.


Knowledge of the first reviewer’s decisions may bias the second reviewer. Also, this approach may result in reduced specificity of the retrieval process while increasing the number of full-text articles requiring review.

Masking/Blinding of Study Selection

Historically, there has been speculation that the selection of studies may be influenced by systematic reviewers’ knowledge of the study author’s identity or institution, journal, or year of publication. 19 However, masking or blinding reviewers to authors, institution, journals, and treatment groups did not have a clinically or statistically significant impact on the results of five meta-analyses in a study evaluating this effect. 20 The questionable benefit of masking reviewers to information about study sources generally does not warrant the substantial time and effort required to do so, 10–12 estimated at approximately 1.3 hours per paper. 20

Training and Pilot Testing

Systematic reviewer training and pilot testing of the eligibility criteria are important in maximizing the reliability of the study selection process. Systematic review standards refer to this process and advocate the use of written documentation and testing to improve accuracy and consistency. 1 Reviewer training and pilot testing of eligibility criteria can begin with a small sample of studies (eg, 10% to 20%). Calculation of κ-statistics to measure reviewer agreement may be particularly useful in identifying problems during pilot testing. This process can generate discussions about the sources of variation in interpreting the review’s eligibility criteria that can lead to their refinement. However, the κ-statistic is a less useful gauge of the overall study selection process because the main goal of dual review is to accurately identify all available relevant studies, not necessarily to achieve perfect agreement.
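As a minimal sketch of the agreement calculation mentioned above, the following Python function computes Cohen's κ for two reviewers' decisions on a pilot sample. Cohen's κ is one common agreement statistic; the text does not specify which variant a review team should use. The function name, the decision labels, and the ten pilot decisions are hypothetical and serve only as illustration.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who judged the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must judge the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)

# Hypothetical pilot sample: abstract-level decisions on ten citations.
reviewer_1 = ["include"] * 3 + ["exclude"] * 7
reviewer_2 = ["include"] * 2 + ["exclude"] * 7 + ["include"]

kappa = cohens_kappa(reviewer_1, reviewer_2)

# Positions where the reviewers disagree, to be discussed by the team.
disagreements = [i for i, (a, b) in enumerate(zip(reviewer_1, reviewer_2)) if a != b]

print(f"kappa = {kappa:.2f}; citations to discuss: {disagreements}")
```

In this made-up sample the reviewers agree on 8 of 10 citations, yet κ is only about 0.52 because much of that agreement would be expected by chance. The list of disagreements, rather than the statistic itself, is what feeds the discussion and refinement of the eligibility criteria.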
■ ■ DOCUMENTATION AND REPORTING

Transparency is essential to the study selection process and requires careful documentation of the decisions about the eligibility of each article identified by the literature search. Disposition of each study should be documented, along with reasons for exclusion if appropriate. 1 Proper documentation and reporting of the study selection process allow for replication and independent assessment of potential bias in study selection by systematic review users.

Although thorough documentation of the study selection process is important, it requires substantial effort. Documentation includes recording separate inclusion decisions from multiple independent reviewers for both the abstract and full-text levels of the selection process and identifying discrepancies from dual review for further discussion and resolution. For more complex, technical, and large reviews involving numerous reviewers, reference management software, such as EndNote, ProCite, or Reference Manager, can be used to document inclusion in addition to their usual roles in formatting the bibliography.

A standardized coding system for categorizing the inclusion decisions and reasons for exclusion promotes adherence to the eligibility criteria, increases efficiency in reviewer assessment, and assists with organization and documentation. For example, a typical coding system could use the word “include” for each included study and a prespecified scheme of numbers or letters to classify reasons for exclusions that mirror the PICOTS elements (ie, ineligible population, intervention, or comparator). This system can be created by the systematic reviewers and adapted for specific research questions (Table 6.6).

Alternatively, purpose-built software, such as DistillerSR, EPPI-Reviewer 4, RevMan, and others, is available to assist systematic reviewers with this and other review processes.
However, software for systematic reviews requires significant time in training and setup for each review (see Chapter 14). Its value to a review would be based on the strengths and limitations of specific software, experience of the team with systematic reviews as well as the software itself, and the size and complexity of the review. This approach may be particularly useful for review teams that are geographically separated, which is increasingly common since the COVID-19 pandemic and adoption of remote work environments.

ANXIETY IN WOMEN 21

Key Questions (KQ) include:

KQ1: In women and adolescent girls aged 13 years and older, without currently diagnosed anxiety disorder, what is the effectiveness of screening and evaluation for anxiety to improve symptoms, quality of life, and function?
KQ2: What is the accuracy of methods to screen for generalized anxiety disorder, including differences across population groups?
KQ3: What are the potential adverse effects of screening for generalized anxiety disorder?

Database or other source from which the study was retrieved: CCRCT (Cochrane), CDSR (Cochrane), MEDLINE (ML), Health & Psychosocial Instruments (HPI), (CT), expert, hand search, reference list

Population: Adult and adolescent females aged 13 years and older; A = adult, AD = adolescent (<18 years), O = older adult (>65 years), P = pregnant, PP = postpartum

Study design: R = RCT, O = observational study, SR = systematic review, D = diagnostic accuracy study

Abstract code:
1 = Retrieve; potentially meets review criteria; pull full text article to determine study eligibility
2 = Background; paper does not meet review criteria but may provide relevant background and context
E = Exclude; clearly does not meet review criteria
Reasons for exclusion are not recorded at the abstract level.

Full-text inclusion/exclusion codes: Reasons for paper inclusion or exclusion according to pre-specified criteria
1 = Include. Paper included in report. Inclusion criteria listed below
E = Exclude. Typically more than one reason applies. Reasons for exclusion coded as:
E2 = Wrong population
E3 = Wrong intervention
E4 = Wrong outcome
E5 = Wrong (or no) comparison
E6 = Inadequate reference standard
E7 = Wrong study design for the key question
E8 = Not a study (letter, editorial, non-systematic review article, no original data)
E9 = Not English language
E10 = Sample size too small (N < 100)
E11 = Outdated systematic review

Custom 1

Custom 2

Custom 3

Custom 4

Inclusion criteria:

Populations
• Adult and adolescent females aged 13 years and older without previously diagnosed generalized anxiety disorder
• Studies with 50% or more women enrolled
• Studies may include pregnant or postpartum women

Interventions and Comparators
• KQ1: Screening for anxiety versus no screening
• KQ2: Screening tool or method used in primary care applicable settings to assess the presence of anxiety versus a reference standard using DSM or ICD criteria for generalized anxiety disorder

Effectiveness outcomes
• KQ1: Improvement in anxiety symptoms, quality of life, and function
• KQ2: Diagnostic accuracy of the screening test (sensitivity, specificity, PPV, NPV, AUC)

Harms outcomes
• KQ3: False positive results, worse anxiety, any additional harms reported by the study

Study designs
• Effectiveness: RCTs of screening effectiveness, high-quality systematic reviews, observational studies with comparison groups (cohort studies including database studies and case–control studies)
• Test performance: Diagnostic accuracy studies
• Adverse events: RCTs of screening, high-quality systematic reviews, observational studies

Abbreviations: AUC, area under the receiver operating characteristic curve; CCRCT, Cochrane Central Register of Controlled Trials; CDSR, Cochrane Database of Systematic Reviews; DSM, Diagnostic and Statistical Manual of Mental Disorders; ICD, International Classification of Diseases; KQ, key question; NPV, negative predictive value; PPV, positive predictive value; RCT, randomized controlled trial.
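A standardized coding system such as the one in Table 6.6 lends itself to simple automated checks. The Python sketch below is one possible way, not a prescribed method, to validate full-text codes and tally reasons for exclusion. The code labels are transcribed from Table 6.6; the function names, the paper identifiers, and the sample decisions are hypothetical.

```python
from collections import Counter

# Full-text codes transcribed from Table 6.6.
FULL_TEXT_CODES = {
    "1": "Include",
    "E2": "Wrong population",
    "E3": "Wrong intervention",
    "E4": "Wrong outcome",
    "E5": "Wrong (or no) comparison",
    "E6": "Inadequate reference standard",
    "E7": "Wrong study design for the key question",
    "E8": "Not a study",
    "E9": "Not English language",
    "E10": "Sample size too small (N < 100)",
    "E11": "Outdated systematic review",
}

def check_codes(codes):
    """Return a list of problems with one paper's full-text codes (empty if valid)."""
    problems = [f"unknown code: {c}" for c in codes if c not in FULL_TEXT_CODES]
    if not codes:
        problems.append("no decision recorded")
    if "1" in codes and len(codes) > 1:
        problems.append("include code combined with exclusion codes")
    return problems

def tally_exclusions(decisions):
    """Count how often each exclusion reason was applied across all papers."""
    counts = Counter()
    for codes in decisions.values():
        for code in codes:
            if code != "1" and code in FULL_TEXT_CODES:
                counts[FULL_TEXT_CODES[code]] += 1
    return counts

# Hypothetical full-text decisions; a paper may carry more than one exclusion code.
decisions = {
    "paper_001": ["1"],
    "paper_002": ["E2"],
    "paper_003": ["E4", "E7"],
    "paper_004": ["E2", "E10"],
    "paper_005": ["1"],
}

invalid = {pid: check_codes(c) for pid, c in decisions.items() if check_codes(c)}
included = [pid for pid, codes in decisions.items() if codes == ["1"]]
tally = tally_exclusions(decisions)

print("included:", included)
print("invalid entries:", invalid)
for reason, count in tally.most_common():
    print(f"{reason}: {count}")
```

Because more than one reason can apply to a paper, the tally counts reasons rather than papers, so its total can exceed the number of excluded papers. A team reporting a single reason per paper would need an additional rule for choosing the primary reason.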
Decisions made during the study selection process that further refine the eligibility criteria, including creation of supplemental criteria, must be added to the review protocol as amendments and documented in the review report.

Reporting

The PRISMA group (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) has published recommendations for reporting the results of the study selection process 22 and provides a program for creating literature flow diagrams for reviews that conform to the PRISMA 2020 statement. 23 PRISMA guidance recommends using a flow diagram to report the numbers of studies identified, screened, assessed for eligibility, and included in the review, with reasons for exclusions at each stage (Figure 6.1). The diagrams differ depending on whether the systematic review is new or an update of an earlier review, and whether studies were identified by databases and registers only or also by other methods.

The identification of studies includes reporting the initial literature search results, stratified by individual database and other sources, before removing duplicate publications. Screening refers to the number of publications that remain after removing duplicates and are then screened based on their titles and abstracts only, the number of full-text publications retrieved, and the number of full-text publications screened. For studies excluded at the full-text level, PRISMA additionally recommends that reviewers indicate the numbers of exclusions based on their prespecified PICOTS-relevant exclusion categories (ie, wrong population, intervention, comparator, etc). This can be efficiently accomplished when reviewers use a standardized method for coding exclusions, and reference management or other database software to electronically manage their study selection process. Included publications are those meeting all inclusion criteria and contributing to the body of evidence in the systematic review. The number of studies that are included in the quantitative analysis is separately reported, if applicable.

For systematic reviews that include a variety of populations, study designs, or other components, it is also acceptable to stratify the reporting of the numbers of included studies by relevant categories. For example, in a systematic review about screening for osteoporosis, the selection of studies for research questions about the efficacy and harms of drugs to prevent osteoporotic fractures was distinct from the selection for a question about the diagnostic accuracy of dual-energy x-ray absorptiometry in detecting low bone mineral density. 24 Reporting the included studies by research question was a useful way to focus results for both the systematic reviewers and users.

In addition to reporting the numbers of studies excluded by the reasons for exclusion in the PRISMA diagram, systematic review guidelines recommend providing a list of the individual studies excluded from the review and their reasons for exclusions. 10–12,25 Rather than a complete accounting of all studies excluded at both the abstract and full-text levels, this list is most informative when it focuses on full-text studies reviewed that nearly fulfilled eligibility criteria and initially appeared relevant.

Identification of studies via databases and registers

Identification
Records identified from a: Databases (n = ); Registers (n = )
Records removed before screening: Duplicate records removed (n = ); Records marked as ineligible by automation tools (n = ); Records removed for other reasons (n = )

Screening
Records screened (n = )
Records excluded b (n = )
Reports sought for retrieval (n = )
Reports not retrieved (n = )
Reports assessed for eligibility (n = )
Reports excluded: Reason 1 (n = ); Reason 2 (n = ); Reason 3 (n = ); etc

Identification of studies via other methods

Records identified from: Websites (n = ); Organisations (n = ); Citation searching (n = ); etc
Reports sought for retrieval (n = )
Reports not retrieved (n = )
Reports assessed for eligibility (n = )
Reports excluded: Reason 1 (n = ); Reason 2 (n = ); Reason 3 (n = ); etc

Included
Studies included in review (n = )
Reports of included studies (n = )

■■ FIGURE 6.1 PRISMA 2020 flow diagram for new systematic reviews that included searches of databases, registers, and other sources. a Consider, if feasible to do so, reporting the number of records identified from each database or register searched (rather than the total number across all databases/registers). b If automation tools were used, indicate how many records were excluded by a human and how many were excluded by automation tools. Source: Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.
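When selection decisions are recorded electronically, the numbers for a flow diagram like Figure 6.1 can be derived from the screening log instead of being counted by hand. The Python sketch below illustrates the idea under simplifying assumptions: every record has been screened, each record corresponds to one report, and the distinction between studies and reports is ignored. The `Record` fields, the function name, and the six sample records are hypothetical and are not part of PRISMA or of any particular software.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    record_id: str
    source: str                                # database or other source
    duplicate: bool = False
    abstract_decision: Optional[str] = None    # "retrieve" or "exclude"
    full_text_retrieved: Optional[bool] = None
    full_text_decision: Optional[str] = None   # "include" or an exclusion reason

def prisma_counts(records):
    """Derive flow-diagram counts from a screening log (assumes all records were screened)."""
    unique = [r for r in records if not r.duplicate]
    sought = [r for r in unique if r.abstract_decision == "retrieve"]
    assessed = [r for r in sought if r.full_text_retrieved]
    reasons = Counter(
        r.full_text_decision for r in assessed if r.full_text_decision != "include"
    )
    return {
        "identified_by_source": dict(Counter(r.source for r in records)),
        "duplicates_removed": len(records) - len(unique),
        "records_screened": len(unique),
        "records_excluded": len(unique) - len(sought),
        "reports_sought": len(sought),
        "reports_not_retrieved": len(sought) - len(assessed),
        "reports_assessed": len(assessed),
        "reports_excluded_by_reason": dict(reasons),
        "reports_included": sum(r.full_text_decision == "include" for r in assessed),
    }

# Hypothetical screening log.
log = [
    Record("r1", "MEDLINE", abstract_decision="retrieve",
           full_text_retrieved=True, full_text_decision="include"),
    Record("r2", "MEDLINE", duplicate=True),
    Record("r3", "CDSR", abstract_decision="exclude"),
    Record("r4", "MEDLINE", abstract_decision="retrieve",
           full_text_retrieved=True, full_text_decision="wrong population"),
    Record("r5", "hand search", abstract_decision="retrieve",
           full_text_retrieved=False),
    Record("r6", "CDSR", abstract_decision="retrieve",
           full_text_retrieved=True, full_text_decision="wrong population"),
]

counts = prisma_counts(log)
for label, value in counts.items():
    print(f"{label}: {value}")
```

Deriving each count from the one before it keeps the diagram internally consistent: records screened minus records excluded equals reports sought, and so on down the flow.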
