The success of any systematic review, rapid review, targeted literature review (TLR), or evidence synthesis project depends heavily on the quality of the protocol.
A well-defined protocol ensures that:
Human reviewers apply consistent screening decisions.
AI interprets study eligibility correctly.
Reviewer conflicts are minimised.
Inclusion and exclusion decisions remain transparent and reproducible.
Screening can be scaled confidently across large datasets.
EasySLR provides a suite of tools specifically designed to help teams develop, validate, refine, and optimise protocols before deploying AI across an entire review.
This guide outlines a recommended workflow for:
Creating a protocol
Optimising the protocol using Protocol Text Review
Running a pilot screening exercise
Comparing AI decisions against reviewer decisions
Performing AI Conflict Analysis
Refining the protocol
Scaling AI across the full dataset
Why Protocol Validation Matters Before Running AI
Many teams make the mistake of immediately running AI on thousands of articles.
However, if the protocol contains ambiguities or inconsistencies, the AI will simply reproduce them at scale.
For this reason, EasySLR recommends treating protocol development as an iterative process.
The recommended workflow is:
→ Protocol Creation
→ Protocol Review
→ Pilot Screening
→ AI Validation
→ Conflict Analysis
→ Protocol Refinement
→ Full Scale Screening
This approach typically produces:
Phase 1: Creating the Protocol
Navigate to: Project Settings → Protocol
EasySLR provides a protocol template that can be customised for your project.
The protocol generally contains:
The protocol becomes the primary source of truth for both reviewers and AI.
Step 1: Define the Research Objective
The objective should clearly describe:
Population of interest
Intervention
Comparison
Outcomes
Scope of the review
Example: "To evaluate the efficacy and safety of Drug X compared with standard chemotherapy in adults with advanced non-small cell lung cancer."
A clearly defined objective helps reviewers remain aligned throughout the project.
Step 2: Define PICOS Criteria
PICOS serves as the foundation of screening decisions.
Population (P)
Describe:
Disease area
Patient demographics
Disease severity
Clinical characteristics
Example: Adults aged 18 years or older with advanced non-small cell lung cancer.
Avoid: Cancer patients.
The latter is too broad and open to interpretation.
Intervention (I)
Specify:
Drug names
Treatment classes
Procedures
Exposures
Example: Pembrolizumab monotherapy.
Avoid: Immunotherapy.
This may unintentionally include multiple interventions.
Comparator (C)
Define:
Placebo
Standard of care
Alternative intervention
Example: Platinum-based chemotherapy.
Outcomes (O)
Specify outcomes that matter for inclusion.
Examples:
This helps both reviewers and AI identify relevant studies.
Study Design (S)
Specify eligible designs.
Examples:
Include:
Exclude:
Case Reports
Editorials
Narrative Reviews
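As a planning aid, the PICOS elements above can be drafted as structured data before they are entered into the protocol template. The sketch below is a minimal illustration only: the `PicosCriteria` class and its field names are hypothetical and do not represent EasySLR's data model or API. It reuses the example values from this guide and reports which elements are still undefined.

```python
from dataclasses import dataclass, field


@dataclass
class PicosCriteria:
    """Hypothetical container for draft PICOS criteria (not an EasySLR object)."""

    population: str
    intervention: str
    comparator: str
    outcomes: list[str] = field(default_factory=list)
    include_designs: list[str] = field(default_factory=list)
    exclude_designs: list[str] = field(default_factory=list)

    def missing_elements(self) -> list[str]:
        """Return the names of elements that are still empty."""
        return [name for name, value in vars(self).items() if not value]


criteria = PicosCriteria(
    population="Adults aged 18 years or older with advanced non-small cell lung cancer",
    intervention="Pembrolizumab monotherapy",
    comparator="Platinum-based chemotherapy",
    exclude_designs=["Case Reports", "Editorials", "Narrative Reviews"],
)

# Outcomes and eligible designs were deliberately left blank in this draft.
print(criteria.missing_elements())  # ['outcomes', 'include_designs']
```

A check like this makes it harder to start screening with an element left undefined.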
Step 3: Write AI-Friendly Screening Instructions
The AI relies heavily on protocol descriptions. The clearer the instructions, the better the results.
Good Example: "Include studies evaluating adults diagnosed with advanced NSCLC receiving Pembrolizumab monotherapy."
Poor Example: "Include relevant cancer studies involving immunotherapy."
The second example leaves significant room for interpretation.
Best Practices for AI Instructions
Be Explicit: Clearly state what should be included.
Example: Include studies with adults aged ≥18 years.
Avoid Subjective Language.
Avoid:
Relevant
Important
Significant
Small studies
Instead, specify measurable criteria.
Example: Exclude studies with fewer than 50 patients.
Define Edge Cases
Examples:
Mixed populations
Multiple interventions
Combined outcomes
Subgroup analyses
Providing guidance for edge cases significantly improves consistency.
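The advice on subjective language can be checked mechanically before the protocol is finalised. The following is a rough sketch, assuming the protocol text is available as a plain string. The function name `find_subjective_terms` is hypothetical, and the term list simply mirrors the words listed above. It is a simple keyword match, so it will also flag legitimate uses such as "statistically significant".

```python
import re

# Terms this guide suggests avoiding; extend the list for your own project.
SUBJECTIVE_TERMS = ["relevant", "important", "significant", "small studies"]


def find_subjective_terms(text: str) -> list[str]:
    """Return the listed subjective terms found in text (case-insensitive)."""
    found = []
    for term in SUBJECTIVE_TERMS:
        # Word boundaries stop "relevant" from matching inside "irrelevant".
        if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
            found.append(term)
    return found


poor = "Include relevant cancer studies involving immunotherapy."
good = (
    "Include studies evaluating adults diagnosed with advanced NSCLC "
    "receiving Pembrolizumab monotherapy."
)

print(find_subjective_terms(poor))  # ['relevant']
print(find_subjective_terms(good))  # []
```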
Phase 2: Protocol Optimisation
Before screening begins, review the protocol using EasySLR's Protocol Text Review.
Navigate to: Tools → Protocol Text Review
Select: New Text Review
Choose:
Click: Analyze Protocol
What Protocol Text Review Evaluates
The tool performs two major assessments.
1. PICOS Review
The system evaluates:
Completeness
Clarity
Consistency
Potential ambiguity
Examples of issues detected:
Undefined age ranges
Broad intervention definitions
Missing outcome specifications
Conflicting study design requirements
The tool then provides recommendations for improvement.
2. Selection Criteria Hierarchy Review
The order of criteria matters.
For example:
Exclude:
Animal studies
Case reports
Conference abstracts
before evaluating disease-specific criteria.
The system identifies:
and recommends improvements.
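To see why the order of criteria matters, consider a minimal sketch in which exclusion rules are evaluated top to bottom and the first match determines the recorded reason. The rule list, the citation fields (`species`, `design`, `publication_type`, `disease`), and the function name are all assumptions made for illustration. This is not how EasySLR implements its hierarchy.

```python
from typing import Callable, Optional

# Each rule is (exclusion label, predicate). Order matters: the first match wins.
Rule = tuple[str, Callable[[dict], bool]]

RULES: list[Rule] = [
    ("Animal study", lambda c: c["species"] != "human"),
    ("Case report", lambda c: c["design"] == "case report"),
    ("Conference abstract", lambda c: c["publication_type"] == "conference abstract"),
    # The disease-specific check comes last, after the general exclusions.
    ("Wrong disease", lambda c: c["disease"] != "NSCLC"),
]


def first_exclusion_reason(citation: dict) -> Optional[str]:
    """Return the label of the first matching exclusion rule, or None."""
    for label, matches in RULES:
        if matches(citation):
            return label
    return None


citation = {
    "species": "mouse",
    "design": "randomised controlled trial",
    "publication_type": "journal article",
    "disease": "melanoma",
}

# Two rules apply here, but the general exclusion is recorded first.
print(first_exclusion_reason(citation))  # Animal study
```

Placing general exclusions first means reviewers and AI record the same reason for the same citation, which keeps exclusion labels consistent.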
The recommendations generated by EasySLR are intended to highlight potential areas for improvement in the protocol. Treat them as guidance rather than mandatory changes.
Carefully review each suggestion and assess its relevance based on:
Your domain expertise
The objectives of the review
The specific research question
Project requirements and methodology
Not all recommendations will be applicable to every project. Apply only those changes that align with your scientific judgment and screening strategy. The final decision on protocol modifications should always remain with the review team.
Update:
PICOS definitions
Inclusion criteria
Exclusion criteria
AI guidance
before screening begins.
This step alone often reduces conflicts substantially.
Phase 3: Create a Validation Sample
Instead of screening the full dataset immediately, create a pilot sample.
Recommended sample size: 5–10% of imported citations
Examples:
1,000 citations → Review 50–100 citations
5,000 citations → Review 250–500 citations
The goal is validation rather than productivity.
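The sample-size arithmetic can be sketched in a few lines. The example below assumes citations are identified by integer IDs and that a simple random sample is acceptable for the pilot. The function names and the fixed seed are illustrative choices, not EasySLR features.

```python
import random


def pilot_sample_size(total: int, percent: int = 10) -> int:
    """Number of citations to pilot, rounding up to a whole citation."""
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return (total * percent + 99) // 100  # integer ceiling division


def draw_pilot_sample(citation_ids: list[int], percent: int = 10, seed: int = 42) -> list[int]:
    """Draw a reproducible simple random sample of citation IDs."""
    rng = random.Random(seed)  # fixed seed so the same sample can be redrawn
    size = pilot_sample_size(len(citation_ids), percent)
    return sorted(rng.sample(citation_ids, size))


citation_ids = list(range(1, 1001))  # 1,000 imported citations
sample = draw_pilot_sample(citation_ids, percent=10)

print(len(sample))                  # 100
print(pilot_sample_size(5000, 5))   # 250
print(pilot_sample_size(5000, 10))  # 500
```

Fixing the seed means the same pilot set can be reproduced later, which is useful when the sample is rescreened after protocol changes.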
Why Pilot Screening Matters
The pilot helps determine:
Whether reviewers interpret criteria consistently
Whether AI understands the protocol correctly
Whether protocol revisions are required
before large-scale screening begins.
Phase 4: Human Screening
Assign the sample to reviewers.
Complete:
Avoid changing criteria during this phase.
Phase 5: Run AI on the Same Sample
Once human screening is complete, run AI screening on exactly the same articles. This creates a direct comparison set.
The AI will generate:
Suggestions / Decisions
AI Notes / Rationale
PICOS
Phase 6: Evaluate AI Performance
The objective is not to determine whether AI is perfect.
The objective is to understand:
Where AI agrees
Where AI disagrees
Why disagreements occur
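If the reviewer and AI decisions for the pilot sample can be listed side by side, agreement can be summarised with a short script. The sketch below assumes two equal-length lists of decision labels. It reports raw agreement, Cohen's kappa (a standard chance-corrected agreement statistic, used here as an illustrative choice and not as an EasySLR metric), and the positions where the decisions differ. The decision data are made up for the example.

```python
from collections import Counter


def agreement_report(human: list[str], ai: list[str]) -> dict:
    """Summarise agreement between reviewer and AI decisions on the same items."""
    if len(human) != len(ai) or not human:
        raise ValueError("decision lists must be non-empty and the same length")
    n = len(human)
    disagreements = [i for i, (h, a) in enumerate(zip(human, ai)) if h != a]
    observed = 1 - len(disagreements) / n

    # Agreement expected by chance, from each rater's label frequencies.
    human_counts, ai_counts = Counter(human), Counter(ai)
    expected = sum(
        (human_counts[label] / n) * (ai_counts[label] / n) for label in human_counts
    )
    kappa = 1.0 if expected == 1 else (observed - expected) / (1 - expected)
    return {"agreement": observed, "kappa": kappa, "disagreements": disagreements}


# Illustrative decisions for ten pilot citations (not real data).
human = ["include", "exclude", "exclude", "include", "exclude",
         "exclude", "include", "exclude", "exclude", "exclude"]
ai = ["include", "exclude", "include", "include", "exclude",
      "exclude", "exclude", "exclude", "exclude", "exclude"]

report = agreement_report(human, ai)
print(f"Agreement: {report['agreement']:.0%}")
print(f"Cohen's kappa: {report['kappa']:.2f}")
print("Review these positions:", report["disagreements"])
```

The list of disagreement positions is the practical output: those are the citations whose AI notes are worth reading first.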
Review AI Notes
For each disagreement:
Review:
AI decision
Reviewer decision
AI reasoning
Questions to ask: Did the AI misunderstand the protocol, or is the protocol insufficiently clear?
Common Findings
Example: AI includes studies with mixed populations.
This may indicate that population criteria need clarification.
Example: AI excludes studies lacking explicit outcome language.
This may indicate that outcome definitions need expansion.
Phase 7: Run Conflict Analysis or AI Conflict Analysis (if AI is used as an assistant)
Navigate to: Tools → Conflict Analysis
Select: New Conflict Analysis
Choose:
Title & Abstract
or
Full Text
Click: Analyze Conflicts
What Conflict Analysis Does
The system evaluates:
and generates actionable recommendations.
Types of Insights Generated
Example: About 2 conflicts are attributable to study design.
This suggests study design criteria may require refinement.
Example: Multiple reviewers interpret "elderly population" differently.
Recommendation: Specify age threshold.
Example: Mixed intervention studies create inconsistent decisions.
Recommendation: Add explicit handling instructions.
Protocol Recommendations
The system may suggest:
Clarifying inclusion criteria
Refining exclusion labels
Adding missing instructions
Defining edge cases
Phase 8: Update the Protocol
Use recommendations from:
to improve the protocol.
Refine the protocol to improve clarity, consistency, and transparency, ensuring that screening criteria are interpreted uniformly by both reviewers and AI. Provide clear and explicit instructions wherever possible to support accurate, reproducible, and AI-ready screening decisions.
Phase 9: Revalidate
Rerun AI on the same sample.
This helps confirm:
Improved agreement
Fewer conflicts
Better AI performance
before scaling.
Phase 10: Scale Across the Full Dataset
Once:
AI agreement is satisfactory for the project
Major protocol ambiguities have been resolved
Conflict numbers have fallen to an acceptable level
proceed with screening all remaining citations.
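The three conditions above can be written down as an explicit go/no-go check so that the decision to scale is recorded and not left implicit. This is a hypothetical sketch: the function name, the inputs, and the example thresholds are assumptions. Because acceptable levels are project-specific, the thresholds are required arguments with no defaults.

```python
def ready_to_scale(
    agreement: float,
    unresolved_major_ambiguities: int,
    open_conflicts: int,
    *,
    min_agreement: float,
    max_open_conflicts: int,
) -> bool:
    """Return True only when all three scaling conditions are met."""
    return (
        agreement >= min_agreement
        and unresolved_major_ambiguities == 0
        and open_conflicts <= max_open_conflicts
    )


# Example thresholds only; each team should set its own.
decision = ready_to_scale(
    agreement=0.92,
    unresolved_major_ambiguities=0,
    open_conflicts=3,
    min_agreement=0.90,
    max_open_conflicts=5,
)
print(decision)  # True
```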
At this stage:
leading to faster and more reliable screening.