Protocol Development, AI Validation & Conflict Analysis: Best Practice Workflow for Achieving Optimal AI Screening Results in EasySLR

The success of any systematic review, rapid review, targeted literature review (TLR), or evidence synthesis project depends heavily on the quality of the protocol.
A well-defined protocol ensures that:
  • Human reviewers apply consistent screening decisions.
  • AI interprets study eligibility correctly.
  • Reviewer conflicts are minimised.
  • Inclusion and exclusion decisions remain transparent and reproducible.
  • Screening can be scaled confidently across large datasets.
EasySLR provides a suite of tools specifically designed to help teams develop, validate, refine, and optimise protocols before deploying AI across an entire review.

This guide outlines a recommended workflow for:
  1. Creating a protocol
  2. Optimising the protocol using Protocol Text Review
  3. Running a pilot screening exercise
  4. Comparing AI decisions against reviewer decisions
  5. Performing AI Conflict Analysis
  6. Refining the protocol
  7. Scaling AI across the full dataset

Why Protocol Validation Matters Before Running AI
Many teams make the mistake of immediately running AI on thousands of articles.

However, if the protocol contains:
  • Ambiguous definitions
  • Overlapping criteria
  • Missing decision rules
  • Inconsistent reviewer interpretation
the AI will simply reproduce those inconsistencies at scale.

For this reason, EasySLR recommends treating protocol development as an iterative process.

The recommended workflow is:

→ Protocol Creation
→ Protocol Review
→ Pilot Screening
→ AI Validation
→ Conflict Analysis
→ Protocol Refinement
→ Full Scale Screening

This approach typically produces:
  • Higher AI agreement
  • Fewer conflicts
  • Faster screening
  • Improved reproducibility
  • Greater confidence in final results

Phase 1: Creating the Protocol
Navigate to: Project Settings → Protocol
EasySLR provides a protocol template that can be customised for your project.

The protocol generally contains:
  • Research Objective
  • PICOS Framework
  • Inclusion Criteria
  • Exclusion Criteria
  • AI Guidance (descriptions of the inclusion and exclusion reasons)
The protocol becomes the primary source of truth for both reviewers and AI.

Step 1: Define the Research Objective
The objective should clearly describe:
  • Population of interest
  • Intervention
  • Comparison
  • Outcomes
  • Scope of the review

Example: "To evaluate the efficacy and safety of Drug X compared with standard chemotherapy in adults with advanced non-small cell lung cancer."
A clearly defined objective helps reviewers remain aligned throughout the project.

Step 2: Define PICOS Criteria
PICOS serves as the foundation of screening decisions.

Population (P)
Describe:
  • Disease area
  • Patient demographics
  • Disease severity
  • Clinical characteristics
Example: Adults aged 18 years or older with advanced non-small cell lung cancer.

Avoid: Cancer patients.

The latter is too broad and open to interpretation.

Intervention (I)
Specify:
  • Drug names
  • Treatment classes
  • Procedures
  • Exposures
Example: Pembrolizumab monotherapy.

Avoid: Immunotherapy.

This may unintentionally include multiple interventions.

Comparator (C)
Define:
  • Placebo
  • Standard of care
  • Alternative intervention
Example: Platinum-based chemotherapy.

Outcomes (O)

Specify outcomes that matter for inclusion.
Examples:
  • Overall Survival
  • Progression-Free Survival
  • Quality of Life
  • Adverse Events
This helps both reviewers and AI identify relevant studies.

Study Design (S)
Specify eligible designs.
Examples:
Include:
  • Randomized Controlled Trials
  • Prospective Cohort Studies
Exclude:
  • Case Reports
  • Editorials
  • Narrative Reviews

Step 3: Write AI-Friendly Screening Instructions
The AI relies heavily on protocol descriptions. The clearer the instructions, the better the results.

Good Example: "Include studies evaluating adults diagnosed with advanced NSCLC receiving Pembrolizumab monotherapy."
Poor Example: "Include relevant cancer studies involving immunotherapy."
The second example leaves significant room for interpretation.

Best Practices for AI Instructions

Be Explicit: Clearly state what should be included.
Example: Include studies with adults aged ≥18 years.

Avoid Subjective Language: Do not use terms such as:
  • Relevant
  • Important
  • Significant
  • Small studies
Instead, specify measurable criteria.

Example: Exclude studies with fewer than 50 patients.
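
One way to catch subjective wording before it reaches reviewers or the AI is a simple text check over each criterion. The sketch below is an illustration only and is not part of EasySLR: the term list is taken from this guide, and the function name find_vague_terms is hypothetical.

```python
import re

# Terms this guide lists as subjective; extend the list for your own project.
VAGUE_TERMS = ["relevant", "important", "significant", "small studies"]


def find_vague_terms(criterion: str) -> list[str]:
    """Return the vague terms that appear in a criterion (case-insensitive)."""
    found = []
    for term in VAGUE_TERMS:
        # Word boundaries avoid matching inside longer words such as "irrelevant".
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, criterion, flags=re.IGNORECASE):
            found.append(term)
    return found


print(find_vague_terms("Include relevant cancer studies involving immunotherapy."))
print(find_vague_terms("Exclude studies with fewer than 50 patients."))
```

A match is only a prompt to look again. For instance, "significant" may be legitimate in a phrase like "statistically significant", so the final call stays with the review team.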

Define Edge Cases: State how reviewers and AI should handle situations such as:
  • Mixed populations
  • Multiple interventions
  • Combined outcomes
  • Subgroup analyses
Providing guidance for edge cases significantly improves consistency.

Phase 2: Protocol Optimisation
Before screening begins, review the protocol using EasySLR's Protocol Text Review.
Navigate to: Tools → Protocol Text Review
Select: New Text Review

Choose:
  • Title & Abstract Protocol
    or
  • Full Text Protocol
Click: Analyze Protocol

What Protocol Text Review Evaluates
The tool performs two major assessments.

1. PICOS Review
The system evaluates:
  • Completeness
  • Clarity
  • Consistency
  • Potential ambiguity

Examples of issues detected:
  • Undefined age ranges
  • Broad intervention definitions
  • Missing outcome specifications
  • Conflicting study design requirements
The tool then provides recommendations for improvement.

2. Selection Criteria Hierarchy Review
The order of criteria matters.
For example:
Exclude:
  • Animal studies
  • Case reports
  • Conference abstracts
before evaluating disease-specific criteria.

The system identifies:
  • Overlapping exclusions
  • Missing exclusion categories
  • Redundant criteria
  • Poor sequencing
and recommends improvements.
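
To illustrate why sequencing matters, the sketch below applies exclusion rules in a fixed order and returns the first one that matches, so broad exclusions are reached before disease-specific ones. It is an assumption-laden illustration, not a description of how EasySLR evaluates criteria: the record fields, rule labels, and the function first_exclusion_reason are all hypothetical.

```python
from typing import Callable, Optional

# Ordered (label, predicate) pairs: broad exclusions first, disease-specific last.
# Record fields and labels are hypothetical and chosen only for illustration.
Rule = tuple[str, Callable[[dict], bool]]

EXCLUSION_HIERARCHY: list[Rule] = [
    ("Animal study", lambda r: r["species"] != "human"),
    ("Case report", lambda r: r["design"] == "case report"),
    ("Conference abstract", lambda r: r["publication_type"] == "conference abstract"),
    ("Wrong population", lambda r: r["condition"] != "advanced NSCLC"),
]


def first_exclusion_reason(record: dict) -> Optional[str]:
    """Return the first matching exclusion label, or None if no rule applies."""
    for label, applies in EXCLUSION_HIERARCHY:
        if applies(record):
            return label
    return None


record = {
    "species": "mouse",
    "design": "case report",
    "publication_type": "journal article",
    "condition": "melanoma",
}
# Three rules match this record, but only the first in the hierarchy is reported.
print(first_exclusion_reason(record))
```

Because every reviewer (and the AI) stops at the same first matching rule, two people excluding the same record are more likely to record the same reason.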

The recommendations generated by EasySLR are intended to help identify potential areas for improvement within the protocol. However, they should be considered as guidance rather than mandatory changes.

Carefully review each suggestion and assess its relevance based on:
  • Your domain expertise
  • The objectives of the review
  • The specific research question
  • Project requirements and methodology

Not all recommendations will be applicable to every project. Apply only those changes that align with your scientific judgment and screening strategy. The final decision on protocol modifications should always remain with the review team.

Update:
  • PICOS definitions
  • Inclusion criteria
  • Exclusion criteria
  • AI guidance
before screening begins.

This step alone often reduces conflicts substantially.

Phase 3: Create a Validation Sample
Instead of screening the full dataset immediately, create a pilot sample.
Recommended sample size: 5–10% of imported citations
Examples:
  • 1,000 citations → Review 50–100 citations
  • 5,000 citations → Review 250–500 citations
The goal is validation rather than productivity.
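
The sample-size arithmetic and a reproducible random draw can be sketched as follows. This is a minimal example under stated assumptions: citation IDs are available as a list, the function names are hypothetical, and simple random sampling is used here although the guide does not prescribe a sampling method.

```python
import random


def pilot_sample_size(total: int, percent: int = 10) -> int:
    """Number of citations in the pilot, rounded up, for a whole-number percent."""
    if total < 0 or not 0 < percent <= 100:
        raise ValueError("total must be >= 0 and percent must be in 1..100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total * percent + 99) // 100


def draw_pilot_sample(citation_ids: list[str], percent: int = 10, seed: int = 42) -> list[str]:
    """Draw a simple random sample; a fixed seed makes the draw reproducible."""
    k = pilot_sample_size(len(citation_ids), percent)
    return random.Random(seed).sample(citation_ids, k)


print(pilot_sample_size(1000, 5), pilot_sample_size(1000, 10))  # 50 100
print(pilot_sample_size(5000, 5), pilot_sample_size(5000, 10))  # 250 500

ids = [f"C{i:04d}" for i in range(1000)]
pilot = draw_pilot_sample(ids, percent=10)
print(len(pilot))  # 100
```

Recording the seed alongside the protocol lets the same pilot set be reconstructed later if the sample needs to be audited.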

Why Pilot Screening Matters
The pilot helps determine:
  • Whether reviewers interpret criteria consistently
  • Whether AI understands the protocol correctly
  • Whether protocol revisions are required
before large-scale screening begins.

Phase 4: Human Screening
Assign the sample to reviewers.
Complete:
  • Independent screening
  • Dual screening (if applicable)
Avoid changing criteria during this phase.

Phase 5: Run AI on the Same Sample
Once human screening is complete, run AI screening on the exact same articles. This creates a direct comparison set.
The AI will generate:
  • Suggestions/Decisions
  • AI Notes/Rationale
  • PICOS

Phase 6: Evaluate AI Performance
The objective is not to determine whether AI is perfect.
The objective is to understand:
  • Where AI agrees
  • Where AI disagrees
  • Why disagreements occur
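
If the human and AI decisions for the pilot sample are available as paired lists, agreement can be quantified outside the tool. The sketch below is one possible approach, not EasySLR's own calculation: the choice of metrics (raw agreement, Cohen's kappa, and sensitivity with the human decision as reference) and the function name agreement_metrics are assumptions made for illustration.

```python
def agreement_metrics(human: list[str], ai: list[str]) -> dict[str, float]:
    """Compare paired 'include'/'exclude' decisions; human is the reference."""
    if len(human) != len(ai) or not human:
        raise ValueError("decision lists must be non-empty and of equal length")
    n = len(human)
    pairs = list(zip(human, ai))
    tp = sum(h == "include" and a == "include" for h, a in pairs)
    tn = sum(h == "exclude" and a == "exclude" for h, a in pairs)
    fp = sum(h == "exclude" and a == "include" for h, a in pairs)
    fn = sum(h == "include" and a == "exclude" for h, a in pairs)

    observed = (tp + tn) / n
    # Agreement expected by chance, from each rater's include/exclude rates.
    expected = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    # Share of human includes that the AI also included.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"agreement": observed, "kappa": kappa, "sensitivity": sensitivity}


human = ["include", "include", "exclude", "exclude", "exclude",
         "include", "exclude", "exclude", "exclude", "exclude"]
ai = ["include", "exclude", "exclude", "exclude", "include",
      "include", "exclude", "exclude", "exclude", "exclude"]
for name, value in agreement_metrics(human, ai).items():
    print(f"{name}: {value:.2f}")
```

Raw agreement alone can look high when most citations are excluded, which is why a chance-corrected figure and sensitivity are shown next to it.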

Review AI Notes
For each disagreement, review:
  • AI decision
  • Reviewer decision
  • AI reasoning
Questions to ask:
  • Did the AI misunderstand the protocol?
  • Or is the protocol insufficiently clear?

Common Findings
Examples:
  • AI includes studies with mixed populations. This may indicate that the population criteria need clarification.
  • AI excludes studies lacking explicit outcome language. This may indicate that the outcome definitions need expansion.

Phase 7: Run Conflict Analysis or AI Conflict Analysis (if AI is used as an assistant)
Navigate to: Tools → Conflict Analysis
Select: New Conflict Analysis
Choose:
  • Title & Abstract
    or
  • Full Text
Click: Analyze Conflicts

What Conflict Analysis Does

The system evaluates:
  • Reviewer disagreements
  • Reviewer vs AI disagreements
  • Conflict patterns
  • Protocol weaknesses
and generates actionable recommendations.

Types of Insights Generated
  • Reviewer Conflict Trends
    Example: Approximately 2 conflicts are attributed to study design.
    This suggests that the study design criteria may require refinement.
  • Ambiguous Criteria
    Example: Multiple reviewers interpret "elderly population" differently.
    Recommendation: Specify an age threshold.
  • Missing Decision Rules
    Example: Mixed intervention studies create inconsistent decisions.
    Recommendation: Add explicit handling instructions.
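
A conflict trend of this kind is essentially a tally of conflicts per criterion category. The sketch below shows the idea on made-up rows; it does not reproduce how EasySLR derives its insights, and the row layout, record IDs, and the function conflict_trends are hypothetical.

```python
from collections import Counter

# Hypothetical rows: one per conflicting record, with the criterion category
# the disagreement was traced to. The data is invented for illustration.
conflicts = [
    {"record_id": "R014", "category": "Study design"},
    {"record_id": "R027", "category": "Population"},
    {"record_id": "R031", "category": "Study design"},
    {"record_id": "R045", "category": "Intervention"},
    {"record_id": "R052", "category": "Population"},
    {"record_id": "R060", "category": "Population"},
]


def conflict_trends(rows: list[dict]) -> list[tuple[str, int]]:
    """Count conflicts per category, most frequent first."""
    return Counter(row["category"] for row in rows).most_common()


for category, count in conflict_trends(conflicts):
    print(f"{category}: {count}")
```

The category at the top of the list is usually the first part of the protocol to revisit.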

Protocol Recommendations
The system may suggest:
  • Clarifying inclusion criteria
  • Refining exclusion labels
  • Adding missing instructions
  • Defining edge cases

Phase 8: Update the Protocol
Use recommendations from:
  • Protocol Text Review
  • Conflict Analysis/AI Conflict Analysis
to improve the protocol.

Refine the protocol to improve clarity, consistency, and transparency, ensuring that screening criteria are interpreted uniformly by both reviewers and AI. Provide clear and explicit instructions wherever possible to support accurate, reproducible, and AI-ready screening decisions.

Phase 9: Revalidate
Rerun AI on the same sample.

This helps confirm:
  • Improved agreement
  • Fewer conflicts
  • Better AI performance
before scaling.
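
When comparing the rerun against the first AI run, it helps to look at individual records as well as the overall agreement, because a protocol change can fix some disagreements while creating new ones. The following sketch assumes decisions are held in dictionaries keyed by record ID; the function compare_runs and the data are hypothetical.

```python
def compare_runs(
    human: dict[str, str],
    ai_before: dict[str, str],
    ai_after: dict[str, str],
) -> dict[str, list[str]]:
    """Classify records by how AI-vs-human agreement changed between two runs."""
    changes: dict[str, list[str]] = {"resolved": [], "introduced": [], "persistent": []}
    for record_id in sorted(human):
        agreed_before = ai_before[record_id] == human[record_id]
        agreed_after = ai_after[record_id] == human[record_id]
        if not agreed_before and agreed_after:
            changes["resolved"].append(record_id)      # fixed by the protocol update
        elif agreed_before and not agreed_after:
            changes["introduced"].append(record_id)    # new disagreement
        elif not agreed_before and not agreed_after:
            changes["persistent"].append(record_id)    # still disagreeing
    return changes


human = {"R1": "include", "R2": "exclude", "R3": "include", "R4": "exclude"}
ai_before = {"R1": "include", "R2": "include", "R3": "exclude", "R4": "exclude"}
ai_after = {"R1": "include", "R2": "exclude", "R3": "exclude", "R4": "include"}
print(compare_runs(human, ai_before, ai_after))
```

Records in the "introduced" group deserve the closest look, since they point to side effects of the latest protocol edit.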

Phase 10: Scale Across the Full Dataset
Once:
  • AI agreement is satisfactory for the project
  • Major protocol ambiguities have been resolved
  • Conflict numbers have reduced to an acceptable level
proceed with screening all remaining citations.
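
The three conditions above can be written down as an explicit go/no-go check so the decision to scale is documented. This is only a sketch: the threshold values are placeholders, not EasySLR recommendations, and each project should set its own. The names ScaleUpThresholds and scale_up_blockers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleUpThresholds:
    """Project-defined limits; the defaults here are placeholders only."""
    min_agreement: float = 0.90
    max_open_conflicts: int = 5


def scale_up_blockers(
    agreement: float,
    open_conflicts: int,
    unresolved_ambiguities: int,
    thresholds: ScaleUpThresholds = ScaleUpThresholds(),
) -> list[str]:
    """Return the reasons not to scale yet; an empty list means ready."""
    blockers = []
    if agreement < thresholds.min_agreement:
        blockers.append("AI agreement is below the project threshold")
    if unresolved_ambiguities > 0:
        blockers.append("Major protocol ambiguities remain unresolved")
    if open_conflicts > thresholds.max_open_conflicts:
        blockers.append("Conflict count is above the acceptable level")
    return blockers


print(scale_up_blockers(agreement=0.93, open_conflicts=3, unresolved_ambiguities=0))
print(scale_up_blockers(agreement=0.82, open_conflicts=9, unresolved_ambiguities=1))
```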

At this stage:
  • AI is better calibrated
  • Protocol is stronger
  • Reviewer interpretation is aligned
leading to faster and more reliable screening.


