Protocol Development, AI Validation & Conflict Analysis: Best Practice Workflow for Achieving Optimal AI Screening Results in EasySLR

The success of any systematic review, rapid review, targeted literature review (TLR), or evidence synthesis project depends heavily on the quality of the protocol.

A well-defined protocol ensures that:

Human reviewers apply consistent screening decisions.
AI interprets study eligibility correctly.
Reviewer conflicts are minimised.
Inclusion and exclusion decisions remain transparent and reproducible.
Screening can be scaled confidently across large datasets.

EasySLR provides a suite of tools specifically designed to help teams develop, validate, refine, and optimise protocols before deploying AI across an entire review.

This guide outlines a recommended workflow for:

Creating a protocol
Optimising the protocol using Protocol Text Review
Running a pilot screening exercise
Comparing AI decisions against reviewer decisions
Performing AI Conflict Analysis
Refining the protocol
Scaling AI across the full dataset

Why Protocol Validation Matters Before Running AI

Many teams make the mistake of immediately running AI on thousands of articles.

However, if the protocol contains:

Ambiguous definitions
Overlapping criteria
Missing decision rules
Inconsistent reviewer interpretation

the AI will simply reproduce those inconsistencies at scale.

For this reason, EasySLR recommends treating protocol development as an iterative process.

The recommended workflow is:

→ Protocol Creation
→ Protocol Review
→ Pilot Screening
→ AI Validation
→ Conflict Analysis
→ Protocol Refinement
→ Full Scale Screening

This approach typically produces:

Higher AI agreement
Fewer conflicts
Faster screening
Improved reproducibility
Greater confidence in final results

Phase 1: Creating the Protocol

Navigate to: Project Settings → Protocol

EasySLR provides a protocol template that can be customised for your project.

The protocol generally contains:

Research Objective
PICOS Framework
Inclusion Criteria
Exclusion Criteria
AI Guidance (descriptions of inclusion and exclusion reasons)

The protocol becomes the primary source of truth for both reviewers and AI.

Step 1: Define the Research Objective

The objective should clearly describe:

Population of interest
Intervention
Comparator
Outcomes
Scope of the review

Example: "To evaluate the efficacy and safety of Drug X compared with standard chemotherapy in adults with advanced non-small cell lung cancer."

A clearly defined objective helps reviewers remain aligned throughout the project.

Step 2: Define PICOS Criteria

PICOS serves as the foundation of screening decisions.

Population (P)

Describe:

Disease area
Patient demographics
Disease severity
Clinical characteristics

Example: Adults aged 18 years or older with advanced non-small cell lung cancer.

Avoid: Cancer patients.

The latter is too broad and open to interpretation.

Intervention (I)

Specify:

Drug names
Treatment classes
Procedures
Exposures

Example: Pembrolizumab monotherapy.

Avoid: Immunotherapy.

This may unintentionally include multiple interventions.

Comparator (C)

Define:

Placebo
Standard of care
Alternative intervention

Example: Platinum-based chemotherapy.

Outcomes (O)

Specify outcomes that matter for inclusion.

Examples:

Overall Survival
Progression-Free Survival
Quality of Life
Adverse Events

This helps both reviewers and AI identify relevant studies.

Study Design (S)

Specify eligible designs.

Examples:

Include:

Randomized Controlled Trials
Prospective Cohort Studies

Exclude:

Case Reports
Editorials
Narrative Reviews
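Before entering the criteria into the protocol template, it can help to draft them in a structured form so that gaps are obvious. The sketch below is only an illustration: the `Picos` class and its field names are hypothetical and are not part of EasySLR. It reuses the example values from this guide.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Picos:
    """Hypothetical container for a draft PICOS definition (not an EasySLR object)."""
    population: str = ""
    intervention: str = ""
    comparator: str = ""
    outcomes: list[str] = field(default_factory=list)
    include_designs: list[str] = field(default_factory=list)
    exclude_designs: list[str] = field(default_factory=list)

    def missing(self) -> list[str]:
        """Return the names of elements that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


draft = Picos(
    population="Adults aged 18 years or older with advanced non-small cell lung cancer",
    intervention="Pembrolizumab monotherapy",
    comparator="Platinum-based chemotherapy",
    outcomes=["Overall Survival", "Progression-Free Survival",
              "Quality of Life", "Adverse Events"],
    include_designs=["Randomized Controlled Trials", "Prospective Cohort Studies"],
)

print(draft.missing())  # ['exclude_designs']
```

Here the draft still lacks excluded study designs, which is the kind of gap worth closing before screening starts.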

Step 3: Write AI-Friendly Screening Instructions

The AI relies heavily on protocol descriptions. The clearer the instructions, the better the results.

Good Example: "Include studies evaluating adults diagnosed with advanced NSCLC receiving Pembrolizumab monotherapy."

Poor Example: "Include relevant cancer studies involving immunotherapy."

The second example leaves significant room for interpretation.

Best Practices for AI Instructions

Be Explicit: Clearly state what should be included.

Example: Include studies with adults aged ≥18 years.

Avoid Subjective Language.

Avoid:

Relevant
Important
Significant
Small studies

Instead, specify measurable criteria.

Example: Exclude studies with fewer than 50 patients.
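A quick self-check for subjective wording can be run on draft criteria before they go into the protocol. The following is a minimal sketch, not an EasySLR feature: the function name is hypothetical, and the term list simply mirrors the words listed above.

```python
import re

# Subjective terms named in this guide; extend the tuple for your own project.
VAGUE_TERMS = ("relevant", "important", "significant", "small studies")


def find_vague_terms(criterion: str) -> list[str]:
    """Return the subjective terms that appear in one criterion as whole words."""
    lowered = criterion.lower()
    return [t for t in VAGUE_TERMS
            if re.search(rf"\b{re.escape(t)}\b", lowered)]


poor = "Include relevant cancer studies involving immunotherapy."
good = "Exclude studies with fewer than 50 patients."

print(find_vague_terms(poor))  # ['relevant']
print(find_vague_terms(good))  # []
```

A plain word match like this will also flag legitimate uses such as "statistically significant", so treat each hit as a prompt to re-read the criterion rather than as an error.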

Define Edge Cases

Examples:

Mixed populations
Multiple interventions
Combined outcomes
Subgroup analyses

Providing guidance for edge cases significantly improves consistency.

Phase 2: Protocol Optimisation

Before screening begins, review the protocol using EasySLR's Protocol Text Review.

Navigate to: Tools → Protocol Text Review

Select: New Text Review

Choose:

Title & Abstract Protocol
or
Full Text Protocol

Click: Analyze Protocol

What Protocol Text Review Evaluates

The tool performs two major assessments.

1. PICOS Review

The system evaluates:

Completeness
Clarity
Consistency
Potential ambiguity

Examples of issues detected:

Undefined age ranges
Broad intervention definitions
Missing outcome specifications
Conflicting study design requirements

The tool then provides recommendations for improvement.

2. Selection Criteria Hierarchy Review

The order of criteria matters.

For example, apply broad exclusions first:

Animal studies
Case reports
Conference abstracts

and only then evaluate disease-specific criteria.

The system identifies:

Overlapping exclusions
Missing exclusion categories
Redundant criteria
Poor sequencing

and recommends improvements.
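To see why sequencing matters, consider a hierarchy in which the first matching exclusion reason is the one recorded. The sketch below illustrates the idea only and does not describe how EasySLR evaluates criteria; the record fields (`species`, `pub_type`, `disease`) and the rule labels are hypothetical.

```python
from typing import Callable, Optional

Record = dict  # hypothetical citation record with simple metadata fields

# Ordered hierarchy: broad exclusions first, disease-specific checks last.
HIERARCHY: list[tuple[str, Callable[[Record], bool]]] = [
    ("Animal study", lambda r: r.get("species") == "animal"),
    ("Case report", lambda r: r.get("pub_type") == "case report"),
    ("Conference abstract", lambda r: r.get("pub_type") == "conference abstract"),
    ("Wrong population", lambda r: r.get("disease") != "advanced NSCLC"),
]


def first_exclusion(record: Record) -> Optional[str]:
    """Return the label of the first rule that applies, or None if none do."""
    for label, applies in HIERARCHY:
        if applies(record):
            return label
    return None


mouse_case = {"species": "animal", "pub_type": "case report", "disease": "melanoma"}
eligible = {"species": "human", "pub_type": "rct", "disease": "advanced NSCLC"}

print(first_exclusion(mouse_case))  # Animal study
print(first_exclusion(eligible))    # None
```

The first record matches three rules, but only the earliest is recorded. If two reviewers apply the rules in different orders, they can reach the same exclude decision with different reasons, which is one way poor sequencing shows up as a conflict.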

The recommendations generated by EasySLR are intended to help identify potential areas for improvement within the protocol. However, they should be considered as guidance rather than mandatory changes.

Carefully review each suggestion and assess its relevance based on:

Your domain expertise
The objectives of the review
The specific research question
Project requirements and methodology

Not all recommendations will be applicable to every project. Apply only those changes that align with your scientific judgment and screening strategy. The final decision on protocol modifications should always remain with the review team.

Update:

PICOS definitions
Inclusion criteria
Exclusion criteria
AI guidance

before screening begins.

This step alone often reduces conflicts substantially.

Phase 3: Create a Validation Sample

Instead of screening the full dataset immediately, create a pilot sample.

Recommended sample size: 5–10% of imported citations

Examples:

1,000 citations → review 50–100 citations
5,000 citations → review 250–500 citations

The goal is validation rather than productivity.
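The 5–10% recommendation is simple arithmetic, sketched below. The guide does not say how the pilot citations should be chosen; drawing them at random with a fixed seed is an assumption made here so that the sample is reproducible. The function names and citation IDs are hypothetical.

```python
import random


def pilot_size_range(total: int, low_pct: int = 5, high_pct: int = 10) -> tuple[int, int]:
    """Lower and upper pilot sizes for a 5-10% sample (integer arithmetic)."""
    return total * low_pct // 100, total * high_pct // 100


def draw_pilot(citation_ids: list[str], size: int, seed: int = 2024) -> list[str]:
    """Draw a reproducible random sample without replacement (an assumption)."""
    return random.Random(seed).sample(citation_ids, size)


ids = [f"cit-{i:04d}" for i in range(1, 1001)]  # hypothetical citation IDs
low, high = pilot_size_range(len(ids))
pilot = draw_pilot(ids, high)

print(low, high, len(pilot))   # 50 100 100
print(pilot_size_range(5000))  # (250, 500)
```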

Why Pilot Screening Matters

The pilot helps determine:

Whether reviewers interpret criteria consistently
Whether AI understands the protocol correctly
Whether protocol revisions are required

before large-scale screening begins.

Phase 4: Human Screening

Assign the sample to reviewers.

Complete:

Independent screening
Dual screening (if applicable)

Avoid changing criteria during this phase.

Phase 5: Run AI on the Same Sample

Once human screening is complete, run AI screening on exactly the same articles. This creates a direct comparison set.

The AI will generate:

Suggestions / decisions
AI notes / rationale
PICOS

Phase 6: Evaluate AI Performance

The objective is not to determine whether AI is perfect.

The objective is to understand:

Where AI agrees
Where AI disagrees
Why disagreements occur
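If the reviewer and AI decisions for the pilot sample are available side by side, agreement can be summarised with a few standard statistics. The sketch below is one possible approach rather than the calculation EasySLR performs: it assumes decisions are the strings "include" and "exclude", treats reviewer decisions as the reference, and reports observed agreement, Cohen's kappa, and the share of reviewer-included studies that the AI also included.

```python
def agreement_summary(human: list[str], ai: list[str]) -> dict[str, float]:
    """Summarise AI vs reviewer agreement on paired include/exclude decisions."""
    if not human or len(human) != len(ai):
        raise ValueError("decision lists must be non-empty and of equal length")
    pairs = list(zip(human, ai))
    n = len(pairs)
    tp = sum(h == "include" and a == "include" for h, a in pairs)
    tn = sum(h == "exclude" and a == "exclude" for h, a in pairs)
    fp = sum(h == "exclude" and a == "include" for h, a in pairs)
    fn = sum(h == "include" and a == "exclude" for h, a in pairs)

    observed = (tp + tn) / n
    # Agreement expected by chance, from each rater's include/exclude rates.
    expected = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    # Share of reviewer includes that the AI also included.
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"observed_agreement": observed, "kappa": kappa, "include_recall": recall}


# Hypothetical pilot decisions for ten citations.
human = ["include", "include", "exclude", "exclude", "exclude",
         "include", "exclude", "exclude", "exclude", "exclude"]
ai = ["include", "exclude", "exclude", "exclude", "include",
      "include", "exclude", "exclude", "exclude", "exclude"]

summary = agreement_summary(human, ai)
print({k: round(v, 3) for k, v in summary.items()})
```

In screening, a missed include is usually more costly than an extra one, so the recall figure deserves at least as much attention as overall agreement. What counts as satisfactory remains a project-level decision.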

Review AI Notes

For each disagreement:

Review:

AI decision
Reviewer decision
AI reasoning

Questions to ask: Did the AI misunderstand the protocol, or is the protocol insufficiently clear?

Common Findings

Examples: AI includes studies with mixed populations.

This may indicate: Population criteria need clarification.

AI excludes studies lacking explicit outcome language.

This may indicate: Outcome definitions need expansion.

Phase 7: Run Conflict Analysis or AI Conflict Analysis (if AI is used as an assistant)

Navigate to: Tools → Conflict Analysis

Select: New Conflict Analysis

Choose:

Title & Abstract
or
Full Text

Click: Analyze Conflicts

What Conflict Analysis Does

The system evaluates:

Reviewer disagreements
Reviewer vs AI disagreements
Conflict patterns
Protocol weaknesses

and generates actionable recommendations.

Types of Insights Generated

Reviewer Conflict Trends

Example: approximately 2 conflicts were attributable to study design.

This suggests study design criteria may require refinement.
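A trend of this kind is essentially a tally of conflicts by the criterion in dispute. The sketch below shows the idea on hypothetical data; the record layout and the `criterion` labels are assumptions for illustration and do not reflect an EasySLR export format.

```python
from collections import Counter

# Hypothetical conflict records, each tagged with the criterion in dispute.
conflicts = [
    {"citation": "cit-0012", "criterion": "Study design"},
    {"citation": "cit-0047", "criterion": "Population"},
    {"citation": "cit-0101", "criterion": "Study design"},
]


def conflict_trends(records: list[dict]) -> list[tuple[str, int]]:
    """Count conflicts per criterion, most frequent first."""
    return Counter(r["criterion"] for r in records).most_common()


for criterion, count in conflict_trends(conflicts):
    print(f"{criterion}: {count}")
# Study design: 2
# Population: 1
```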

Ambiguous Criteria

Example: Multiple reviewers interpret "elderly population" differently.

Recommendation: Specify age threshold.

Missing Decision Rules

Example: Mixed intervention studies create inconsistent decisions.

Recommendation: Add explicit handling instructions.

Protocol Recommendations

The system may suggest:

Clarifying inclusion criteria
Refining exclusion labels
Adding missing instructions
Defining edge cases

Phase 8: Update the Protocol

Use recommendations from:

Protocol Text Review
Conflict Analysis / AI Conflict Analysis

to improve the protocol.

Refine the protocol to improve clarity, consistency, and transparency, ensuring that screening criteria are interpreted uniformly by both reviewers and AI. Provide clear and explicit instructions wherever possible to support accurate, reproducible, and AI-ready screening decisions.

Phase 9: Revalidate

Rerun AI on the same sample.

This helps confirm:

Improved agreement
Fewer conflicts
Better AI performance

before scaling.
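When revalidating, it is worth checking not only whether overall agreement improved but also whether any previously correct decisions flipped. The following sketch assumes decisions keyed by citation ID for the reviewer consensus and for the two AI runs; the function name and data are hypothetical.

```python
def compare_runs(human: dict[str, str],
                 before: dict[str, str],
                 after: dict[str, str]) -> dict:
    """Compare two AI runs against reviewer decisions on the same sample."""
    def rate(run: dict[str, str]) -> float:
        return sum(run[c] == human[c] for c in human) / len(human)

    # Disagreements resolved by the protocol update.
    fixed = sorted(c for c in human
                   if before[c] != human[c] and after[c] == human[c])
    # Agreements lost after the protocol update.
    regressed = sorted(c for c in human
                       if before[c] == human[c] and after[c] != human[c])
    return {"before": rate(before), "after": rate(after),
            "fixed": fixed, "regressed": regressed}


human = {"a": "include", "b": "exclude", "c": "include", "d": "exclude"}
before = {"a": "include", "b": "include", "c": "exclude", "d": "exclude"}
after = {"a": "include", "b": "exclude", "c": "include", "d": "include"}

print(compare_runs(human, before, after))
# {'before': 0.5, 'after': 0.75, 'fixed': ['b', 'c'], 'regressed': ['d']}
```

In this example agreement rises from 50% to 75%, yet citation "d" regressed. A regression like this suggests the revised wording may have introduced a new ambiguity that should be reviewed before scaling.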

Phase 10: Scale Across the Full Dataset

Once:

AI agreement is satisfactory for the project
Major protocol ambiguities have been resolved
The number of conflicts has fallen to an acceptable level

proceed with screening all remaining citations.

At this stage:

AI is better calibrated
Protocol is stronger
Reviewer interpretation is aligned

leading to faster and more reliable screening.

Related Help Articles

Conflict Analysis
