Fact Checker Guide

The BMLibrarian Fact Checker is designed to audit biomedical statements in LLM training datasets by evaluating them against published literature evidence and storing results in PostgreSQL.

Overview

The Fact Checker system provides:

  • Automated verification of statements as yes/no/maybe
  • Citation extraction with confidence levels (high/medium/low)
  • Batch processing with incremental resumption
  • Multi-user annotation workflows

Basic Usage

Input Format

Create a JSON file containing statements to verify:

[
  {
    "statement": "Exercise reduces cardiovascular disease risk",
    "expected_answer": "yes"
  },
  {
    "statement": "Smoking has no effect on lung cancer",
    "expected_answer": "no"
  }
]
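Before running a large batch, it can help to validate the input file up front. A minimal sketch; the `load_statements` helper is hypothetical, not part of BMLibrarian:

```python
import json

VALID_ANSWERS = {"yes", "no", "maybe"}

def load_statements(path):
    """Load an input file and check each entry has the expected shape."""
    with open(path) as f:
        statements = json.load(f)
    for i, entry in enumerate(statements):
        if not entry.get("statement"):
            raise ValueError(f"Entry {i} is missing a non-empty 'statement'")
        answer = entry.get("expected_answer")
        if answer is not None and answer not in VALID_ANSWERS:
            raise ValueError(f"Entry {i} has invalid 'expected_answer': {answer!r}")
    return statements
```

Catching a malformed entry here is cheaper than discovering it mid-way through an overnight batch run.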

Running the Fact Checker

# Basic usage
uv run python fact_checker_cli.py input.json

# With output file
uv run python fact_checker_cli.py input.json -o results.json

# Quick mode for testing
uv run python fact_checker_cli.py input.json --quick

# Verbose output
uv run python fact_checker_cli.py input.json --verbose

Command Line Options

Option              Description                     Default
-o, --output        Output JSON file                None
--quick             Quick mode for testing          False
--verbose           Verbose output                  False
--score-threshold   Minimum document score (1-5)    2.5
--max-documents     Maximum documents to search     50
--model             Ollama model to use             gpt-oss:20b
--temperature       Model temperature               0.1

Evaluation Criteria

Yes Verdict

Requirements:

  • High-quality citations supporting the statement
  • Consistent evidence across multiple sources
  • Strong relevance to the specific claim

No Verdict

Requirements:

  • Multiple citations directly contradicting the statement
  • Consistent contradictory evidence
  • Clear opposition to the claim

Maybe Verdict

Assigned when:

  • Evidence is mixed or inconclusive
  • Sources partially support the statement
  • Insufficient evidence found

Confidence Levels

Level    Description
High     Strong, consistent evidence from quality sources
Medium   Moderate evidence with some limitations
Low      Limited evidence or conflicting findings
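The criteria above can be caricatured as a simple counting rule. This is an illustration of the logic only; the real system evaluates extracted passages with an LLM rather than counting citations:

```python
def assign_verdict(supporting, contradicting):
    """Map citation counts to a (verdict, confidence) pair.

    Toy decision rule mirroring the criteria and confidence levels above;
    not the actual model-based evaluation.
    """
    if supporting >= 2 and contradicting == 0:
        return "yes", "high" if supporting >= 3 else "medium"
    if contradicting >= 2 and supporting == 0:
        return "no", "high" if contradicting >= 3 else "medium"
    # Mixed, partial, or insufficient evidence
    return "maybe", "low"
```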

Output Format

Results are stored in PostgreSQL's factcheck schema and optionally exported to JSON:

{
  "statement": "Exercise reduces cardiovascular disease risk",
  "evaluation": "yes",
  "confidence": "high",
  "reasoning": "Multiple high-quality studies demonstrate...",
  "citations": [
    {
      "pmid": "12345678",
      "title": "Cardiovascular Benefits of Exercise",
      "passage": "Regular exercise significantly reduces...",
      "relevance_score": 0.95
    }
  ]
}
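Exported results can be post-processed with ordinary JSON tooling. For example, keeping only the strongest citations for a statement (field names are taken from the example above; the helper itself is hypothetical):

```python
def strong_citations(result, min_score=0.8):
    """Return the citations in a result at or above a relevance threshold."""
    return [
        c for c in result.get("citations", [])
        if c["relevance_score"] >= min_score
    ]
```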

Review GUI

Human reviewers can annotate findings using the Review GUI:

uv run python fact_checker_review_gui.py --input-file results.db

# Standard review mode

# Blind mode for unbiased annotation
uv run python fact_checker_review_gui.py --input-file results.db --blind

Review Features

  • Statement display with AI evaluation
  • Supporting citation cards
  • Annotation controls (Yes/No/Maybe)
  • Confidence timer
  • Auto-save functionality

Workflow

graph LR
    A[Input JSON] --> B[Fact Checker CLI]
    B --> C[Search Literature]
    C --> D[Score Documents]
    D --> E[Extract Citations]
    E --> F[Evaluate Statement]
    F --> G[Store in PostgreSQL]
    G --> H[Export JSON]
    H --> I[Review GUI]
    I --> J[Human Annotation]

Database Storage

Results are stored in the factcheck schema:

-- View all evaluations
SELECT * FROM factcheck.evaluations ORDER BY created_at DESC;

-- View by verdict
SELECT * FROM factcheck.evaluations WHERE verdict = 'yes';

-- View with citations
SELECT e.*, c.passage, c.relevance_score
FROM factcheck.evaluations e
JOIN factcheck.citations c ON e.id = c.evaluation_id;

Best Practices

Statement Quality

  • Use clear, unambiguous statements
  • Avoid compound claims
  • Focus on verifiable facts

Search Parameters

  • Start with default thresholds
  • Increase max-documents for complex topics
  • Adjust score-threshold based on results

Review Process

  1. Review AI evaluation first
  2. Examine supporting citations
  3. Check citation relevance
  4. Provide clear reasoning for your decision

Integration with Other Tools

Paper Checker

For validating entire abstracts:

uv run python paper_checker_cli.py abstract.json

Research Workflow

Fact Checker results can feed into the broader research workflow:

  1. Identify controversial statements
  2. Use Fact Checker for initial assessment
  3. Follow up with full research workflow
  4. Generate comprehensive report