# Fact Checker Guide

The BMLibrarian Fact Checker audits biomedical statements in LLM training datasets by evaluating them against published literature evidence and storing the results in PostgreSQL.
## Overview

The Fact Checker system provides:

- Automated verification of statements as yes/no/maybe
- Citation extraction with confidence levels (high/medium/low)
- Batch processing with incremental resumption
- Multi-user annotation workflows
## Basic Usage

### Input Format

Create a JSON file containing the statements to verify:

```json
[
  {
    "statement": "Exercise reduces cardiovascular disease risk",
    "expected_answer": "yes"
  },
  {
    "statement": "Smoking has no effect on lung cancer",
    "expected_answer": "no"
  }
]
```
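Before a long batch run, it can be worth checking the input file for malformed entries. The CLI's exact schema validation isn't shown in this guide; the sketch below assumes only the two fields documented above (`validate_statements` is an illustrative helper, not part of BMLibrarian):

```python
import json

VALID_ANSWERS = {"yes", "no", "maybe"}

def validate_statements(path):
    """Return a list of problems found in a fact-checker input file."""
    with open(path) as f:
        items = json.load(f)
    errors = []
    for i, item in enumerate(items):
        # Every entry needs a non-empty statement to verify.
        if not item.get("statement", "").strip():
            errors.append(f"item {i}: missing or empty 'statement'")
        # expected_answer must be one of the three supported verdicts.
        if item.get("expected_answer") not in VALID_ANSWERS:
            errors.append(f"item {i}: 'expected_answer' must be yes/no/maybe")
    return errors
```

An empty return value means the file is at least structurally sound; any strings returned describe the offending entries by index.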
### Running the Fact Checker

```bash
# Basic usage
uv run python fact_checker_cli.py input.json

# Write results to an output file
uv run python fact_checker_cli.py input.json -o results.json

# Quick mode for testing
uv run python fact_checker_cli.py input.json --quick

# Verbose output
uv run python fact_checker_cli.py input.json --verbose
```
### Command Line Options

| Option | Description | Default |
|---|---|---|
| `-o, --output` | Output JSON file | None |
| `--quick` | Quick mode for testing | False |
| `--verbose` | Verbose output | False |
| `--score-threshold` | Minimum document score (1-5) | 2.5 |
| `--max-documents` | Maximum documents to search | 50 |
| `--model` | Ollama model to use | gpt-oss:20b |
| `--temperature` | Model temperature | 0.1 |
## Evaluation Criteria

### Yes Verdict

Requirements:

- High-quality citations supporting the statement
- Consistent evidence across multiple sources
- Strong relevance to the specific claim

### No Verdict

Requirements:

- Multiple citations directly contradicting the statement
- Consistent contradictory evidence
- Clear opposition to the claim

### Maybe Verdict

Assigned when:

- Evidence is mixed or inconclusive
- Sources partially support the statement
- Insufficient evidence is found
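BMLibrarian's actual verdict logic is model-driven and not reproduced here, but the criteria above can be illustrated as a toy aggregation rule over per-citation stances (the function name, stance labels, and the two-citation threshold are all illustrative assumptions):

```python
def aggregate_verdict(stances):
    """Toy mapping from citation stances to a yes/no/maybe verdict.

    Mirrors the criteria above: consistent support across multiple
    sources -> 'yes', consistent contradiction -> 'no', and anything
    mixed or sparse -> 'maybe'.
    """
    support = stances.count("support")
    contradict = stances.count("contradict")
    if support >= 2 and contradict == 0:
        return "yes"
    if contradict >= 2 and support == 0:
        return "no"
    return "maybe"
```

Note how a single supporting citation, or any mix of support and contradiction, falls through to `"maybe"` — matching the requirement for *multiple, consistent* sources before committing to a definite verdict.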
## Confidence Levels

| Level | Description |
|---|---|
| High | Strong, consistent evidence from quality sources |
| Medium | Moderate evidence with some limitations |
| Low | Limited evidence or conflicting findings |
## Output Format

Results are stored in the PostgreSQL `factcheck` schema and can optionally be exported to JSON:

```json
{
  "statement": "Exercise reduces cardiovascular disease risk",
  "evaluation": "yes",
  "confidence": "high",
  "reasoning": "Multiple high-quality studies demonstrate...",
  "citations": [
    {
      "pmid": "12345678",
      "title": "Cardiovascular Benefits of Exercise",
      "passage": "Regular exercise significantly reduces...",
      "relevance_score": 0.95
    }
  ]
}
```
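An exported results file in this format is easy to post-process. As a sketch, the helper below (illustrative, not part of BMLibrarian) tallies verdicts and flags low-confidence evaluations for human review:

```python
from collections import Counter

def summarize_results(results):
    """Tally verdicts and collect statements whose confidence is 'low'.

    `results` is a list of dicts in the exported JSON format, each with
    at least 'statement', 'evaluation', and 'confidence' keys.
    """
    verdicts = Counter(r["evaluation"] for r in results)
    needs_review = [r["statement"] for r in results if r["confidence"] == "low"]
    return dict(verdicts), needs_review
```

Pair it with `json.load` on the `-o` output file to get a quick overview of a batch run before opening the Review GUI.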
## Review GUI

Human reviewers can annotate findings using the Review GUI:

```bash
uv run python fact_checker_review_gui.py --input-file results.db

# Blind mode for unbiased annotation
uv run python fact_checker_review_gui.py --input-file results.db --blind
```
### Review Features

- Statement display with AI evaluation
- Supporting citation cards
- Annotation controls (Yes/No/Maybe)
- Confidence timer
- Auto-save functionality
## Workflow

```mermaid
graph LR
    A[Input JSON] --> B[Fact Checker CLI]
    B --> C[Search Literature]
    C --> D[Score Documents]
    D --> E[Extract Citations]
    E --> F[Evaluate Statement]
    F --> G[Store in PostgreSQL]
    G --> H[Export JSON]
    H --> I[Review GUI]
    I --> J[Human Annotation]
```
## Database Storage

Results are stored in the `factcheck` schema:

```sql
-- View all evaluations
SELECT * FROM factcheck.evaluations ORDER BY created_at DESC;

-- Filter by verdict
SELECT * FROM factcheck.evaluations WHERE verdict = 'yes';

-- Join evaluations with their citations
SELECT e.*, c.passage, c.relevance_score
FROM factcheck.evaluations e
JOIN factcheck.citations c ON e.id = c.evaluation_id;
```
## Best Practices

### Statement Quality

- Use clear, unambiguous statements
- Avoid compound claims
- Focus on verifiable facts

### Search Parameters

- Start with the default thresholds
- Increase `--max-documents` for complex topics
- Adjust `--score-threshold` based on results
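The effect of tuning the score threshold can be pictured as a simple filter over scored documents. This sketch assumes the documented 1-5 scoring scale and the 2.5 default; the function and the `"score"` key are illustrative, not BMLibrarian internals:

```python
def filter_by_score(documents, threshold=2.5):
    """Keep documents whose relevance score (1-5 scale) meets the threshold.

    Raising the threshold trades recall for precision: fewer documents
    survive, but those that do are more relevant to the statement.
    """
    return [d for d in documents if d["score"] >= threshold]
```

A stricter threshold with a larger `--max-documents` pool is one way to keep citation quality high on broad topics without starving the evaluator of evidence.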
### Review Process

1. Review the AI evaluation first
2. Examine the supporting citations
3. Check citation relevance
4. Provide clear reasoning for your decision
## Integration with Other Tools

### Paper Checker

Use the Paper Checker when you need to validate entire abstracts rather than individual statements.
### Research Workflow

Fact Checker results can inform follow-up research:

1. Identify controversial statements
2. Use the Fact Checker for an initial assessment
3. Follow up with the full research workflow
4. Generate a comprehensive report