# Fact Checker Guide

The BMLibrarian Fact Checker audits biomedical statements in LLM training datasets by evaluating them against published literature evidence and storing the results in PostgreSQL.
## Overview

The Fact Checker system provides:

- Automated verification of statements as yes/no/maybe
- Citation extraction with confidence levels (high/medium/low)
- Batch processing with incremental resumption
- Multi-user annotation workflows
## Basic Usage

### Input Format

Create a JSON file containing the statements to verify:

```json
[
  {
    "statement": "Exercise reduces cardiovascular disease risk",
    "expected_answer": "yes"
  },
  {
    "statement": "Smoking has no effect on lung cancer",
    "expected_answer": "no"
  }
]
```
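Before a long batch run, it can be worth checking the input file for malformed entries. The CLI's exact schema validation isn't shown in this guide; the sketch below assumes only the two fields documented above (`validate_statements` is an illustrative helper, not part of BMLibrarian):

```python
import json

VALID_ANSWERS = {"yes", "no", "maybe"}

def validate_statements(path):
    """Return a list of problems found in a fact-checker input file."""
    with open(path) as f:
        items = json.load(f)
    errors = []
    for i, item in enumerate(items):
        # Every entry needs a non-empty statement to verify.
        if not item.get("statement", "").strip():
            errors.append(f"item {i}: missing or empty 'statement'")
        # expected_answer must be one of the three supported verdicts.
        if item.get("expected_answer") not in VALID_ANSWERS:
            errors.append(f"item {i}: 'expected_answer' must be yes/no/maybe")
    return errors
```

An empty return value means the file is at least structurally sound; any strings returned describe the offending entries by index.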
### Running the Fact Checker

```bash
# Basic usage
uv run python fact_checker_cli.py input.json

# Write results to an output file
uv run python fact_checker_cli.py input.json -o results.json

# Quick mode for testing
uv run python fact_checker_cli.py input.json --quick

# Verbose output
uv run python fact_checker_cli.py input.json --verbose
```
### Command Line Options

| Option | Description | Default |
|---|---|---|
| `-o, --output` | Output JSON file | None |
| `--quick` | Quick mode for testing | False |
| `--verbose` | Verbose output | False |
| `--score-threshold` | Minimum document score (1-5) | 2.5 |
| `--max-documents` | Maximum documents to search | 50 |
| `--model` | Ollama model to use | gpt-oss:20b |
| `--temperature` | Model temperature | 0.1 |
## Evaluation Criteria

### Yes Verdict

Requirements:

- High-quality citations supporting the statement
- Consistent evidence across multiple sources
- Strong relevance to the specific claim

### No Verdict

Requirements:

- Multiple citations directly contradicting the statement
- Consistent contradictory evidence
- Clear opposition to the claim

### Maybe Verdict

Assigned when:

- Evidence is mixed or inconclusive
- Sources partially support the statement
- Insufficient evidence is found
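BMLibrarian's actual verdict logic is model-driven and not reproduced here, but the criteria above can be illustrated as a toy aggregation rule over per-citation stances (the function name, stance labels, and the two-citation threshold are all illustrative assumptions):

```python
def aggregate_verdict(stances):
    """Toy mapping from citation stances to a yes/no/maybe verdict.

    Mirrors the criteria above: consistent support across multiple
    sources -> 'yes', consistent contradiction -> 'no', and anything
    mixed or sparse -> 'maybe'.
    """
    support = stances.count("support")
    contradict = stances.count("contradict")
    if support >= 2 and contradict == 0:
        return "yes"
    if contradict >= 2 and support == 0:
        return "no"
    return "maybe"
```

Note how a single supporting citation, or any mix of support and contradiction, falls through to `"maybe"` — matching the requirement for *multiple, consistent* sources before committing to a definite verdict.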
## Confidence Levels

| Level | Description |
|---|---|
| High | Strong, consistent evidence from quality sources |
| Medium | Moderate evidence with some limitations |
| Low | Limited evidence or conflicting findings |
## Output Format

Results are stored in the PostgreSQL `factcheck` schema and can optionally be exported to JSON:

```json
{
  "statement": "Exercise reduces cardiovascular disease risk",
  "evaluation": "yes",
  "confidence": "high",
  "reasoning": "Multiple high-quality studies demonstrate...",
  "citations": [
    {
      "pmid": "12345678",
      "title": "Cardiovascular Benefits of Exercise",
      "passage": "Regular exercise significantly reduces...",
      "relevance_score": 0.95
    }
  ]
}
```
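An exported results file in this format is easy to post-process. As a sketch, the helper below (illustrative, not part of BMLibrarian) tallies verdicts and flags low-confidence evaluations for human review:

```python
from collections import Counter

def summarize_results(results):
    """Tally verdicts and collect statements whose confidence is 'low'.

    `results` is a list of dicts in the exported JSON format, each with
    at least 'statement', 'evaluation', and 'confidence' keys.
    """
    verdicts = Counter(r["evaluation"] for r in results)
    needs_review = [r["statement"] for r in results if r["confidence"] == "low"]
    return dict(verdicts), needs_review
```

Pair it with `json.load` on the `-o` output file to get a quick overview of a batch run before opening the Review GUI.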
## Review GUI

Human reviewers can annotate findings using the Review GUI:

```bash
uv run python fact_checker_review_gui.py --input-file results.db

# Blind mode for unbiased annotation
uv run python fact_checker_review_gui.py --input-file results.db --blind
```
### Review Features

- Statement display with AI evaluation
- Supporting citation cards
- Annotation controls (Yes/No/Maybe)
- Confidence timer
- Auto-save functionality
## Workflow

```mermaid
graph LR
    A[Input JSON] --> B[Fact Checker CLI]
    B --> C[Search Literature]
    C --> D[Score Documents]
    D --> E[Extract Citations]
    E --> F[Evaluate Statement]
    F --> G[Store in PostgreSQL]
    G --> H[Export JSON]
    H --> I[Review GUI]
    I --> J[Human Annotation]
```
## Database Storage

Results are stored in the `factcheck` schema:

```sql
-- View all evaluations
SELECT * FROM factcheck.evaluations ORDER BY created_at DESC;

-- Filter by verdict
SELECT * FROM factcheck.evaluations WHERE verdict = 'yes';

-- Join evaluations with their citations
SELECT e.*, c.passage, c.relevance_score
FROM factcheck.evaluations e
JOIN factcheck.citations c ON e.id = c.evaluation_id;
```
## Best Practices

### Statement Quality

- Use clear, unambiguous statements
- Avoid compound claims
- Focus on verifiable facts

### Search Parameters

- Start with the default thresholds
- Increase `--max-documents` for complex topics
- Adjust `--score-threshold` based on results
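The effect of tuning the score threshold can be pictured as a simple filter over scored documents. This sketch assumes the documented 1-5 scoring scale and the 2.5 default; the function and the `"score"` key are illustrative, not BMLibrarian internals:

```python
def filter_by_score(documents, threshold=2.5):
    """Keep documents whose relevance score (1-5 scale) meets the threshold.

    Raising the threshold trades recall for precision: fewer documents
    survive, but those that do are more relevant to the statement.
    """
    return [d for d in documents if d["score"] >= threshold]
```

A stricter threshold with a larger `--max-documents` pool is one way to keep citation quality high on broad topics without starving the evaluator of evidence.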
### Review Process

1. Review the AI evaluation first
2. Examine the supporting citations
3. Check citation relevance
4. Provide clear reasoning for your decision
## Integration with Other Tools

### Paper Checker

Use the Paper Checker when you need to validate entire abstracts rather than individual statements.
### Research Workflow

Fact Checker results can inform follow-up research:

1. Identify controversial statements
2. Use the Fact Checker for an initial assessment
3. Follow up with the full research workflow
4. Generate a comprehensive report