CLI Guide

The BMLibrarian CLI is an interactive command-line application that enables researchers to conduct evidence-based medical literature reviews with human-in-the-loop interaction at every step.

Overview

The CLI guides you through the complete process of evidence-based medical literature research:

  • Research question collection and refinement
  • AI-generated database query creation
  • Document searching and relevance scoring
  • Citation extraction from high-scoring documents
  • Professional medical report generation
  • Optional counterfactual analysis
  • Markdown report export

Getting Started

Basic Launch

```bash
uv run python bmlibrarian_cli.py
```

Quick Mode

For faster testing with reduced scope:

```bash
uv run python bmlibrarian_cli.py --quick
```

Auto Mode

For automated execution:

```bash
uv run python bmlibrarian_cli.py --auto "What are the cardiovascular benefits of exercise?"
```

Command Line Options

| Option | Description | Default |
|--------|-------------|---------|
| `--quick` | Quick mode (20 docs, 2 min timeout) | False |
| `--auto` | Automated mode with question | None |
| `--max-results` | Maximum search results | 100 |
| `--score-threshold` | Minimum relevance score (1-5) | 2.5 |
| `--max-citations` | Maximum citations to extract | 30 |
| `--timeout` | Operation timeout (seconds) | 120 |
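
These flags can be combined in a single invocation. A sketch that collects illustrative values into one variable so repeated runs stay comparable (the variable names are ours; the flags are the documented ones):

```bash
# Pin the documented flags in one place; values are illustrative, not recommendations.
MAX_RESULTS=50
SCORE_THRESHOLD=3.0
MAX_CITATIONS=15
TIMEOUT=300

ARGS="--max-results $MAX_RESULTS --score-threshold $SCORE_THRESHOLD"
ARGS="$ARGS --max-citations $MAX_CITATIONS --timeout $TIMEOUT"

# Print the full command; drop the leading `echo` to actually run it.
echo uv run python bmlibrarian_cli.py $ARGS
```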

Workflow Phases

Initial Setup

The CLI loads your configuration and connects to:

  • PostgreSQL database
  • Ollama AI server
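
Before launching, a quick preflight can confirm both services respond. This is a sketch assuming the default Ollama port and the database name used elsewhere in this guide; adjust hostnames, user, and database to match your `.env`:

```bash
# Print OK/FAIL for each dependency without aborting on the first failure.
check() { "$@" >/dev/null 2>&1 && echo "OK: $1" || echo "FAIL: $1"; }

check curl -sf http://localhost:11434/api/tags                      # Ollama
check psql -h localhost -U "$USER" -d knowledgebase -c "SELECT 1"   # PostgreSQL
```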

Search Phase

  1. Enter Question: Type your medical research question
  2. Query Generation: AI creates a PostgreSQL query
  3. Query Review: Accept, modify, or replace the query
  4. Execute Search: Search the database for documents

Evaluation Phase

  1. Document Scoring: AI scores each document (1-5)
  2. Threshold Selection: Choose minimum score for inclusion
  3. Citation Extraction: Extract relevant passages

Synthesis Phase

  1. Report Generation: AI creates publication-style report
  2. Counterfactual Analysis: Optionally search for contradictory evidence
  3. Final Report: Comprehensive report with all evidence
  4. Export: Save to markdown file

Configuration

Score Threshold

The score threshold determines which documents are included:

| Threshold | Effect |
|-----------|--------|
| 1.0 | Include all documents |
| 2.0 | Include moderately relevant |
| 3.0 | Include only relevant |
| 4.0 | Include only highly relevant |
| 5.0 | Include only perfect matches |
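
Mechanically, the threshold is a simple cutoff: a document survives only if its score is at or above it. A standalone sketch of that rule with made-up scores:

```bash
# Keep only scores >= the threshold (the same per-document rule the CLI applies):
printf '4.5\n2.0\n3.0\n1.5\n' | awk -v t=3.0 '$1 >= t'
```

Pass a stricter cutoff on the command line with `--score-threshold 3.0` (or any value from the table above).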

Relevance Threshold

The citation relevance threshold (0-1) controls passage extraction:

| Threshold | Effect |
|-----------|--------|
| 0.3 | Include most passages |
| 0.6 | Include relevant passages |
| 0.8 | Include only highly relevant |

Interactive Commands

During the workflow, you can:

| Command | Action |
|---------|--------|
| `y` / `yes` | Accept and continue |
| `n` / `no` | Reject and retry |
| `e` / `edit` | Edit the current item |
| `q` / `quit` | Exit the workflow |
| `s` / `skip` | Skip current step |

Output Format

Reports are generated in markdown format with:

  • Research question header
  • Methodology description
  • Evidence synthesis
  • Vancouver-style references
  • Confidence assessment

Example Output

```markdown
# Research Report: Cardiovascular Benefits of Exercise

## Executive Summary

Based on analysis of 45 relevant documents...

## Findings

### Primary Evidence

1. Regular aerobic exercise reduces cardiovascular mortality
   by 20-30% [1,2,3].

2. Exercise improves lipid profiles and reduces
   inflammation markers [4,5].

## References

1. Smith J, et al. Exercise and Heart Health.
   J Cardiol. 2023;45:123-130.

2. Jones A, et al. Cardiovascular Benefits of Physical
   Activity. Circulation. 2024;150:456-467.
```

Troubleshooting

Database Connection Issues

```bash
# Test PostgreSQL connection
psql -h localhost -U your_user -d knowledgebase -c "SELECT 1"

# Check environment variables
grep POSTGRES .env
```

Ollama Service Issues

```bash
# Check Ollama is running
curl http://localhost:11434/api/tags

# Restart Ollama
ollama serve
```

No Documents Found

  1. Try broader search terms
  2. Check database has documents
  3. Verify query syntax

```sql
-- Check document count
SELECT COUNT(*) FROM pubmed_articles;
```

Slow Performance

  1. Use faster models for initial testing
  2. Reduce --max-results
  3. Increase --score-threshold
  4. Use --quick mode

Advanced Usage

Batch Processing

```bash
# Process multiple questions
for q in "question1" "question2"; do
    uv run python bmlibrarian_cli.py --auto "$q" --quick
done
```

Custom Model Selection

```bash
# Use specific model
uv run python bmlibrarian_cli.py --model llama3.1:8b
```

Export Options

Reports are saved to the current directory with a timestamped filename:

```
Research_Report_20240315_143022.md
```
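
The timestamp appears to follow `date +%Y%m%d_%H%M%S` (inferred from the example name, not confirmed by the source), which means filenames sort chronologically:

```bash
# Reproduce the apparent naming pattern (format inferred from the example filename):
stamp=$(date +%Y%m%d_%H%M%S)
echo "Research_Report_${stamp}.md"
```

Because the stamp sorts lexically, `ls Research_Report_*.md | sort | tail -n 1` picks out the most recent report.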

Best Practices

Question Formulation

Be Specific

Good: "Effects of HIIT on HbA1c in Type 2 diabetes patients"

Bad: "Exercise and diabetes"

Iterative Refinement

  1. Start with auto mode for initial exploration
  2. Review generated queries
  3. Switch to interactive mode for refinement
  4. Adjust thresholds based on results

Quality Assurance

  • Always review AI-generated queries
  • Check that citations actually support claims
  • Verify document scoring makes sense
  • Use counterfactual analysis for balance