Agent Architecture

The BMLibrarian agents module provides a modular, extensible architecture for AI-powered biomedical literature tasks. The system is built around a common BaseAgent class with specialized agents for specific tasks.

Architecture Principles

1. Separation of Concerns

Each agent has a single, well-defined responsibility:

Agent                 Responsibility
--------------------  -----------------------------------------------
QueryAgent            Natural language to PostgreSQL query conversion
DocumentScoringAgent  Document relevance assessment (1-5 scale)
CitationFinderAgent   Extract relevant passages and citations
ReportingAgent        Synthesize citations into medical reports
CounterfactualAgent   Generate contradictory evidence questions

2. Common Foundation

All agents inherit from BaseAgent which provides:

  • Ollama client management and connection handling
  • Standardized error handling patterns
  • Callback system for progress updates and UI integration
  • Model configuration and parameter management
  • Connection testing utilities

3. Maintainability

  • Each agent class is kept under 500 lines
  • Clear separation between core and specialized features
  • Comprehensive unit testing with mocked dependencies
  • Backward compatibility layer for existing code

Directory Structure

src/bmlibrarian/agents/
├── __init__.py              # Public API exports
├── base.py                  # BaseAgent abstract class
├── query_agent.py           # Natural language query conversion
├── scoring_agent.py         # Document relevance scoring
├── citation_agent.py        # Citation extraction from documents
├── reporting_agent.py       # Report synthesis and formatting
├── counterfactual_agent.py  # Counterfactual analysis
├── queue_manager.py         # SQLite-based task queue system
└── orchestrator.py          # Multi-agent workflow coordination

Base Agent Class

Abstract Base Class Design

from abc import ABC, abstractmethod
from typing import Optional, Callable

class BaseAgent(ABC):
    def __init__(
        self,
        model: str,
        host: str,
        temperature: float,
        top_p: float,
        callback: Optional[Callable]
    ):
        # Common initialization shared by all agents
        self.model = model
        self.host = host
        self.temperature = temperature
        self.top_p = top_p
        self.callback = callback

    @abstractmethod
    def get_agent_type(self) -> str:
        """Must be implemented by subclasses"""
        pass

    # Common functionality
    def test_connection(self) -> bool: ...
    def get_available_models(self) -> list[str]: ...
    def _make_ollama_request(self, messages: list, **options) -> str: ...
    def _call_callback(self, step: str, data: str) -> None: ...

Key Features

Standardized Ollama Integration:

  • Consistent client initialization
  • Error handling for connection issues
  • Model availability checking

Callback System:

  • Progress updates for UI integration
  • Error-tolerant callback execution
  • Step-based progress tracking
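
The error-tolerant callback execution described above can be sketched as follows (a hypothetical helper modeled on the described contract, not the actual BaseAgent code):

```python
from typing import Callable, Optional

def call_callback(callback: Optional[Callable[[str, str], None]],
                  step: str, data: str) -> None:
    """Invoke the callback, but never let a callback error break agent work."""
    if callback is None:
        return
    try:
        callback(step, data)
    except Exception:
        pass  # swallow callback errors so the agent continues

events = []
call_callback(lambda s, d: events.append((s, d)), "search", "query built")
call_callback(None, "search", "no listener registered")

def broken(step: str, data: str) -> None:
    raise RuntimeError("UI went away")

# A failing callback is tolerated: the call returns normally.
call_callback(broken, "search", "error is tolerated")
```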

Configuration Management:

  • Model parameters with sensible defaults
  • Option overrides for specific requests
  • Host and model configurability
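
A minimal sketch of the defaults-plus-overrides pattern (the parameter names here are assumptions for illustration, not the library's actual defaults):

```python
# Agent-level defaults, overridable per request
AGENT_DEFAULTS = {"temperature": 0.1, "top_p": 0.9}

def merge_options(defaults: dict, overrides: dict) -> dict:
    """Request-level options take precedence over the agent's defaults."""
    return {**defaults, **overrides}

# A single request can override one parameter while keeping the rest
options = merge_options(AGENT_DEFAULTS, {"temperature": 0.0})
```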

Specialized Agents

QueryAgent

Purpose: Convert natural language questions to PostgreSQL to_tsquery format

Key Methods:

def convert_question(self, question: str) -> str:
    """Core conversion functionality"""

def find_abstracts(self, ...) -> Generator:
    """Integrated search with database"""

Error Handling:

  • Input validation (empty questions)
  • Ollama connection errors
  • Query format validation with warnings
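
The empty-question validation can be illustrated with a small stand-alone helper (hypothetical; QueryAgent's actual checks may differ):

```python
def validate_question(question: str) -> str:
    """Reject empty or whitespace-only questions before calling the model."""
    if not isinstance(question, str) or not question.strip():
        raise ValueError("Question must be a non-empty string")
    return question.strip()
```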

DocumentScoringAgent

Purpose: Evaluate document relevance with scores and reasoning

Key Methods:

def evaluate_document(self, question: str, document: Dict) -> ScoringResult:
    """Single document evaluation"""

def batch_evaluate_documents(self, ...) -> List[Tuple]:
    """Efficient multi-document scoring"""

def get_top_documents(self, ...) -> List:
    """Ranked selection with filtering"""

Response Structure:

class ScoringResult(TypedDict):
    score: int      # 1-5 relevance score
    reasoning: str  # Explanation for the score
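
For example, batch results can be filtered by score threshold (illustrative data only; real results come from batch_evaluate_documents):

```python
from typing import List, Tuple, TypedDict

class ScoringResult(TypedDict):
    score: int      # 1-5 relevance score
    reasoning: str  # Explanation for the score

# Hypothetical scored documents for illustration
scored: List[Tuple[dict, ScoringResult]] = [
    ({"title": "Statins and LDL"}, {"score": 5, "reasoning": "directly on topic"}),
    ({"title": "Crop yields"}, {"score": 1, "reasoning": "unrelated domain"}),
]

# Keep only documents scoring above the threshold
high_scoring = [(doc, res) for doc, res in scored if res["score"] > 3]
```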

CitationFinderAgent

Purpose: Extract relevant passages and citations from documents

Key Methods:

def extract_citation(self, question: str, document: Dict) -> Citation:
    """Extract single citation"""

def process_scored_documents_for_citations(self, ...) -> List[Citation]:
    """Batch processing with progress tracking"""

ReportingAgent

Purpose: Synthesize citations into medical publication-style reports

Report Features:

  • Professional medical writing style
  • Evidence strength assessment
  • Vancouver-style reference formatting
  • Methodology notes and quality controls
  • Structured markdown output
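
The Vancouver-style formatting can be sketched as a simplified helper (the ReportingAgent's real formatter handles more fields and edge cases):

```python
from typing import List

def format_vancouver(number: int, authors: List[str], title: str,
                     journal: str, year: int) -> str:
    """Minimal Vancouver-style reference: numbered, authors, title, journal, year."""
    return f"{number}. {', '.join(authors)}. {title}. {journal}. {year}."

ref = format_vancouver(1, ["Smith J", "Lee K"],
                       "Exercise and heart health", "J Cardiol", 2021)
```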

CounterfactualAgent

Purpose: Analyze documents and generate research questions that probe for contradictory evidence

Response Structure:

class CounterfactualAnalysis(TypedDict):
    document_title: str
    main_claims: List[str]
    counterfactual_questions: List[CounterfactualQuestion]
    overall_assessment: str
    confidence_level: str
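
As an illustration, a populated analysis might look like this (the CounterfactualQuestion fields shown are assumptions, since the source only names the type):

```python
from typing import List, TypedDict

class CounterfactualQuestion(TypedDict):
    # Hypothetical fields for illustration only
    question: str
    target_claim: str

class CounterfactualAnalysis(TypedDict):
    document_title: str
    main_claims: List[str]
    counterfactual_questions: List[CounterfactualQuestion]
    overall_assessment: str
    confidence_level: str

analysis: CounterfactualAnalysis = {
    "document_title": "Exercise and cardiovascular health",
    "main_claims": ["Regular exercise lowers blood pressure"],
    "counterfactual_questions": [
        {
            "question": "Are there populations in which exercise raises cardiovascular risk?",
            "target_claim": "Regular exercise lowers blood pressure",
        }
    ],
    "overall_assessment": "Plausible claims; contradictory-evidence search warranted",
    "confidence_level": "medium",
}
```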

Integration Patterns

Database Integration

from bmlibrarian.agents import QueryAgent
from bmlibrarian.database import find_abstracts

query_agent = QueryAgent()
for doc in query_agent.find_abstracts("COVID vaccines"):
    print(doc['title'])

Combined Workflows

from bmlibrarian.agents import (
    QueryAgent,
    DocumentScoringAgent,
    CitationFinderAgent,
    ReportingAgent,
    CounterfactualAgent
)

# Instantiate each agent used in the workflow below
query_agent = QueryAgent()
scoring_agent = DocumentScoringAgent()
citation_agent = CitationFinderAgent()
reporting_agent = ReportingAgent()
counterfactual_agent = CounterfactualAgent()

question = "What are the cardiovascular benefits of exercise?"

# 1. Search for documents
documents = list(query_agent.find_abstracts(question))

# 2. Score documents for relevance
scored_docs = scoring_agent.batch_evaluate_documents(question, documents)

# 3. Extract citations from high-scoring documents
high_scoring = [(doc, result) for doc, result in scored_docs if result['score'] > 3]
citations = citation_agent.process_scored_documents_for_citations(
    user_question=question,
    scored_documents=high_scoring
)

# 4. Generate comprehensive report
report = reporting_agent.synthesize_report(question, citations)

# 5. Analyze for contradictory evidence
counterfactual = counterfactual_agent.analyze_document(
    document_content=report,
    document_title=f"Research Report: {question}"
)

Callback Integration

def progress_callback(step: str, data: str):
    print(f"[{step}] {data}")

agent = QueryAgent(callback=progress_callback)
results = agent.find_abstracts("heart disease")

Extending the Architecture

Adding New Agents

  1. Inherit from BaseAgent:
from bmlibrarian.agents.base import BaseAgent

class ResearchAgent(BaseAgent):
    def get_agent_type(self) -> str:
        return "research_agent"

    def summarize_documents(self, documents: list) -> str:
        # Implementation here
        pass
  2. Add to __init__.py:
from .research_agent import ResearchAgent
__all__.append("ResearchAgent")
  3. Write Tests:
class TestResearchAgent:
    def test_summarize_documents(self):
        # Test implementation
        pass

Testing Strategy

Running Tests

# Run all agent tests
uv run pytest tests/test_agents.py

# Run with coverage
uv run pytest tests/test_agents.py --cov=bmlibrarian.agents

# Run specific test classes
uv run pytest tests/test_query_agent.py::TestQueryAgent

Unit Testing Approach

  1. Mock Ollama Dependencies - No external dependencies for testing
  2. Comprehensive Coverage - All public methods tested
  3. Integration Testing - Optional with real database
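
The mocking approach can be sketched without the library (convert_question below is a stand-in for the real agent method, not BMLibrarian code):

```python
from unittest.mock import MagicMock

# A fake Ollama client lets agent logic run without a live server
client = MagicMock()
client.chat.return_value = {"message": {"content": "heart & disease"}}

def convert_question(client, question: str) -> str:
    """Stand-in for the core model call inside a query agent."""
    response = client.chat(model="test-model",
                           messages=[{"role": "user", "content": question}])
    return response["message"]["content"]

result = convert_question(client, "What causes heart disease?")
```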

Security and Privacy

Input Validation

  • Non-empty input requirements
  • Type checking for complex parameters
  • SQL injection prevention through parameterized queries
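
The parameterized-query pattern can be demonstrated with SQLite's in-memory database (BMLibrarian targets PostgreSQL, where the placeholder is %s rather than ?, but the binding principle is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (title TEXT)")
conn.execute("INSERT INTO documents VALUES (?)", ("COVID vaccine trial",))

# Binding the value keeps attacker input out of the SQL text entirely
malicious = "x'; DROP TABLE documents; --"
rows = conn.execute("SELECT title FROM documents WHERE title = ?",
                    (malicious,)).fetchall()

# The table is untouched and the hostile string matched nothing
count = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
```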

Local Processing

  • All AI processing happens locally through Ollama
  • No data sent to external services
  • User questions and document content remain private