# Agent Architecture

The BMLibrarian agents module provides a modular, extensible architecture for AI-powered biomedical literature work. The system is built around a common `BaseAgent` class with specialized agents for specific tasks.
## Architecture Principles

### 1. Separation of Concerns

Each agent has a single, well-defined responsibility:
| Agent | Responsibility |
|---|---|
| QueryAgent | Natural language to PostgreSQL query conversion |
| DocumentScoringAgent | Document relevance assessment (1-5 scale) |
| CitationFinderAgent | Extract relevant passages and citations |
| ReportingAgent | Synthesize citations into medical reports |
| CounterfactualAgent | Generate contradictory evidence questions |
### 2. Common Foundation

All agents inherit from `BaseAgent`, which provides:
- Ollama client management and connection handling
- Standardized error handling patterns
- Callback system for progress updates and UI integration
- Model configuration and parameter management
- Connection testing utilities
### 3. Maintainability
- Each agent class is kept under 500 lines
- Clear separation between core and specialized features
- Comprehensive unit testing with mocked dependencies
- Backward compatibility layer for existing code
## Directory Structure

```
src/bmlibrarian/agents/
├── __init__.py              # Public API exports
├── base.py                  # BaseAgent abstract class
├── query_agent.py           # Natural language query conversion
├── scoring_agent.py         # Document relevance scoring
├── citation_agent.py        # Citation extraction from documents
├── reporting_agent.py       # Report synthesis and formatting
├── counterfactual_agent.py  # Counterfactual analysis
├── queue_manager.py         # SQLite-based task queue system
└── orchestrator.py          # Multi-agent workflow coordination
```
## Base Agent Class

### Abstract Base Class Design

```python
from abc import ABC, abstractmethod
from typing import Optional, Callable

class BaseAgent(ABC):
    def __init__(
        self,
        model: str,
        host: str,
        temperature: float,
        top_p: float,
        callback: Optional[Callable]
    ):
        # Common initialization
        ...

    @abstractmethod
    def get_agent_type(self) -> str:
        """Must be implemented by subclasses"""
        pass

    # Common functionality
    def test_connection(self) -> bool: ...
    def get_available_models(self) -> list[str]: ...
    def _make_ollama_request(self, messages: list, **options) -> str: ...
    def _call_callback(self, step: str, data: str) -> None: ...
```
### Key Features
Standardized Ollama Integration:
- Consistent client initialization
- Error handling for connection issues
- Model availability checking
Callback System:
- Progress updates for UI integration
- Error-tolerant callback execution
- Step-based progress tracking
Configuration Management:
- Model parameters with sensible defaults
- Option overrides for specific requests
- Host and model configurability
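These features can be sketched as a small, self-contained example. `EchoAgent` and the default values below are hypothetical, and the Ollama client itself is omitted; the sketch shows only the subclassing contract and per-request option overrides:

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional

class BaseAgent(ABC):
    """Minimal sketch of the shared foundation (Ollama client omitted)."""

    def __init__(self, model: str = "example-model", host: str = "http://localhost:11434",
                 temperature: float = 0.1, top_p: float = 0.9,
                 callback: Optional[Callable[[str, str], None]] = None):
        self.model = model
        self.host = host
        # Sensible defaults that individual requests can override
        self.default_options = {"temperature": temperature, "top_p": top_p}
        self.callback = callback

    @abstractmethod
    def get_agent_type(self) -> str:
        """Each subclass identifies itself."""

    def _request_options(self, **overrides) -> dict:
        # Per-request overrides layered over the constructor defaults
        return {**self.default_options, **overrides}

class EchoAgent(BaseAgent):
    """Hypothetical subclass used only for illustration."""
    def get_agent_type(self) -> str:
        return "echo_agent"

agent = EchoAgent(temperature=0.2)
print(agent.get_agent_type())             # → echo_agent
print(agent._request_options(top_p=0.5))  # → {'temperature': 0.2, 'top_p': 0.5}
```

Layering request options over constructor defaults keeps agents configurable without rebuilding them for each call.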
## Specialized Agents

### QueryAgent

Purpose: Convert natural language questions to PostgreSQL `to_tsquery` format

Key Methods:

```python
def convert_question(question: str) -> str:
    """Core conversion functionality"""

def find_abstracts(...) -> Generator:
    """Integrated search with database"""
```
Error Handling:
- Input validation (empty questions)
- Ollama connection errors
- Query format validation with warnings
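For illustration, here is a naive stand-in for the conversion and validation steps. The real agent delegates conversion to the model, and `looks_like_tsquery` is a hypothetical helper, but the input-validation and format-check pattern is the same:

```python
import re

def convert_question(question: str) -> str:
    """Naive stand-in: AND the key terms together.
    The real QueryAgent asks the model to build the expression."""
    if not question or not question.strip():
        raise ValueError("Question must be a non-empty string")
    terms = re.findall(r"[A-Za-z0-9]+", question)
    return " & ".join(terms)

def looks_like_tsquery(query: str) -> bool:
    # Cheap sanity check: terms joined by tsquery AND/OR operators
    return bool(re.fullmatch(r"[A-Za-z0-9]+(\s*[&|]\s*[A-Za-z0-9]+)*", query))

print(convert_question("COVID vaccine efficacy"))        # → COVID & vaccine & efficacy
print(looks_like_tsquery("COVID & vaccine & efficacy"))  # → True
```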
### DocumentScoringAgent

Purpose: Evaluate document relevance with scores and reasoning

Key Methods:

```python
def evaluate_document(question: str, document: Dict) -> ScoringResult:
    """Single document evaluation"""

def batch_evaluate_documents(...) -> List[Tuple]:
    """Efficient multi-document scoring"""

def get_top_documents(...) -> List:
    """Ranked selection with filtering"""
```

Response Structure:

```python
class ScoringResult(TypedDict):
    score: int      # 1-5 relevance score
    reasoning: str  # Explanation for the score
```
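The selection logic behind `get_top_documents` can be sketched in isolation; the threshold, limit, and sample data below are illustrative:

```python
from typing import TypedDict

class ScoringResult(TypedDict):
    score: int
    reasoning: str

def get_top_documents(scored, min_score=3, limit=5):
    """Sketch of ranked selection: filter by threshold, sort by score descending."""
    kept = [(doc, r) for doc, r in scored if r["score"] >= min_score]
    return sorted(kept, key=lambda pair: pair[1]["score"], reverse=True)[:limit]

scored = [
    ({"title": "A"}, ScoringResult(score=2, reasoning="tangential")),
    ({"title": "B"}, ScoringResult(score=5, reasoning="directly on topic")),
    ({"title": "C"}, ScoringResult(score=4, reasoning="relevant cohort")),
]
print([doc["title"] for doc, _ in get_top_documents(scored)])  # → ['B', 'C']
```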
### CitationFinderAgent

Purpose: Extract relevant passages and citations from documents

Key Methods:

```python
def extract_citation(question: str, document: Dict) -> Citation:
    """Extract a single citation"""

def process_scored_documents_for_citations(...) -> List[Citation]:
    """Batch processing with progress tracking"""
```
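A simplified sketch of the batch loop, using keyword overlap as a stand-in for the model's passage selection; the `Citation` dict shape and sample data here are illustrative, not the real types:

```python
def extract_citation(question: str, document: dict) -> dict:
    """Keyword-overlap stand-in: the real agent asks the model to choose
    the passage that actually supports an answer to the question."""
    terms = {w.lower() for w in question.split()}
    for sentence in document["abstract"].split(". "):
        words = {w.lower().strip(".,") for w in sentence.split()}
        if terms & words:
            return {"document_id": document["id"], "passage": sentence}
    return {"document_id": document["id"], "passage": ""}

def process_scored_documents_for_citations(user_question, scored_documents, callback=None):
    """Batch loop with step-based progress reporting."""
    citations = []
    for i, (doc, _result) in enumerate(scored_documents, start=1):
        if callback:
            callback("citation", f"{i}/{len(scored_documents)}")
        citations.append(extract_citation(user_question, doc))
    return citations

progress = []
docs = [({"id": 1, "abstract": "Exercise lowers blood pressure. Samples were small."}, {"score": 4})]
citations = process_scored_documents_for_citations(
    "exercise benefits", docs, callback=lambda step, data: progress.append(data)
)
print(citations[0]["passage"])  # → Exercise lowers blood pressure
```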
### ReportingAgent
Purpose: Synthesize citations into medical publication-style reports
Report Features:
- Professional medical writing style
- Evidence strength assessment
- Vancouver-style reference formatting
- Methodology notes and quality controls
- Structured markdown output
### CounterfactualAgent

Purpose: Analyze documents and generate research questions that probe for contradictory evidence

Response Structure:

```python
class CounterfactualAnalysis(TypedDict):
    document_title: str
    main_claims: List[str]
    counterfactual_questions: List[CounterfactualQuestion]
    overall_assessment: str
    confidence_level: str
```
## Integration Patterns

### Database Integration

```python
from bmlibrarian.agents import QueryAgent

query_agent = QueryAgent()
for doc in query_agent.find_abstracts("COVID vaccines"):
    print(doc['title'])
```
### Combined Workflows

```python
from bmlibrarian.agents import (
    QueryAgent,
    DocumentScoringAgent,
    CitationFinderAgent,
    ReportingAgent,
    CounterfactualAgent
)

query_agent = QueryAgent()
scoring_agent = DocumentScoringAgent()
citation_agent = CitationFinderAgent()
reporting_agent = ReportingAgent()
counterfactual_agent = CounterfactualAgent()

question = "What are the cardiovascular benefits of exercise?"

# 1. Search for documents
documents = list(query_agent.find_abstracts(question))

# 2. Score documents for relevance
scored_docs = scoring_agent.batch_evaluate_documents(question, documents)

# 3. Extract citations from high-scoring documents
high_scoring = [(doc, result) for doc, result in scored_docs if result['score'] > 3]
citations = citation_agent.process_scored_documents_for_citations(
    user_question=question,
    scored_documents=high_scoring
)

# 4. Generate a comprehensive report
report = reporting_agent.synthesize_report(question, citations)

# 5. Analyze for contradictory evidence
counterfactual = counterfactual_agent.analyze_document(
    document_content=report,
    document_title=f"Research Report: {question}"
)
```
### Callback Integration

```python
from bmlibrarian.agents import QueryAgent

def progress_callback(step: str, data: str):
    print(f"[{step}] {data}")

agent = QueryAgent(callback=progress_callback)
results = agent.find_abstracts("heart disease")
```
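The error-tolerant side of the callback system can be sketched as a standalone function; in the real code this behavior lives in `BaseAgent._call_callback`:

```python
from typing import Callable, Optional

def call_callback(callback: Optional[Callable[[str, str], None]], step: str, data: str) -> None:
    """A failing UI hook is reported but never aborts the agent's work."""
    if callback is None:
        return
    try:
        callback(step, data)
    except Exception as exc:
        print(f"callback failed at step {step!r}: {exc}")

events = []
call_callback(lambda step, data: events.append((step, data)), "search", "querying database")

def broken_callback(step, data):
    raise RuntimeError("UI widget already disposed")

call_callback(broken_callback, "search", "still running")  # prints a warning, does not raise
print(events)  # → [('search', 'querying database')]
```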
## Extending the Architecture

### Adding New Agents

1. Inherit from `BaseAgent`:

```python
from bmlibrarian.agents.base import BaseAgent

class ResearchAgent(BaseAgent):
    def get_agent_type(self) -> str:
        return "research_agent"

    def summarize_documents(self, documents: list) -> str:
        # Implementation here
        pass
```

2. Add the new class to the exports in `__init__.py` so it becomes part of the public API.
3. Write tests that mock the Ollama dependency, following the testing strategy below.
## Testing Strategy

### Running Tests

```bash
# Run all agent tests
uv run pytest tests/test_agents.py

# Run with coverage
uv run pytest tests/test_agents.py --cov=bmlibrarian.agents

# Run a specific test class
uv run pytest tests/test_query_agent.py::TestQueryAgent
```
### Unit Testing Approach

- Mocked Ollama dependencies - no external services required for testing
- Comprehensive coverage - all public methods tested
- Integration testing - optional, against a real database
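The mocking approach can be sketched with a stand-in agent. `QueryAgent` below is a simplified illustration, not the real class; the point is that the single Ollama entry point is the seam that gets patched:

```python
import unittest
from unittest.mock import patch

class QueryAgent:
    """Simplified stand-in: the Ollama request method is the seam we mock."""
    def _make_ollama_request(self, messages: list, **options) -> str:
        raise RuntimeError("no network access in unit tests")

    def convert_question(self, question: str) -> str:
        return self._make_ollama_request([{"role": "user", "content": question}])

class TestQueryAgent(unittest.TestCase):
    def test_convert_question_returns_model_output(self):
        agent = QueryAgent()
        # Patch the request method so no Ollama server is required
        with patch.object(agent, "_make_ollama_request", return_value="heart & disease"):
            self.assertEqual(agent.convert_question("heart disease"), "heart & disease")

suite = unittest.TestLoader().loadTestsFromTestCase(TestQueryAgent)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```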
## Security and Privacy

### Input Validation
- Non-empty input requirements
- Type checking for complex parameters
- SQL injection prevention through parameterized queries
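The parameterization principle can be demonstrated with the stdlib `sqlite3` module; the production queries target PostgreSQL through its own driver, but the bound-parameter idea is the same, and the schema below is invented for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO documents (title) VALUES (?)", ("Statin trials",))

# User input is passed as a bound parameter, never interpolated into SQL text
user_input = "Statin trials'; DROP TABLE documents; --"
rows = conn.execute(
    "SELECT id, title FROM documents WHERE title = ?", (user_input,)
).fetchall()
print(rows)  # → [] : the malicious string matched nothing and executed nothing

# The table is still intact
print(conn.execute("SELECT COUNT(*) FROM documents").fetchone())  # → (1,)
```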
### Local Processing
- All AI processing happens locally through Ollama
- No data sent to external services
- User questions and document content remain private