Contributing Guide

Thank you for your interest in contributing to BMLibrarian! This guide covers everything you need to know to contribute effectively.

Development Prerequisites

Required Software

  • Python 3.12+
  • PostgreSQL 12+ with pgvector extension
  • Git for version control
  • uv package manager (recommended)
  • Ollama for local LLM inference

Environment Setup

# Clone the repository
git clone https://github.com/hherb/bmlibrarian
cd bmlibrarian

# Install dependencies with uv
uv sync

# Or with pip
pip install -e ".[dev]"

# Verify installation
uv run pytest tests/ -v

Code Standards

Style Guidelines

BMLibrarian follows strict code quality requirements:

Aspect       Standard
------       --------
Style        PEP 8 compliance
Line length  88 characters (Black default)
Type hints   Required for all function signatures
Docstrings   Google-style for public methods
Coverage     Minimum 80% (currently exceeding 98%)
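The coverage floor can be enforced automatically rather than checked by hand. A minimal sketch, assuming coverage.py reads its settings from pyproject.toml (the section and option names follow coverage.py's documented configuration format; BMLibrarian's actual config may differ):

```toml
# pyproject.toml — hypothetical fragment
[tool.coverage.report]
fail_under = 80       # fail the run if total coverage drops below 80%
show_missing = true   # list uncovered line numbers in the terminal report
```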

Type Hints

All function signatures must include type hints:

# Good
from typing import Any, Dict

def process_document(
    document: Dict[str, Any],
    threshold: float = 0.5
) -> ProcessingResult:  # ProcessingResult is defined elsewhere in the codebase
    """Process a document and return results."""
    ...

# Bad
def process_document(document, threshold=0.5):
    ...

Docstrings

Use Google-style docstrings:

from typing import Dict

def calculate_relevance(
    query: str,
    document: Dict[str, str],
    model: str = "default"
) -> float:
    """Calculate relevance score between query and document.

    Args:
        query: The search query string.
        document: Document dictionary with 'title' and 'abstract' keys.
        model: Model name to use for scoring.

    Returns:
        Relevance score between 0.0 and 1.0.

    Raises:
        ValueError: If document is missing required keys.

    Example:
        >>> score = calculate_relevance("diabetes", {"title": "...", "abstract": "..."})
        >>> print(f"Relevance: {score:.2f}")
    """
    ...

Testing

Test Categories

BMLibrarian uses three test categories:

  1. Unit Tests - Individual function testing
  2. Integration Tests - Database operations (marked with @pytest.mark.integration)
  3. CLI Tests - Command parsing validation
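Pytest warns about unknown marks unless custom markers such as `integration` are registered. A minimal sketch of that registration, assuming the settings live in pyproject.toml (BMLibrarian's actual config file may differ):

```toml
# pyproject.toml — hypothetical fragment
[tool.pytest.ini_options]
markers = [
    "integration: tests that require a live PostgreSQL database",
]
```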

Running Tests

# Run all tests
uv run pytest tests/

# Run with coverage
uv run pytest tests/ --cov=bmlibrarian --cov-report=html

# Run specific test file
uv run pytest tests/test_agents.py -v

# Run integration tests only
uv run pytest tests/ -m integration

# Skip integration tests
uv run pytest tests/ -m "not integration"

Writing Tests

import pytest
from bmlibrarian.agents import QueryAgent

class TestQueryAgent:
    """Tests for QueryAgent class."""

    def test_convert_question_basic(self):
        """Test basic question conversion."""
        agent = QueryAgent()
        result = agent.convert_question("What causes diabetes?")
        assert "diabetes" in result.lower()

    def test_convert_question_empty_raises(self):
        """Test that empty question raises ValueError."""
        agent = QueryAgent()
        with pytest.raises(ValueError, match="empty"):
            agent.convert_question("")

    @pytest.mark.integration
    def test_find_abstracts_database(self, db_connection):
        """Test database search integration."""
        agent = QueryAgent()
        results = list(agent.find_abstracts("COVID", max_results=10))
        assert len(results) <= 10
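The db_connection fixture used above is not defined in this guide. A sketch of what such a fixture could look like in conftest.py — the driver and connection string are assumptions for illustration, not BMLibrarian's actual setup:

```python
# conftest.py — hypothetical sketch of a db_connection fixture
import pytest

@pytest.fixture
def db_connection():
    """Yield a connection to the test database, closing it afterwards."""
    psycopg = pytest.importorskip("psycopg")  # skip integration tests if the driver is absent
    conn = psycopg.connect("dbname=bmlibrarian_test")
    yield conn
    conn.close()
```

Yielding (rather than returning) the connection lets the teardown after `yield` run even when a test fails.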

Quality Tools

Automated Checking

BMLibrarian uses several quality tools:

Tool    Purpose            Command
----    -------            -------
Black   Code formatting    black src/ tests/
isort   Import sorting     isort src/ tests/
mypy    Type checking      mypy src/
flake8  Linting            flake8 src/ tests/
bandit  Security scanning  bandit -r src/

Pre-commit Hooks

Install pre-commit hooks for automatic checking:

# Install pre-commit
pip install pre-commit

# Install hooks
pre-commit install

# Run manually
pre-commit run --all-files

Configuration

.pre-commit-config.yaml:

repos:
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black

  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.8.0
    hooks:
      - id: mypy
        additional_dependencies: [types-all]

Feature Development Process

1. Plan Your Feature

  • Check existing issues and discussions
  • Document requirements clearly
  • Consider backward compatibility

2. Create Feature Branch

# Update main branch
git checkout main
git pull origin main

# Create feature branch
git checkout -b feature/your-feature-name

3. Implement with Tests

  • Write tests first (TDD encouraged)
  • Implement feature code
  • Ensure all tests pass
  • Update documentation

4. Submit Pull Request

# Push branch
git push origin feature/your-feature-name

Then create a pull request on GitHub.

Pull Request Standards

Requirements

All pull requests must:

  • [ ] Pass all existing tests
  • [ ] Maintain code coverage threshold (80%+)
  • [ ] Comply with style guidelines
  • [ ] Include updated documentation
  • [ ] Have clear commit messages

PR Template

## Summary

Brief description of changes.

## Changes

- Added X functionality
- Fixed Y bug
- Updated Z documentation

## Testing

- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] Manual testing performed

## Documentation

- [ ] Docstrings updated
- [ ] User documentation updated
- [ ] Developer documentation updated

## Related Issues

Closes #123

Commit Messages

Follow the Conventional Commits format:

type(scope): description

[optional body]

[optional footer]

Types:

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation
  • style: Formatting
  • refactor: Code restructuring
  • test: Test changes
  • chore: Maintenance

Examples:

feat(agents): add batch processing to ScoringAgent

fix(migrations): handle missing checksum column

docs(api): update QueryAgent documentation
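A message that uses the optional body and footer might look like this (the wording and issue number are illustrative):

```text
feat(agents): add batch processing to ScoringAgent

Score documents in configurable batches to reduce round-trips
to the Ollama server.

Closes #123
```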

Migration System Extensions

When extending the migration system:

CLI Parser Updates

# In cli.py
import argparse

def create_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest='command')

    # Add new command
    new_cmd = subparsers.add_parser('new-command')
    new_cmd.add_argument('--option', help='Option description')

    return parser

Corresponding Tests

def test_new_command_parser():
    parser = create_parser()
    args = parser.parse_args(['new-command', '--option', 'value'])
    assert args.option == 'value'
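Parser failure paths are worth a test as well. A self-contained sketch — it redefines a minimal stand-in for create_parser so it runs on its own, and relies on argparse exiting with status 2 on unrecognized arguments:

```python
import argparse

def create_parser() -> argparse.ArgumentParser:
    # Minimal stand-in mirroring the parser shown above
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest="command")
    new_cmd = subparsers.add_parser("new-command")
    new_cmd.add_argument("--option")
    return parser

def test_unknown_option_exits() -> None:
    """argparse raises SystemExit(2) on an unrecognized argument."""
    try:
        create_parser().parse_args(["new-command", "--bogus"])
    except SystemExit as exc:
        assert exc.code == 2
    else:
        raise AssertionError("expected SystemExit for unknown option")

test_unknown_option_exits()
```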

Getting Help

Resources

  • GitHub Issues: Bug reports and feature requests
  • GitHub Discussions: Questions and ideas
  • Documentation: /doc directory

Contact

  • Create an issue for bugs
  • Start a discussion for questions
  • Tag maintainers for urgent issues

Recognition

Contributors are recognized in:

  • CONTRIBUTORS.md file
  • Release notes
  • Project documentation

Thank you for contributing to BMLibrarian!