Contributing Guide

Thank you for your interest in contributing to BMLibrarian! This guide covers everything you need to know to contribute effectively.

Development Prerequisites

Required Software

  • Python 3.12+
  • PostgreSQL 12+ with pgvector extension
  • Git for version control
  • uv package manager (recommended)
  • Ollama for local LLM inference

Environment Setup

# Clone the repository
git clone https://github.com/hherb/bmlibrarian
cd bmlibrarian

# Install dependencies with uv
uv sync

# Or with pip
pip install -e ".[dev]"

# Verify installation
uv run pytest tests/ -v

Code Standards

Style Guidelines

BMLibrarian follows strict code quality requirements:

Aspect       Standard
------       --------
Style        PEP 8 compliance
Line length  88 characters (Black default)
Type hints   Required for all function signatures
Docstrings   Google-style for public methods
Coverage     Minimum 80% (currently exceeding 98%)
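The coverage floor can be enforced automatically rather than checked by hand. A minimal sketch, assuming coverage.py reads its settings from pyproject.toml (the section and option names follow coverage.py's documented configuration format; BMLibrarian's actual config may differ):

```toml
# pyproject.toml — hypothetical fragment
[tool.coverage.report]
fail_under = 80       # fail the run if total coverage drops below 80%
show_missing = true   # list uncovered line numbers in the terminal report
```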

Type Hints

All function signatures must include type hints:

# Good
from typing import Any, Dict

def process_document(
    document: Dict[str, Any],
    threshold: float = 0.5
) -> ProcessingResult:  # ProcessingResult is defined elsewhere in the codebase
    """Process a document and return results."""
    ...

# Bad
def process_document(document, threshold=0.5):
    ...

Docstrings

Use Google-style docstrings:

from typing import Dict

def calculate_relevance(
    query: str,
    document: Dict[str, str],
    model: str = "default"
) -> float:
    """Calculate relevance score between query and document.

    Args:
        query: The search query string.
        document: Document dictionary with 'title' and 'abstract' keys.
        model: Model name to use for scoring.

    Returns:
        Relevance score between 0.0 and 1.0.

    Raises:
        ValueError: If document is missing required keys.

    Example:
        >>> score = calculate_relevance("diabetes", {"title": "...", "abstract": "..."})
        >>> print(f"Relevance: {score:.2f}")
    """
    ...

Testing

Test Categories

BMLibrarian uses three test categories:

  1. Unit Tests - Individual function testing
  2. Integration Tests - Database operations (marked with @pytest.mark.integration)
  3. CLI Tests - Command parsing validation
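Pytest warns about unknown marks unless custom markers such as `integration` are registered. A minimal sketch of that registration, assuming the settings live in pyproject.toml (BMLibrarian's actual config file may differ):

```toml
# pyproject.toml — hypothetical fragment
[tool.pytest.ini_options]
markers = [
    "integration: tests that require a live PostgreSQL database",
]
```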

Running Tests

# Run all tests
uv run pytest tests/

# Run with coverage
uv run pytest tests/ --cov=bmlibrarian --cov-report=html

# Run specific test file
uv run pytest tests/test_agents.py -v

# Run integration tests only
uv run pytest tests/ -m integration

# Skip integration tests
uv run pytest tests/ -m "not integration"

Writing Tests

import pytest
from bmlibrarian.agents import QueryAgent

class TestQueryAgent:
    """Tests for QueryAgent class."""

    def test_convert_question_basic(self):
        """Test basic question conversion."""
        agent = QueryAgent()
        result = agent.convert_question("What causes diabetes?")
        assert "diabetes" in result.lower()

    def test_convert_question_empty_raises(self):
        """Test that empty question raises ValueError."""
        agent = QueryAgent()
        with pytest.raises(ValueError, match="empty"):
            agent.convert_question("")

    @pytest.mark.integration
    def test_find_abstracts_database(self, db_connection):
        """Test database search integration."""
        agent = QueryAgent()
        results = list(agent.find_abstracts("COVID", max_results=10))
        assert len(results) <= 10
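The db_connection fixture used above is not defined in this guide. A sketch of what such a fixture could look like in conftest.py — the driver and connection string are assumptions for illustration, not BMLibrarian's actual setup:

```python
# conftest.py — hypothetical sketch of a db_connection fixture
import pytest

@pytest.fixture
def db_connection():
    """Yield a connection to the test database, closing it afterwards."""
    psycopg = pytest.importorskip("psycopg")  # skip integration tests if the driver is absent
    conn = psycopg.connect("dbname=bmlibrarian_test")
    yield conn
    conn.close()
```

Yielding (rather than returning) the connection lets the teardown after `yield` run even when a test fails.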

Quality Tools

Automated Checking

BMLibrarian uses several quality tools:

Tool    Purpose            Command
----    -------            -------
Black   Code formatting    black src/ tests/
isort   Import sorting     isort src/ tests/
mypy    Type checking      mypy src/
flake8  Linting            flake8 src/ tests/
bandit  Security scanning  bandit -r src/

Pre-commit Hooks

Install pre-commit hooks for automatic checking:

# Install pre-commit
pip install pre-commit

# Install hooks
pre-commit install

# Run manually
pre-commit run --all-files

Configuration

.pre-commit-config.yaml:

repos:
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black

  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.8.0
    hooks:
      - id: mypy
        additional_dependencies: [types-all]

Feature Development Process

1. Plan Your Feature

  • Check existing issues and discussions
  • Document requirements clearly
  • Consider backward compatibility

2. Create Feature Branch

# Update main branch
git checkout main
git pull origin main

# Create feature branch
git checkout -b feature/your-feature-name

3. Implement with Tests

  • Write tests first (TDD encouraged)
  • Implement feature code
  • Ensure all tests pass
  • Update documentation

4. Submit Pull Request

# Push branch
git push origin feature/your-feature-name

Then create a pull request on GitHub.

Pull Request Standards

Requirements

All pull requests must:

  • [ ] Pass all existing tests
  • [ ] Maintain code coverage threshold (80%+)
  • [ ] Comply with style guidelines
  • [ ] Include updated documentation
  • [ ] Have clear commit messages

PR Template

## Summary

Brief description of changes.

## Changes

- Added X functionality
- Fixed Y bug
- Updated Z documentation

## Testing

- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] Manual testing performed

## Documentation

- [ ] Docstrings updated
- [ ] User documentation updated
- [ ] Developer documentation updated

## Related Issues

Closes #123

Commit Messages

Follow the Conventional Commits format:

type(scope): description

[optional body]

[optional footer]

Types:

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation
  • style: Formatting
  • refactor: Code restructuring
  • test: Test changes
  • chore: Maintenance

Examples:

feat(agents): add batch processing to ScoringAgent

fix(migrations): handle missing checksum column

docs(api): update QueryAgent documentation
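A message that uses the optional body and footer might look like this (the wording and issue number are illustrative):

```text
feat(agents): add batch processing to ScoringAgent

Score documents in configurable batches to reduce round-trips
to the Ollama server.

Closes #123
```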

Migration System Extensions

When extending the migration system:

CLI Parser Updates

# In cli.py
import argparse

def create_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest='command')

    # Add new command
    new_cmd = subparsers.add_parser('new-command')
    new_cmd.add_argument('--option', help='Option description')

    return parser

Corresponding Tests

def test_new_command_parser():
    parser = create_parser()
    args = parser.parse_args(['new-command', '--option', 'value'])
    assert args.option == 'value'
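Parser failure paths are worth a test as well. A self-contained sketch — it redefines a minimal stand-in for create_parser so it runs on its own, and relies on argparse exiting with status 2 on unrecognized arguments:

```python
import argparse

def create_parser() -> argparse.ArgumentParser:
    # Minimal stand-in mirroring the parser shown above
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest="command")
    new_cmd = subparsers.add_parser("new-command")
    new_cmd.add_argument("--option")
    return parser

def test_unknown_option_exits() -> None:
    """argparse raises SystemExit(2) on an unrecognized argument."""
    try:
        create_parser().parse_args(["new-command", "--bogus"])
    except SystemExit as exc:
        assert exc.code == 2
    else:
        raise AssertionError("expected SystemExit for unknown option")

test_unknown_option_exits()
```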

Getting Help

Resources

  • GitHub Issues: Bug reports and feature requests
  • GitHub Discussions: Questions and ideas
  • Documentation: /doc directory

Contact

  • Create an issue for bugs
  • Start a discussion for questions
  • Tag maintainers for urgent issues

Recognition

Contributors are recognized in:

  • CONTRIBUTORS.md file
  • Release notes
  • Project documentation

Thank you for contributing to BMLibrarian!