Development guide

General guidelines

The roadmap is the single source for open tasks, priorities, and timelines. Review it before starting work or filing a pull request to avoid duplication.

Run ./bootstrap.sh before you start development. It installs dependencies, copies .env.example, and runs the linter and tests.

  • Keep preprocessing, OCR, mapping, QC, import, and export phases decoupled.
  • Prefer configuration-driven behavior and avoid hard-coded values.
  • Document new processing phases with reproducible examples.
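The first two guidelines can be illustrated with a minimal sketch. It is not the project's actual pipeline: the phase functions, the `Record` shape, and the config keys (`max_dim`, `engine`) are hypothetical names chosen for illustration. Each phase is a plain function that receives only its own slice of the configuration, so phases can be tested or replaced independently.

```python
from typing import Callable, Dict, List

# A record is a plain dict handed from phase to phase (assumed shape).
Record = Dict[str, object]
PhaseConfig = Dict[str, object]
Phase = Callable[[Record, PhaseConfig], Record]


def preprocess(record: Record, cfg: PhaseConfig) -> Record:
    # Hypothetical phase: the size limit comes from config, not from a constant.
    return {**record, "max_dim": cfg["max_dim"]}


def ocr(record: Record, cfg: PhaseConfig) -> Record:
    # Hypothetical phase: the engine name is selected by configuration.
    return {**record, "engine": cfg["engine"]}


# Registry mapping phase names to functions; phases never call each other.
PHASES: Dict[str, Phase] = {"preprocess": preprocess, "ocr": ocr}


def run_pipeline(record: Record, config: Dict[str, PhaseConfig], order: List[str]) -> Record:
    """Run the named phases in order; each phase sees only its own config."""
    for name in order:
        record = PHASES[name](record, config.get(name, {}))
    return record
```

Because the phase order and per-phase settings are passed in as data, changing the pipeline means editing configuration rather than code.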

Pair Programming with AI Agents (Encouraged)

This project promotes collaborative development between humans and AI agents. Agents should act as active programming partners, not just code generators.

AI Agent Partnership Guidelines

  • Question assumptions about problem-solving approaches
  • Balance technical implementation with practical usability
  • Regularly suggest hands-on testing with real data
  • Keep focus on end-user workflows and institutional needs
  • Identify gaps between code functionality and real-world usage
  • Propose concrete testing protocols and validation approaches
  • Create actionable human work lists for tasks requiring domain expertise

Practical Development Mindset

  1. Build → Test on Real Data → Iterate (not just build → build → build)
  2. Ask "Does this solve the actual problem?" before adding complexity
  3. Prioritize user workflows over technical elegance
  4. Document what humans need to do alongside what code can do
  5. Bridge the gap between development environment and production usage

This collaborative approach ensures technical solutions actually serve institutional and research needs.

Feature Implementation Workflow

This project uses the .specify framework for structured feature development. Follow this complete workflow for new features:

Phase 1: Specification & Clarification

# 1. Create feature specification
/specify <feature description>
# Creates feature branch, initializes spec.md with structured requirements

# 2. Resolve ambiguities
/clarify
# Interactive Q&A to clarify missing decisions and update spec

Example: /specify Add batch OCR processing with progress tracking and error recovery

Phase 2: Planning & Task Decomposition

# 3. Generate technical plan
/plan
# Creates detailed technical design in plan.md

# 4. Generate actionable tasks
/tasks
# Creates dependency-ordered tasks.md for implementation

# 5. Analyze design consistency
/analyze
# Cross-validates spec, plan, and tasks for consistency

Phase 3: Implementation & Deployment

# 6. Execute implementation
/implement
# Processes each task sequentially, writes code and tests

# 7. Deploy feature
/deploy
# Handles deployment with dependency resolution
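As a rough illustration of the ordering above, the sketch below picks the next planning command from the artifacts that already exist. It relies only on the artifact names mentioned in the steps (spec.md, plan.md, tasks.md); the helper itself is hypothetical and not part of the .specify framework, and it ignores /clarify, /analyze, /implement, and /deploy because the text names no artifact files for them.

```python
from typing import Iterable, Optional

# Commands paired with the artifact each one creates, in workflow order.
ARTIFACT_STEPS = [("/specify", "spec.md"), ("/plan", "plan.md"), ("/tasks", "tasks.md")]


def next_planning_command(existing_files: Iterable[str]) -> Optional[str]:
    """Return the first command whose artifact is missing, or None if all exist."""
    present = set(existing_files)
    for command, artifact in ARTIFACT_STEPS:
        if artifact not in present:
            return command
    return None  # planning artifacts are complete; analysis and implementation remain
```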

Workflow Benefits

  • Structured Requirements: Clear specifications prevent scope creep
  • Design Validation: Cross-artifact analysis catches inconsistencies early
  • Dependency Management: Task ordering prevents implementation blockers
  • Quality Gates: Built-in validation at each phase
  • Traceability: Requirements → Design → Implementation → Deployment

Integration with Testing

The workflow integrates with our testing standards:

  • Specifications include testable acceptance criteria
  • Plans define testing strategies
  • Implementation phase includes test creation
  • Each task completion triggers validation

For advanced patterns and coordination with the meta-project system, see the .specify/ directory documentation.

Retroactive Specifications

This project has undergone systematic retroactive specification analysis to document major features developed before formal specification processes were established. Key findings:

Major Features Analyzed:

  • Apple Vision OCR Integration (v0.3.0) - 95% accuracy breakthrough
  • Modern UI/UX System (Current) - Complete interface transformation
  • Darwin Core Archive Export (v0.2.0) - Institutional compliance system

Key Lessons Learned:

  • Research-driven development produces superior outcomes
  • Architecture decisions need explicit documentation
  • Performance requirements should be specified upfront
  • Migration strategies must be planned for breaking changes

Specification Checkpoints: Going forward, specifications are required for:

  • Features requiring >3 days development effort
  • Architecture changes affecting >2 modules
  • New external dependencies or critical libraries
  • Data format or schema changes

See .specify/retro-specs/ for complete retroactive analysis and future specification strategy.
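The specification checkpoints can be read as a simple any-of rule. The sketch below encodes them under that assumption; the `ChangeProposal` class and its field names are hypothetical and exist only to make the thresholds concrete.

```python
from dataclasses import dataclass


@dataclass
class ChangeProposal:
    """Hypothetical summary of a proposed change; field names are illustrative."""

    effort_days: float
    modules_affected: int
    adds_external_dependency: bool = False
    changes_data_format: bool = False


def requires_specification(change: ChangeProposal) -> bool:
    """Return True if any one of the four checkpoint criteria applies."""
    return (
        change.effort_days > 3
        or change.modules_affected > 2
        or change.adds_external_dependency
        or change.changes_data_format
    )
```

Note that the thresholds are strict: a change estimated at exactly 3 days touching exactly 2 modules does not trigger the requirement on those grounds alone.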

Testing and linting

Run the full test suite and linter before committing changes.

ruff check .
pytest

These checks help maintain a consistent code style and verify that new contributions do not introduce regressions.
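If you prefer a single entry point for both commands, a small wrapper along the following lines is one option. This is a sketch, not a script that ships with the project: `run_checks` and its `run` parameter are hypothetical, and the only external commands it assumes are the two shown above.

```python
import subprocess
from typing import Callable, List, Sequence

# The two commands from this section, as argument lists.
CHECKS: List[List[str]] = [["ruff", "check", "."], ["pytest"]]


def _run_subprocess(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd)).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    run: Callable[[Sequence[str]], int] = _run_subprocess,
) -> List[str]:
    """Run every check (even after a failure) and return the ones that failed."""
    failed = []
    for cmd in checks:
        if run(cmd) != 0:
            failed.append(" ".join(cmd))
    return failed
```

Calling `run_checks()` with no arguments runs the real tools; an empty result means both passed. The `run` parameter exists so the wrapper can be exercised without invoking them.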

Release Process

This project follows semantic versioning and the Keep a Changelog format for all releases.

Creating a Release

  1. Update version numbers:

    # Update version in pyproject.toml
    # Update version in CHANGELOG.md with new section
    

  2. Update CHANGELOG.md:

     • Add new version section with date: ## [X.Y.Z] - YYYY-MM-DD
     • Move items from [Unreleased] to the new version section
     • Follow Keep a Changelog format
     • Add version comparison link at bottom of file

  3. Create and push the release:

    git add .
    git commit -m "🚀 Release vX.Y.Z: Brief description"
    git tag vX.Y.Z
    git push origin main
    git push origin vX.Y.Z

  4. Update comparison links:

     • Update [Unreleased] link to compare from new version
     • Add new version comparison link
     • Example format:

    [Unreleased]: https://github.com/devvyn/aafc-herbarium-dwc-extraction-2025/compare/vX.Y.Z...HEAD
    [X.Y.Z]: https://github.com/devvyn/aafc-herbarium-dwc-extraction-2025/compare/vX.Y.W...vX.Y.Z

Critical Requirements

  • Always create git tags for releases - this enables changelog comparison links
  • Use semantic versioning: v0.1.0, v0.2.0, v1.0.0, etc.
  • Follow Keep a Changelog format - agents must maintain this structure
  • Update pyproject.toml version to match changelog and tag
  • Test the comparison links - they should work on GitHub
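The requirement that the tag, pyproject.toml, and changelog agree can be checked before pushing. The following is a minimal sketch with hypothetical function names. It reads the version with a regular expression rather than a TOML parser, which assumes the first line starting with `version = "..."` in pyproject.toml is the project version.

```python
import re
from typing import Optional

# Tags are expected in the vX.Y.Z form listed above.
SEMVER_TAG = re.compile(r"^v\d+\.\d+\.\d+$")


def pyproject_version(pyproject_text: str) -> Optional[str]:
    """Pull the first `version = "..."` assignment out of pyproject.toml text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE)
    return match.group(1) if match else None


def latest_changelog_version(changelog_text: str) -> Optional[str]:
    """Return the version of the first dated `## [X.Y.Z] - YYYY-MM-DD` heading."""
    match = re.search(
        r"^## \[(\d+\.\d+\.\d+)\] - \d{4}-\d{2}-\d{2}$", changelog_text, re.MULTILINE
    )
    return match.group(1) if match else None


def release_is_consistent(tag: str, pyproject_text: str, changelog_text: str) -> bool:
    """True when the tag is vX.Y.Z and both files name the same X.Y.Z."""
    if not SEMVER_TAG.match(tag):
        return False
    version = tag[1:]  # drop the leading "v"
    return (
        pyproject_version(pyproject_text) == version
        and latest_changelog_version(changelog_text) == version
    )
```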

Changelog Format Reference

## [Unreleased]

## [1.0.0] - 2025-01-15
### Added
- New feature descriptions

### Changed
- Modified functionality descriptions

### Fixed
- Bug fix descriptions

[Unreleased]: https://github.com/repo/compare/v1.0.0...HEAD
[1.0.0]: https://github.com/repo/compare/v0.9.0...v1.0.0
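One way to catch a forgotten comparison link is to compare the section headings against the link definitions at the bottom of the file. The helper below is a hypothetical sketch that assumes headings and links follow the layout shown in the reference above.

```python
import re
from typing import List


def missing_comparison_links(changelog_text: str) -> List[str]:
    """List section names (e.g. 'Unreleased', '1.0.0') with no matching link line."""
    # Section headings look like "## [1.0.0] - 2025-01-15" or "## [Unreleased]".
    sections = re.findall(r"^## \[([^\]]+)\]", changelog_text, re.MULTILINE)
    # Link definitions look like "[1.0.0]: https://...".
    links = set(re.findall(r"^\[([^\]]+)\]: \S+", changelog_text, re.MULTILINE))
    return [name for name in sections if name not in links]
```

An empty list means every section has a link definition. It does not confirm that the URLs resolve, so the links should still be tested on GitHub.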

[AAFC]: Agriculture and Agri-Food Canada
[GBIF]: Global Biodiversity Information Facility
[DwC]: Darwin Core
[OCR]: Optical Character Recognition
[API]: Application Programming Interface
[CSV]: Comma-Separated Values
[IPT]: Integrated Publishing Toolkit
[TDWG]: Taxonomic Databases Working Group