Development guide¶
General guidelines¶
The roadmap is the single source for open tasks, priorities, and timelines. Review it before starting work or filing a pull request to avoid duplication.
Run ./bootstrap.sh before development to install dependencies, copy .env.example, and execute linting/tests.
- Keep preprocessing, OCR, mapping, QC, import, and export phases decoupled.
- Prefer configuration-driven behavior and avoid hard-coded values.
- Document new processing phases with reproducible examples.
Pair Programming with AI Agents (Encouraged)¶
This project promotes collaborative development between humans and AI agents. Agents should act as active programming partners, not just code generators:
AI Agent Partnership Guidelines¶
- Question assumptions about problem-solving approaches
- Balance technical implementation with practical usability
- Regularly suggest hands-on testing with real data
- Keep focus on end-user workflows and institutional needs
- Identify gaps between code functionality and real-world usage
- Propose concrete testing protocols and validation approaches
- Create actionable human work lists for tasks requiring domain expertise
Practical Development Mindset¶
- Build → Test on Real Data → Iterate (not just build → build → build)
- Ask "Does this solve the actual problem?" before adding complexity
- Prioritize user workflows over technical elegance
- Document what humans need to do alongside what code can do
- Bridge the gap between development environment and production usage
This collaborative approach ensures technical solutions actually serve institutional and research needs.
Feature Implementation Workflow¶
This project uses the .specify framework for structured feature development. Follow this complete workflow for new features:
Phase 1: Specification & Clarification¶
# 1. Create feature specification
/specify <feature description>
# Creates feature branch, initializes spec.md with structured requirements
# 2. Resolve ambiguities
/clarify
# Interactive Q&A to clarify missing decisions and update spec
Example: /specify Add batch OCR processing with progress tracking and error recovery
Phase 2: Planning & Task Decomposition¶
# 3. Generate technical plan
/plan
# Creates detailed technical design in plan.md
# 4. Generate actionable tasks
/tasks
# Creates dependency-ordered tasks.md for implementation
# 5. Analyze design consistency
/analyze
# Cross-validates spec, plan, and tasks for consistency
Phase 3: Implementation & Deployment¶
# 6. Execute implementation
/implement
# Processes each task sequentially, writes code and tests
# 7. Deploy feature
/deploy
# Handles deployment with dependency resolution
Workflow Benefits¶
- Structured Requirements: Clear specifications prevent scope creep
- Design Validation: Cross-artifact analysis catches inconsistencies early
- Dependency Management: Task ordering prevents implementation blockers
- Quality Gates: Built-in validation at each phase
- Traceability: Requirements → Design → Implementation → Deployment
Integration with Testing¶
The workflow integrates with our testing standards: - Specifications include testable acceptance criteria - Plans define testing strategies - Implementation phase includes test creation - Each task completion triggers validation
For advanced patterns and coordination with the meta-project system, see the .specify/ directory documentation.
Retroactive Specifications¶
This project has undergone systematic retroactive specification analysis to document major features developed before formal specification processes were established. Key findings:
Major Features Analyzed: - Apple Vision OCR Integration (v0.3.0) - 95% accuracy breakthrough - Modern UI/UX System (Current) - Complete interface transformation - Darwin Core Archive Export (v0.2.0) - Institutional compliance system
Key Lessons Learned: - Research-driven development produces superior outcomes - Architecture decisions need explicit documentation - Performance requirements should be specified upfront - Migration strategies must be planned for breaking changes
Specification Checkpoints: Going forward, specifications are required for: - Features requiring >3 days development effort - Architecture changes affecting >2 modules - New external dependencies or critical libraries - Data format or schema changes
See .specify/retro-specs/ for complete retroactive analysis and future specification strategy.
Testing and linting¶
Run the full test suite and linter before committing changes.
These checks help maintain a consistent code style and verify that new contributions do not introduce regressions.
Release Process¶
This project follows semantic versioning and Keep a Changelog format for all releases.
Creating a Release¶
-
Update version numbers:
-
Update CHANGELOG.md:
- Add new version section with date:
## [X.Y.Z] - YYYY-MM-DD - Move items from
[Unreleased]to the new version section - Follow Keep a Changelog format
-
Add version comparison link at bottom of file
-
Create and push the release:
-
Update comparison links:
- Update
[Unreleased]link to compare from new version - Add new version comparison link
- Example format:
Critical Requirements¶
- Always create git tags for releases - this enables changelog comparison links
- Use semantic versioning: v0.1.0, v0.2.0, v1.0.0, etc.
- Follow Keep a Changelog format - agents must maintain this structure
- Update pyproject.toml version to match changelog and tag
- Test the comparison links - they should work on GitHub
Changelog Format Reference¶
## [Unreleased]
## [1.0.0] - 2025-01-15
### Added
- New feature descriptions
### Changed
- Modified functionality descriptions
### Fixed
- Bug fix descriptions
[Unreleased]: https://github.com/repo/compare/v1.0.0...HEAD
[1.0.0]: https://github.com/repo/compare/v0.9.0...v1.0.0
[AAFC]: Agriculture and Agri-Food Canada [GBIF]: Global Biodiversity Information Facility [DwC]: Darwin Core [OCR]: Optical Character Recognition [API]: Application Programming Interface [CSV]: Comma-Separated Values [IPT]: Integrated Publishing Toolkit [TDWG]: Taxonomic Databases Working Group