
GPT engine usage

Supplying API keys

Set OPENAI_API_KEY in your environment before running the toolkit. Use a local secret manager or a .env file that is excluded from version control. Avoid embedding keys in scripts or configuration.
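
For example, in a POSIX shell; the .env handling below is illustrative, and only the OPENAI_API_KEY variable name comes from the toolkit:

export OPENAI_API_KEY="sk-..."       # current shell session only
# or keep the key in an untracked .env file:
echo 'OPENAI_API_KEY=sk-...' > .env
echo '.env' >> .gitignore            # keep the file out of version control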

Customising prompts

Prompt templates live under ../config/prompts. Modify these files or point the configuration to another directory via the gpt.prompt_dir setting to adjust system behaviour. Each task uses separate files for different roles (a sketch of a typical layout follows the list):

  • *.system.prompt sets global behaviour and constraints.
  • *.user.prompt contains the request that is sent with runtime input.
  • *.assistant.prompt (optional) can seed an example reply.
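
As a sketch, a task called example_task (the task name is illustrative; only the three suffixes above are defined by the toolkit) would provide:

config/prompts/example_task.system.prompt     # global behaviour and constraints
config/prompts/example_task.user.prompt       # request combined with runtime input
config/prompts/example_task.assistant.prompt  # optional seeded example reply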

Configuration options

The [gpt] section of ../config/config.default.toml controls the model, fallback behaviour, prompt directory, and dry-run mode for offline testing.
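
A minimal sketch of that section, with illustrative key and model names; apart from prompt_dir (the gpt.prompt_dir setting described above), the authoritative key names are those in config.default.toml:

[gpt]
model = "gpt-4o"                 # primary model (value illustrative)
fallback_model = "gpt-4o-mini"   # fallback behaviour (key name illustrative)
prompt_dir = "config/prompts"    # corresponds to gpt.prompt_dir
dry_run = false                  # enable for offline testing (key name illustrative)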

Language hints

Specify supported languages in [ocr].langs. These values are forwarded to GPT as a system message to steer recognition. Omit the setting to allow automatic language detection.
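
For example (the language codes are illustrative; the langs key under [ocr] is the setting described above):

[ocr]
langs = ["en", "fr", "la"]   # forwarded to GPT as a system-message hint
# Omit langs entirely to allow automatic language detection.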

Testing

Unit tests in ../tests/unit/test_gpt_prompts.py load fixture templates from ../tests/resources/gpt_prompts to ensure custom prompt directories and legacy *.prompt files are honoured. Run pytest to validate these behaviours whenever prompts change.
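
For example, from the repository root:

pytest tests/unit/test_gpt_prompts.py   # exercises custom prompt directories and legacy *.prompt fixtures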

Validate that all prompt templates expose required placeholders with:

pytest tests/unit/test_prompt_coverage.py
# or
python review_tui.py --check-prompts

Use the standalone harness to print missing placeholders:

python scripts/prompt_coverage.py
