Design & Implementation

Coord2Region is built on a specific set of architectural choices designed to ensure reproducibility, modularity, and interface parity.

This page summarizes the design philosophy and the technical stack that powers it.

Core Principles

The development of Coord2Region is guided by three recurring themes.

1. One Workflow, Many Surfaces

Whether you use the CLI, the Python API, or the Web Builder, you are interacting with the exact same logic.

  • Design: A shared configuration schema (YAML) drives all interfaces.

  • Implementation: The coord2region.pipeline.run_pipeline() function serves as the single entry point. The CLI is simply a wrapper around this function, and the Web Builder generates the configuration dict that feeds it. This prevents “drift” between interactive notebooks and production scripts.
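
The single-entry-point pattern can be sketched as follows. This is a toy illustration only: the run_pipeline() defined here is a hypothetical stand-in, not the real coord2region.pipeline.run_pipeline() signature, and the config keys ("atlas", "coordinates") and CLI flags are assumptions made for the example.

```python
import argparse
import json


def run_pipeline(config: dict) -> dict:
    """Hypothetical stand-in for the shared entry point (not the real signature)."""
    coords = config["coordinates"]
    atlas = config.get("atlas", "toy-atlas")
    return {"atlas": atlas, "n_coordinates": len(coords)}


def cli_main(argv: list[str]) -> dict:
    """Thin CLI wrapper: parse flags, build the config dict, delegate."""
    parser = argparse.ArgumentParser(prog="toy-c2r")
    parser.add_argument("--atlas", default="toy-atlas")
    parser.add_argument("--coords", required=True, help="JSON list of [x, y, z]")
    args = parser.parse_args(argv)
    config = {"atlas": args.atlas, "coordinates": json.loads(args.coords)}
    return run_pipeline(config)


# The same config reaches the same function from either surface.
config = {"atlas": "aal", "coordinates": [[30, -22, 50]]}
api_result = run_pipeline(config)
cli_result = cli_main(["--atlas", "aal", "--coords", "[[30, -22, 50]]"])
```

Because the wrapper only translates arguments into a dict, any behaviour change has to happen inside the shared function, which is what keeps the interfaces from drifting apart.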

2. Reproducibility First

An analysis is only useful if it can be verified. Coord2Region treats provenance as a first-class citizen.

  • Design: Every run must produce self-describing artefacts.

  • Implementation: By default, the pipeline creates a coord2region-output/ directory containing not just the results (CSV/JSON), but also a snapshot of the configuration YAML, software versions, and atlas hashes used during execution.
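
A provenance snapshot of this kind might look like the sketch below. It is not the library's actual writer: the function names, the provenance.json file name, and the record layout are all hypothetical, and only the Python version is recorded where a real run would list package versions as well.

```python
import hashlib
import json
import platform
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_provenance(out_dir: Path, config: dict, atlas_files: list[Path]) -> Path:
    """Write the config, software versions, and atlas hashes next to the results."""
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "config": config,
        "versions": {"python": platform.python_version()},
        "atlas_hashes": {p.name: file_sha256(p) for p in atlas_files},
    }
    target = out_dir / "provenance.json"  # hypothetical file name
    target.write_text(json.dumps(record, indent=2, sort_keys=True))
    return target


with tempfile.TemporaryDirectory() as tmp:
    atlas = Path(tmp) / "toy_atlas.bin"
    atlas.write_bytes(b"toy atlas bytes")  # stand-in for a real atlas file
    out = Path(tmp) / "coord2region-output"
    saved = json.loads(write_provenance(out, {"atlas": "toy"}, [atlas]).read_text())
```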

3. Composable Integrations

We believe in standing on the shoulders of giants rather than reinventing wheels.

  • Design: Use specialized libraries for specific domains and loosely couple AI providers.

  • Implementation:

    • Anatomy: Uses Nilearn and MNE-Python for handling NIfTI and Surface data.

    • Literature: Uses NiMARE for meta-analytic database interaction.

    • AI: Uses a plugin-based provider system, allowing you to swap backends (e.g., switching from OpenAI to Gemini) without changing your code.

Technical Stack

Coord2Region sits in the orchestration layer of the scientific Python ecosystem: it coordinates the specialized libraries described below rather than reimplementing them.

Atlases & Mappers

The coord2region.fetching module manages the downloading and caching of atlas data. It abstracts away the differences between:

  • Volumetric Data (Nilearn): Handled via nilearn.datasets and nibabel.

  • Surface Data (MNE): Handled via mne.read_labels_from_annot.

The coord2region.coord2region.AtlasMapper class wraps these arrays and provides a uniform .map(x, y, z) method regardless of the underlying data format.
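
To illustrate what a uniform lookup over two data formats can look like, here is a toy sketch. It does not reproduce AtlasMapper: ToyVolumeMapper and ToySurfaceMapper are hypothetical classes, the voxel size, origin, labels, and vertices are made-up numbers, and the volumetric case assumes a simple diagonal affine (scale plus offset) where a real atlas would use a full 4x4 matrix.

```python
import math
from typing import Dict, List, Protocol, Tuple

Point = Tuple[float, float, float]


class Mapper(Protocol):
    """Shared contract: any backend answers map(x, y, z) with a region name."""
    def map(self, x: float, y: float, z: float) -> str: ...


class ToyVolumeMapper:
    """Voxel-grid lookup: world coordinate -> voxel index -> label."""

    def __init__(self, labels: Dict[Tuple[int, int, int], str],
                 voxel_size: float, origin: Point) -> None:
        self.labels = labels
        self.voxel_size = voxel_size
        self.origin = origin

    def map(self, x: float, y: float, z: float) -> str:
        index = tuple(round((c - o) / self.voxel_size)
                      for c, o in zip((x, y, z), self.origin))
        return self.labels.get(index, "Unknown")


class ToySurfaceMapper:
    """Surface lookup: label of the nearest vertex by Euclidean distance."""

    def __init__(self, vertices: List[Tuple[Point, str]]) -> None:
        self.vertices = vertices

    def map(self, x: float, y: float, z: float) -> str:
        _, label = min(self.vertices,
                       key=lambda item: math.dist(item[0], (x, y, z)))
        return label


mappers: List[Mapper] = [
    ToyVolumeMapper({(60, 52, 61): "Precentral Gyrus"}, 2.0, (-90.0, -126.0, -72.0)),
    ToySurfaceMapper([((30.0, -20.0, 52.0), "precentral"),
                      ((-40.0, 10.0, 5.0), "insula")]),
]
# The caller issues the same call regardless of the backing format.
regions = [m.map(30.0, -22.0, 50.0) for m in mappers]
```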

Provider Interface

To support the “Composable Integrations” philosophy, all external services implement a shared interface.

  • Literature: The pipeline speaks to a generic “StudyFetcher” contract. Currently, this wraps NiMARE for querying Neurosynth and NeuroQuery datasets.

  • Generative AI: The coord2region.providers.ai module normalizes inputs and outputs across SDKs. Whether you use google-genai, openai, or huggingface_hub, the pipeline receives a standardized text summary or image object.
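
Output normalization of this sort is typically done with small adapter functions that convert each backend's response shape into one common type. The sketch below assumes two invented payload shapes and a hypothetical Summary dataclass; none of these names come from coord2region.providers.ai, and no SDK is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Hypothetical standardized result handed back to the pipeline."""
    provider: str
    text: str


def from_chat_style(payload: dict) -> Summary:
    # Assumed shape: {"choices": [{"message": {"content": "..."}}]}
    content = payload["choices"][0]["message"]["content"]
    return Summary(provider="chat-style", text=content.strip())


def from_plain_style(payload: dict) -> Summary:
    # Assumed shape: {"text": "..."}
    return Summary(provider="plain-style", text=payload["text"].strip())


a = from_chat_style({"choices": [{"message": {"content": " Motor area. "}}]})
b = from_plain_style({"text": "Motor area.\n"})
```

Downstream code reads only Summary.text, so it does not need to know which backend produced the result.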

Configuration System

Configuration is resolved through a cascade in which each later source overrides the earlier ones, to support both local development and CI/CD pipelines:

  1. Defaults: Hardcoded sane defaults in the library.

  2. YAML Config: A coord2region-config.yaml file (generated by scripts/configure_coord2region.py).

  3. Environment Variables: OPENAI_API_KEY, NILEARN_DATA, etc. (Highest priority).

This allows you to commit a template YAML file to your repository while injecting sensitive API keys via environment variables at runtime.
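
The precedence order above can be sketched as a layered merge. This is an illustration under assumptions, not the library's loader: resolve_config, the DEFAULTS values, and the mapping from environment variable names to config keys are hypothetical, and the YAML layer is passed in as an already-parsed dict to keep the example dependency-free.

```python
import os
from typing import Mapping, Optional

# Hypothetical defaults and env-var-to-key mapping, for illustration only.
DEFAULTS = {"atlas": "toy-atlas", "data_dir": "~/toy_data", "openai_api_key": None}
ENV_MAP = {"OPENAI_API_KEY": "openai_api_key", "NILEARN_DATA": "data_dir"}


def resolve_config(yaml_values: Mapping[str, object],
                   environ: Optional[Mapping[str, str]] = None) -> dict:
    """Merge defaults < YAML values < environment variables."""
    environ = os.environ if environ is None else environ
    merged = dict(DEFAULTS)        # 1. library defaults
    merged.update(yaml_values)     # 2. values parsed from the YAML file
    for env_name, key in ENV_MAP.items():
        if env_name in environ:    # 3. environment variables win
            merged[key] = environ[env_name]
    return merged


resolved = resolve_config(
    {"atlas": "aal", "openai_api_key": "placeholder-from-yaml"},
    environ={"OPENAI_API_KEY": "sk-test"},
)
```

Here the committed YAML can carry a placeholder key while the real value arrives from the environment at runtime.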