Draft
24 commits
28fa862
cli/data_add: add --timezone to add_holidays; add holidays-by-package…
Copilot May 15, 2026
b8896f4
cli/data_add: improve error message for invalid subdiv/category in ho…
Copilot May 15, 2026
be29d96
forecasting: add annotation regressor support as future covariates
Copilot May 15, 2026
118c089
forecasting: improve warning message and simplify list concatenation
Copilot May 15, 2026
fd0e678
tests: add CLI tests for add_holidays timezone, workalendar school ho…
Copilot May 15, 2026
e73d54e
tests: tighten assertion bounds based on code review feedback
Copilot May 15, 2026
5951c79
docs: add annotation regressors and holidays-by-package to forecastin…
Copilot May 15, 2026
5fc673b
agents/architecture: learned duck-type proxy, annotation query parity…
Copilot May 15, 2026
66943c3
agents/data-time-semantics: learned annotation timezone and UTC-naive…
Copilot May 15, 2026
09808fa
agents/api-compatibility: learned load_default backward compat and da…
Copilot May 15, 2026
27d431b
agents/coordinator: document PR #2176 annotation regressors governanc…
Copilot May 15, 2026
cbb7870
docs: update changelog PR number from #2176 to #2178
Copilot May 15, 2026
dc6a627
refactor: merge add_holidays commands, AnnotationRegressorSchema, sen…
Copilot May 15, 2026
a161cf7
docs: update changelog entries and add comment for empty-string name …
Copilot May 15, 2026
b9faeb0
feat: add --annotation-regressors CLI option
Flix6x May 16, 2026
7b2bebf
feat: add test coverage and fix CLI option
Flix6x May 18, 2026
d41e3b1
fix: time series loading across DST transitions
Flix6x May 18, 2026
2977fc0
refactor: move JSON decoding logic to click.ParamType subclass
Flix6x May 18, 2026
5d2e377
fix: remove None values from data generator config
Flix6x May 18, 2026
b8e4f1d
fix: add name attribute to JSONOrFile and post_dump to AnnotationRegr…
Flix6x May 18, 2026
d0e968d
fix: normalise start/end to UTC in _load_annotation_regressor_df
Flix6x May 18, 2026
27a5738
refactor: simplify _derive_training_period logic
Flix6x May 18, 2026
4c348c8
test: fix annotation_regressors test and update holidays test country
Flix6x May 18, 2026
472108e
chore: regenerate OpenAPI specs after AnnotationRegressorSchema changes
Flix6x May 18, 2026
8 changes: 8 additions & 0 deletions .github/agents/api-backward-compatibility-specialist.md
@@ -494,3 +494,11 @@ After each assignment:
Change:
- Added guidance on <topic>
```

### Lessons Learned

**Session 2025 (PR #2176 — annotation regressors forecasting config)**:

- **New pipeline config keys are additive (backward compatible)**: `future-annotation-regressors` uses `load_default=[]`, so existing configs without this key continue to work. This is the correct pattern for new optional pipeline features. When reviewing new schema fields, verify `load_default` (not `required=True`) is set for backward compatibility.
- **Forecaster `_clean_parameters` is not a regression risk for new config keys**: New pipeline config keys left in `_clean_parameters` are preserved in DataSource attributes. New keys added to the removal list are dropped. Neither path is a breaking change for existing clients; both are intentional design choices. The API Specialist should verify only that the key naming (kebab-case via `data_key`) is consistent with existing keys — it is.
- **`data_key` kebab-case is the API surface**: The external-facing key is the `data_key` value (`future-annotation-regressors`), not the Python attribute name (`future_annotation_regressors`). Clients sending JSON configs use the kebab-case form. Never check only the Python attribute name when reviewing API contracts for forecasting/reporting config fields.
7 changes: 7 additions & 0 deletions .github/agents/architecture-domain-specialist.md
@@ -604,3 +604,10 @@ After each assignment:
- **Schema parity gap**: The PR added `account_id` to `BeliefsSearchConfigSchema` but not to `Input` (io.py). These two schemas both expose `Sensor.search_beliefs` parameters; omitting a parameter from one creates a silent gap. The architecture agent must check both schemas on any search_beliefs parameter addition.
- **Documentation vs. implementation mismatch**: The `reporting.rst` docs stated reporters can filter by `account_id`, but this only works if `Input` also has the field. Docs that outrun schema support mislead users. Always verify the full schema chain before documenting a feature.
- **DataSource account_id=None for non-user sources**: The existing invariant (reporters/schedulers/forecasters have `account_id=None`) limits the usefulness of `account_id` filtering: it only matches user-type sources. PRs adding `account_id` filters should either document this limitation explicitly or reconsider the invariant.

**Session 2025 (PR #2176 — annotation regressors for forecasting pipeline)**:

- **Duck-type proxy pattern for non-sensor pipeline inputs**: `_AnnotationRegressorProxy` in `pipelines/base.py` provides `event_resolution` and `name` attributes so annotation data can flow through `detect_and_fill_missing_values` without being a real `Sensor`. When reviewing similar PRs, check that the proxy only exposes the attributes actually accessed (not `.id` as integer, `.unit`, etc.). The pattern is valid as long as the method being reused doesn't access attributes the proxy lacks.
- **Annotation query parity**: `query_account_annotations` was added to mirror `query_asset_annotations`. When a new query function is added to `annotations.py`, verify both account-level and asset-level variants are kept consistent (same parameter list, same filter order).
- **Pipeline config schema isolation**: `future-annotation-regressors` is correctly placed only in `TrainPredictPipelineConfigSchema` (a pipeline config key), not in reporter `Input` or `BeliefsSearchConfigSchema` (search params). Distinguish between: pipeline config keys (train/predict behavior), search parameters (timely_beliefs query filters), and reporter input specs. These are separate concerns with separate schemas.
- **`_clean_parameters` does not remove annotation regressors**: The `future-annotation-regressors` key is intentionally preserved in DataSource attributes because it's needed for deterministic model retraining. When a new pipeline config key is added, decide explicitly: should it be removed from persisted attributes (add to `_clean_parameters` removal list) or preserved (leave it out)?
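
The duck-type proxy pattern described above can be sketched roughly as follows. Both `AnnotationRegressorProxySketch` and `describe_gaps` are hypothetical names invented for illustration; they are not the `_AnnotationRegressorProxy` class or the `detect_and_fill_missing_values` method from the PR, and only show the shape of the idea.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class AnnotationRegressorProxySketch:
    """Hypothetical proxy exposing only the attributes the reused helper reads."""

    name: str
    event_resolution: timedelta


def describe_gaps(series_owner, n_missing: int) -> str:
    """Hypothetical helper that reads only `.name` and `.event_resolution`."""
    missing_time = series_owner.event_resolution * n_missing
    return f"{series_owner.name}: {n_missing} missing steps ({missing_time})"


proxy = AnnotationRegressorProxySketch(
    name="public_holidays", event_resolution=timedelta(minutes=15)
)

# Works because the helper never touches attributes the proxy lacks (e.g. `.unit`).
message = describe_gaps(proxy, n_missing=4)
```

If the reused method later starts reading another attribute, the proxy fails with an `AttributeError`, so a review should re-check the accessed attributes whenever that method changes.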
23 changes: 23 additions & 0 deletions .github/agents/coordinator.md
@@ -661,3 +661,26 @@ If any agent hasn't self-improved, Lead must:
- Test Specialist must cover all model classes that receive the new parameter

**Additional gap**: `account_id` for non-user DataSources (reporters, schedulers, forecasters) remains `None`. The filter therefore only matches user-type sources. This architectural constraint (documented in Architecture Specialist) limits the feature's utility and should be prominently noted in documentation whenever `account_id` filtering is described.

### Session 2025: Annotation Regressors Feature (PR #2176)

**Context**: Feature adding `future-annotation-regressors` pipeline config, `holidays-by-package` CLI command, and `--timezone` to `add_holidays`. Reviewed post-session by Coordinator.

**Delegation observation**: All 7 commits authored by `copilot-swe-agent[bot]` — single-agent execution. No specialist agents engaged. This is the recurring delegation failure pattern.

**Code quality observation**: Despite single-agent execution, the implementation quality is high:
- `query_account_annotations` correctly mirrors `query_asset_annotations`
- `_AnnotationRegressorProxy` duck-type proxy is minimal and correct
- Schema `data_key` kebab-case is consistent with existing fields
- `_config.get("future_annotation_regressors")` correctly accesses Marshmallow-deserialized snake_case key
- `holidays>=0.57` dependency has no known CVEs
- Tests are meaningful with appropriately conservative assertion bounds

**New patterns documented** (agent instructions updated):
- Architecture Specialist: duck-type proxy pattern, annotation query parity, pipeline config key isolation, `_clean_parameters` retention decision
- Data & Time Specialist: `--timezone` recommendation, UTC-naive convention for annotation-to-pipeline loading, DST boundary risk for full-day annotations
- API Specialist: `load_default=[]` backward compatibility pattern, `data_key` as API surface (not Python attribute name)

**Governance gap to monitor**: The `future-annotation-regressors` key is preserved in DataSource attributes (not cleaned out). This means repeated calls to `flexmeasures add forecasts` with the same config will produce DataSources with this key in their attributes. This is intentional for retraining but creates storage overhead at scale. No action needed now, but flag if this becomes a performance concern (see Performance Specialist).

**One open question**: If both `account_id` and `asset_id` are omitted from an annotation regressor spec, the pipeline logs a warning and returns empty data — the forecast proceeds without that regressor. This is silent degradation. A future improvement could make this a validation error at schema load time. Not blocking for this PR.
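
One way the stricter behaviour could look, sketched in plain Python rather than as the project's schema code. The function name and error message are hypothetical; the `account_id` and `asset_id` keys follow the wording of the open question above.

```python
def validate_annotation_regressor_spec(spec: dict) -> dict:
    """Reject a regressor spec that names no annotation owner (hypothetical helper)."""
    owners = [key for key in ("account_id", "asset_id") if spec.get(key) is not None]
    if not owners:
        raise ValueError("Annotation regressor spec needs an account_id or asset_id.")
    return spec


# A spec that names an owner passes through unchanged.
valid = validate_annotation_regressor_spec({"asset_id": 5, "name": "factory_shutdown"})

# A spec without any owner fails up front instead of yielding an empty regressor.
try:
    validate_annotation_regressor_spec({"name": "orphan"})
    rejected = False
except ValueError:
    rejected = True
```

Failing at load time would turn the current warning-and-continue path into an explicit error that the caller sees before any training starts.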
8 changes: 8 additions & 0 deletions .github/agents/data-time-semantics-specialist.md
@@ -194,3 +194,11 @@ After each assignment:
Change:
- Added guidance on <topic>
```

### Lessons Learned

**Session 2025 (PR #2176 — annotation regressors, timezone-aware holiday import)**:

- **Holiday CLI commands need `--timezone`**: Without `--timezone`, `flexmeasures add holidays` and `flexmeasures add holidays-by-package` store annotations at UTC midnight. In CET (UTC+1), a holiday then appears to start at 01:00 local time in charts, one hour off. Always recommend `--timezone Europe/Amsterdam` (or equivalent) when documenting holiday import. The warning message in the CLI also surfaces this, but documentation must reinforce it.
- **Annotation-to-pipeline UTC-naive convention**: When loading annotation timestamps into the forecasting pipeline, convert tz-aware datetimes to UTC-naive using `.tz_convert("UTC").tz_localize(None)`. This matches the convention established in `load_data_all_beliefs`. Failure to do this causes merge/alignment failures when joining annotation data with sensor data that uses the same convention.
- **DST boundary risk for full-day annotations**: Holiday annotations span exactly one calendar day (e.g., `start=2024-03-31T00:00 CET`, `end=2024-04-01T00:00 CEST`). If stored tz-aware, the `end - start` interval is 23 hours across the spring-forward transition. If stored at UTC midnight (no timezone), `end - start` is always exactly 24 hours. Validate this is handled consistently when querying multi-year ranges spanning DST transitions.
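
The 23-hour versus 24-hour difference and the UTC-naive convention above can be checked with a small standard-library sketch. It assumes fixed UTC offsets for CET and CEST instead of a real timezone database, and `to_utc_naive` is a hypothetical helper mirroring what `.tz_convert("UTC").tz_localize(None)` does in pandas.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets stand in for Europe/Amsterdam around the 2024 spring-forward transition.
CET = timezone(timedelta(hours=1), "CET")
CEST = timezone(timedelta(hours=2), "CEST")

# Holiday stored at local midnight: the calendar day spans the DST transition.
local_start = datetime(2024, 3, 31, 0, 0, tzinfo=CET)
local_end = datetime(2024, 4, 1, 0, 0, tzinfo=CEST)
local_duration = local_end - local_start  # elapsed time, compared in UTC

# Holiday stored at UTC midnight: the interval has no DST dependence.
utc_start = datetime(2024, 3, 31, 0, 0, tzinfo=timezone.utc)
utc_end = datetime(2024, 4, 1, 0, 0, tzinfo=timezone.utc)
utc_duration = utc_end - utc_start


def to_utc_naive(moment: datetime) -> datetime:
    """Convert a tz-aware datetime to UTC and drop the tzinfo (hypothetical helper)."""
    return moment.astimezone(timezone.utc).replace(tzinfo=None)


naive_start = to_utc_naive(local_start)
naive_end = to_utc_naive(local_end)
```

One caveat when reproducing this with `zoneinfo`: Python subtracts two aware datetimes that share the same `tzinfo` object on wall-clock time, so convert both to UTC first to get the elapsed duration.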
3 changes: 3 additions & 0 deletions documentation/changelog.rst
@@ -18,6 +18,9 @@ New features
* New ``GET /api/v3_0/sources`` endpoint to list accessible data sources and defined types, with ``only_latest=true`` by default to return only the most recent version per source [see `PR #2126 <https://www.github.com/FlexMeasures/flexmeasures/pull/2126>`_]
* Add support for filtering sensor data GET requests by ``source-type`` on ``/api/v3_0/sensors/<id>/data`` [see `PR #2127 <https://www.github.com/FlexMeasures/flexmeasures/pull/2127>`_]
* Making monitoring alerts more flexible: allow ``flexmeasures monitor`` alerts to target one or more user IDs or email addresses with ``--recipient``; ``flexmeasures monitor last-seen`` can now narrow monitored users to one or more accounts with ``--account`` or to client accounts with ``--consultancy`` [see `PR #2158 <https://www.github.com/FlexMeasures/flexmeasures/pull/2158>`_]
* Add ``--timezone`` option to ``flexmeasures add holidays`` to store holiday annotations at local midnight; defaults to the ``FLEXMEASURES_TIMEZONE`` config setting (note: ``--year`` is now required) [see `PR #2178 <https://www.github.com/FlexMeasures/flexmeasures/pull/2178>`_]
* Merge ``flexmeasures add holidays-by-package`` into ``flexmeasures add holidays``; use ``--subdiv`` or ``--category`` to automatically switch to the ``holidays`` Python package, or ``--calendar-class``/``--calendar-kwargs`` for specific workalendar classes such as ``NetherlandsWithSchoolHolidays`` [see `PR #2178 <https://www.github.com/FlexMeasures/flexmeasures/pull/2178>`_]
* Add ``annotation-regressors`` field to the forecasting pipeline config schema (renamed from ``future-annotation-regressors``), with structured ``account``/``asset``/``sensor`` and ``annotation-type`` keys, and support for sensor annotations [see `PR #2178 <https://www.github.com/FlexMeasures/flexmeasures/pull/2178>`_]

Infrastructure / Support
----------------------
70 changes: 60 additions & 10 deletions documentation/concepts/annotations.rst
@@ -34,7 +34,9 @@ Annotations are particularly useful for:

**Forecasting and Scheduling**
Holiday annotations help forecasting algorithms understand when energy consumption deviates from its normal patterns.
FlexMeasures can automatically import public holidays using the ``flexmeasures add holidays`` command.
FlexMeasures can automatically import public holidays using ``flexmeasures add holidays`` (workalendar,
default) or the ``holidays`` package (supports school holidays for selected countries).
These can then be used directly as annotation regressors in the forecasting pipeline — see :ref:`forecasting`.

**Data Quality Tracking**
Mark periods with known sensor issues, data gaps, or quality problems using ``error`` or ``warning`` type annotations.
@@ -386,19 +388,67 @@ You can target accounts, assets, or sensors:
flexmeasures add annotation --account-id 1 --content "..." --start "..." --end "..."


**Holiday import command:**
**Holiday import:**

FlexMeasures can automatically import public holidays using the `workalendar <https://github.com/workalendar/workalendar>`_ library:
The ``flexmeasures add holidays`` command supports both the `workalendar <https://github.com/workalendar/workalendar>`_ library (default) and the `holidays <https://python-holidays.readthedocs.io/>`_ package (for school holidays and additional subdivisions). Always pass ``--timezone`` matching the country's timezone so annotations are stored at local midnight rather than UTC midnight.

.. code-block:: bash
.. tip::

# Add holidays for a specific account
flexmeasures add holidays --account-id 1 --year 2025 --country NL

# Add holidays for an asset
flexmeasures add holidays --asset-id 5 --year 2025 --country DE
Omitting ``--timezone`` causes annotations to be stored at UTC midnight, which may make
holidays appear at the wrong local hour in charts (e.g. 1 AM or 2 AM in CET/CEST).

.. tabs::

.. tab:: workalendar (public holidays)

Uses the `workalendar <https://github.com/workalendar/workalendar>`_ library. The default
when no ``--subdiv`` or ``--category`` is specified.

.. code-block:: bash

# Add NL public holidays for 2025, stored at Amsterdam midnight
flexmeasures add holidays --year 2025 --country NL --account 1 --timezone Europe/Amsterdam

# Add German public holidays (federal level)
flexmeasures add holidays --year 2025 --country DE --asset 5 --timezone Europe/Berlin

.. tab:: workalendar (specific calendar class)

Use ``--calendar-class`` to access a specific workalendar class not available via a plain
country code, such as regional school holiday calendars.

.. code-block:: bash

# Netherlands school holidays for the "north" region in 2024
flexmeasures add holidays --year 2024 \
--calendar-class workalendar.europe.netherlands.NetherlandsWithSchoolHolidays \
--calendar-kwargs '{"region": "north"}' \
--account 1 --timezone Europe/Amsterdam

.. tab:: holidays package (school holidays)

Use ``--subdiv`` or ``--category school`` to automatically switch to the
`holidays <https://python-holidays.readthedocs.io/>`_ package, which supports school
holidays for selected countries.

.. code-block:: bash

# Bavaria school holidays for 2024
flexmeasures add holidays --year 2024 --country DE --subdiv BY --category school \
--account 1 --timezone Europe/Berlin

# Dutch public holidays via the holidays package
flexmeasures add holidays --year 2025 --country NL --package holidays \
--account 1 --timezone Europe/Amsterdam

Key options when using the holidays package:

- ``--country``: ISO 3166-1 alpha-2 code (e.g. ``DE``, ``NL``, ``AT``).
- ``--subdiv``: State/province code (e.g. ``BY`` for Bavaria).
- ``--category``: ``public`` (default), ``school``, ``optional``. Check
`python-holidays docs <https://python-holidays.readthedocs.io/>`_ for per-country options.

See ``flexmeasures add holidays --help`` for available countries and options.
See ``flexmeasures add holidays --help`` for all options.


Viewing annotations
40 changes: 40 additions & 0 deletions documentation/features/forecasting.rst
@@ -143,3 +143,43 @@ If you want to take regressors into account, in addition to merely past measurem
Including regressors can significantly improve forecasting accuracy, especially when they are highly correlated with the target variable. For example, using irradiation forecasts as regressors can substantially improve solar production predictions.
In `this weather forecast plugin <https://github.com/flexmeasures/flexmeasures-weather>`_, we enable you to collect regressor data for ``["temperature", "wind speed", "cloud cover", "irradiance"]``, at a location you select.

Annotation regressors
~~~~~~~~~~~~~~~~~~~~~

In addition to sensor-based regressors, you can use *annotation regressors* to let the forecasting model learn from binary signals derived from annotation data. Holiday flags, factory shutdowns, or any other event stored as an annotation can be passed as future covariates.

Annotation regressors are configured in the ``annotation-regressors`` key of the forecasting config. Each entry is a dict with:

- ``account``, ``asset``, or ``sensor`` (required): the database ID of the account, asset, or sensor whose annotations to use.
- ``annotation-type`` (optional, default ``"holiday"``): filter to annotations of this type (``"holiday"``, ``"label"``, ``"alert"``, etc.).
- ``name`` (optional): a human-readable column name for the regressor. Defaults to ``annotation_regressor_<index>``.

The annotation data is converted to a binary 0/1 time series at the target sensor's resolution: **1** for every time step that falls within an annotation period, **0** otherwise. Since holidays and scheduled events are typically known in advance, annotation regressors are treated as *future* covariates.

Example config (passed via ``--config`` file):

.. code-block:: json

{
"annotation-regressors": [
{"account": 1, "annotation-type": "holiday", "name": "public_holidays"},
{"asset": 5, "annotation-type": "label", "name": "factory_shutdown"}
]
}

Usage:

.. code-block:: bash

flexmeasures add forecasts \
--from-date 2024-01-01 --to-date 2024-12-31 \
--max-forecast-horizon 24 \
--sensor 42 \
--config '{"annotation-regressors": [{"account": 1, "annotation-type": "holiday"}]}'

.. note::

Holiday annotations must be added to the account or asset before running the forecast.
Use ``flexmeasures add holidays`` to populate them (supports both workalendar and the ``holidays``
package). See :ref:`annotations` for details.
