Add annotation regressors to forecaster config, merged holidays command, and timezone fix#2178
Add annotation regressors to forecaster config, merged holidays command, and timezone fix#2178Copilot wants to merge 24 commits into
Conversation
… command; add future-annotation-regressors to pipeline schema Context: - add_holidays stored UTC midnight timestamps, causing holidays to appear at wrong local time (e.g. 1AM in Amsterdam) in the UI - No support for the 'holidays' Python package which covers more countries and school holidays not in workalendar - No way to pass annotation-based binary regressors to the forecasting pipeline Change: - Add --timezone option to 'flexmeasures add holidays'; holidays are now localized to the given timezone (UTC when omitted, with a warning) - Add new 'flexmeasures add holidays-by-package' command backed by the 'holidays' package; supports --country, --subdiv, --category, --timezone - Add 'holidays>=0.57' to pyproject.toml direct dependencies - Add future_annotation_regressors field (data_key: future-annotation-regressors) to TrainPredictPipelineConfigSchema for annotation-based binary regressors Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
…lidays-by-package Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
Context: - TrainPredictPipelineConfigSchema already supports future_annotation_regressors - Need to wire this config field through the pipeline execution Change: - Add query_account_annotations() to data/queries/annotations.py - Add _AnnotationRegressorProxy class to base pipeline for reuse in detect_and_fill_missing_values - Add annotation_regressors parameter to BasePipeline.__init__ with proxy/name construction - Add _load_annotation_regressor_df() to build binary 0/1 time series from DB annotations - Extend load_data_all_beliefs() to left-join annotation regressors onto the main DataFrame - Extend split_data_all_beliefs() to include annotation regressor columns as future covariates - Update autoregressive-only check to account for annotation_regressors - Pass annotation_regressors to TrainPipeline and PredictPipeline from run_cycle Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
- Include col_name in annotation regressor skip warning for clarity - Remove redundant list() wrapping (self.future is already list[Sensor]) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
…lidays, holidays-by-package, and annotation pipeline regressors Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
…g and annotations docs Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
…, pipeline config isolation patterns Context: - PR #2176 introduced _AnnotationRegressorProxy for reusing sensor-based pipeline utilities with non-sensor (annotation) data, and added query_account_annotations mirroring query_asset_annotations Change: - Added lesson on duck-type proxy validity check (only proxy what's actually accessed) - Added annotation query parity rule (account + asset variants must stay consistent) - Added pipeline config schema isolation guidance (pipeline vs search vs reporter schemas) - Added _clean_parameters retention decision rule for new config keys Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
… pipeline conventions Context: - PR #2176 added --timezone option to holiday CLI commands and implements annotation-to-pipeline loading that must match load_data_all_beliefs UTC-naive convention Change: - Added lesson on --timezone recommendation for holiday CLI commands - Added UTC-naive convention for annotation timestamp loading into pipeline - Added DST boundary risk note for full-day annotations spanning spring-forward Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
…ta_key as API surface Context: - PR #2176 added future-annotation-regressors as an optional pipeline config field using load_default=[] and data_key kebab-case pattern Change: - Added lesson on load_default=[] as backward compatibility pattern for new optional fields - Added lesson on _clean_parameters not being a regression risk for new config keys - Added lesson that data_key kebab-case is the API surface (not Python attribute name) Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
…e review Context: - Post-session governance review of annotation regressors feature branch - Single-agent execution (recurring delegation failure), but high code quality Change: - Documented delegation observation (all commits by copilot-swe-agent) - Documented new patterns added to 3 specialist agents - Noted open question: missing account_id/asset_id spec causes silent degradation (could be a schema-level validation error in future) - Noted storage growth concern for future_annotation_regressors in DataSource attrs Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
Agent-Logs-Url: https://github.com/FlexMeasures/flexmeasures/sessions/5a54d4cf-b1e6-4400-8808-8fe704b3fd84 Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
|
|
||
|
|
||
| **Holiday import command:** | ||
| **Holiday import (workalendar):** |
There was a problem hiding this comment.
Let's use tabs to switch between workalendar and holidays examples.
There was a problem hiding this comment.
Done in dc6a627 — the holiday import section now uses three tabs: "workalendar (public holidays)", "workalendar (specific calendar class)", and "holidays package (school holidays)".
| ) | ||
| ) | ||
| # NetherlandsWithSchoolHolidays returns public + school holiday days; the primary | ||
| # assertion is an exact match, and the lower bound confirms the calendar is non-trivial. |
There was a problem hiding this comment.
The > 90 lower-bound assertion is intentional — it confirms the calendar produces a non-trivial number of days (it also serves as readable documentation of the expected magnitude). The exact-match assertion count == expected_count already catches regressions, while > 90 prevents a scenario where both the calendar and DB return zero or a trivially small count.
…sor support, zoneinfo, tabs in docs Agent-Logs-Url: https://github.com/FlexMeasures/flexmeasures/sessions/533db2b6-51ae-425f-be65-5fbce46a06be Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
…fallback Agent-Logs-Url: https://github.com/FlexMeasures/flexmeasures/sessions/533db2b6-51ae-425f-be65-5fbce46a06be Co-authored-by: Flix6x <30658763+Flix6x@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Signed-off-by: F.N. Claessen <claessen@seita.nl>
Signed-off-by: F.N. Claessen <claessen@seita.nl>
Signed-off-by: F.N. Claessen <claessen@seita.nl>
Signed-off-by: F.N. Claessen <claessen@seita.nl>
Signed-off-by: F.N. Claessen <claessen@seita.nl>
…essorSchema Context: - JSONOrFile lacked the required click.ParamType.name attribute, causing AttributeError when --help was invoked on commands that use it. - AnnotationRegressorSchema.remove_none_values was already added in an earlier commit but the post_dump decorator import was missing from pipeline.py. Change: - Add name = 'JSON_OR_FILE' to JSONOrFile click.ParamType subclass. - Import post_dump in pipeline.py (already used by AnnotationRegressorSchema).
Context: - Running add forecasts with --train-start in CET (+01) and --start in CEST (+02) caused a TypeError from pd.date_range: 'Start and end cannot both be tz-aware with different timezones'. - The start/end passed to _load_annotation_regressor_df come from event_starts_after and event_ends_before which can carry different UTC offsets across DST boundaries. Change: - Convert start and end to UTC before constructing the time index. - Strip timezone info afterwards to match the tz-naive convention used in load_data_all_beliefs.
Context: - train_start with explicit --train-start was being overridden by the default 30-day train_period, producing a narrower training window than intended. Change: - When train_start is explicit, period is ignored. - When only train_period is explicit, use it (capped to max_training_period). - When neither is set, use the full max_training_period window.
Context: - Part 3 of test_annotation_regressors_loaded_in_pipeline asserted a 5 MW forecast gap which is not reliably produced by LGBM with default parameters. - test_add_holidays_by_package_german_school used DE/BY/school which is no longer supported by the installed version of the holidays package. Change: - Replace the unreliable 5 MW quality assertion with a structural check: verify that 96 forecast beliefs were persisted for the 4-day window. - Switch holidays test to IL (Israel) which reliably supports the school category and has ~19 entries in 2024.
This PR adds annotation regressors (e.g. holidays, custom schedules) as binary future covariates in the forecasting pipeline, consolidates holiday import into a single CLI command backed by either
workalendaror theholidayspackage, and fixes a timezone offset bug in holiday annotations.Changes Made
CLI (
flexmeasures add holidays)add_holidays_by_packageintoadd_holidays— a singleflexmeasures add holidayscommand now supports both backends via--package workalendar|holidays; automatically switches to theholidayspackage when--subdivor--categoryis provided--calendar-classand--calendar-kwargsoptions so users can select any specific workalendar class (e.g.workalendar.europe.netherlands.NetherlandsWithSchoolHolidays) and pass constructor arguments as JSON (e.g.'{"region": "north"}')--timezoneoption; defaults to theFLEXMEASURES_TIMEZONEconfig setting (with a warning) instead of UTC, fixing holiday annotations appearing at 1 AM or 2 AM in chartspytzwith stdlibzoneinfo.ZoneInfofor timezone validation_make_holiday_annotation,_holidays_from_workalendar_class,_holidays_from_package,_holidays_from_workalendar_registry) to keep cyclomatic complexity within limits--yearis now required (previously optional but broken without it)Forecasting pipeline schema
AnnotationRegressorSchema(properfields.Nested) inTrainPredictPipelineConfigSchemaannotation-regressors(wasfuture-annotation-regressors); data keys areaccount,asset,sensor, andannotation-typesensoras a supported annotation source alongsideaccountandassetPipeline & queries
BasePipelinewith_AnnotationRegressorProxy,_load_annotation_regressor_df, and updatedload_data_all_beliefs/split_data_all_beliefsTrainPipeline,PredictPipeline,TrainPredictPipelineto pass annotation regressors throughquery_account_annotations,query_sensor_annotationstoflexmeasures/data/queries/annotations.pyholidays>=0.57as a direct dependency inpyproject.tomlTests
test_add_holidays_with_workalendar_school_holidaysinvokes the CLI with--calendar-class workalendar.europe.netherlands.NetherlandsWithSchoolHolidays --calendar-kwargs '{"region": "north"}'test_add_holidays_by_package_german_schooluses the mergedadd_holidayscommand with--subdiv BY --category schooltest_add_holidays_with_timezoneverifies annotations are stored at local midnighttest_annotation_regressors_loaded_in_pipelineverifies binary regressor loading for a factory logistics scheduleDocumentation
documentation/concepts/annotations.rst: sphinx-tabs for workalendar (public), workalendar (specific class), and holidays-package (school) examples; clarified school holiday support is for selected countriesdocumentation/features/forecasting.rst: updated to reflectannotation-regressorskey andaccount/asset/sensor/annotation-typedata keys--yearnow required,holidays-by-packagemerged, field renamed)