[ENH] EXPERIMENTAL: Example notebook based on the new data pipeline #1813

phoeenniixx · 2025-04-06T13:51:55Z

Description

This PR adds an example notebook, a vignette for the new v2 data pipeline, with a basic implementation of the TFT model built on this version. For more info, see #1812 and #1811.

Colab link: https://colab.research.google.com/drive/148MyhcNfYEh4CZ6vBXLqQNsUBF0n6_0v?usp=sharing

review-notebook-app · 2025-04-06T13:52:00Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB


phoeenniixx · 2025-04-06T14:45:07Z

Hi @fkiraly, I am getting this error:

I just downloaded the notebook from Colab and pasted it into the repo. Is there anything else I should do to avoid this? I really have no idea 😅

codecov · 2025-04-11T02:34:59Z

Codecov Report

❌ Patch coverage is 9.09091% with 10 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@4a34931). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
pytorch_forecasting/data/examples.py	9.09%	10 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1813   +/-   ##
=======================================
  Coverage        ?   85.59%           
=======================================
  Files           ?       68           
  Lines           ?     6597           
  Branches        ?        0           
=======================================
  Hits            ?     5647           
  Misses          ?      950           
  Partials        ?        0

Flag	Coverage Δ
cpu	`85.59% <9.09%> (?)`
pytest	`85.59% <9.09%> (?)`


phoeenniixx · 2025-05-29T08:39:34Z

Thank you for the review @xandie985!

EncoderDecoderTimeSeriesDataModule assumes that the data fits in memory. How would your approach scale to datasets that do not fit into RAM? Are there plans to incorporate memory-efficient loading strategies like chunking or on-demand loading from disk?

Right now we assume that the data fits in memory, but yes, in the future we plan to add features such as chunking and on-demand loading.
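As a rough illustration of the chunking idea (not part of this PR and not an API of the package), rows could be read in fixed-size batches instead of all at once. The helper `iter_chunks` and the column names are hypothetical, and only the standard library is used:

```python
import csv
import io
from itertools import islice


def iter_chunks(handle, chunk_size):
    """Yield lists of at most `chunk_size` rows from an open CSV file handle."""
    reader = csv.DictReader(handle)
    while True:
        # islice pulls only the next `chunk_size` rows from the reader,
        # so the full file is never held in memory at once.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Demo on an in-memory "file"; a real dataset would be opened from disk.
demo = io.StringIO("time_idx,series_id,y\n0,a,1.0\n1,a,2.0\n2,a,3.0\n")
chunk_sizes = [len(chunk) for chunk in iter_chunks(demo, chunk_size=2)]
```

A data module built this way would still need to handle windows that span chunk boundaries, which this sketch does not attempt.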

Does your module have any mechanisms to detect or correct irregular time series within its scope?

Your _create_windows method checks for sufficient sequence length. How does your module handle missing values within a window, and what are the potential consequences for model training if windows contain significant NaNs?

Consider adding a random seed for reproducibility, especially in setup(), where random shuffling takes place.

These are open questions that we still need to work on. We will tackle them in future iterations, once an end-to-end prototype is ready and we have feedback on it from users of the package.
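For the reproducibility point, one possible shape is a split that takes an explicit seed and uses a local random generator. This is only a sketch under assumptions: `split_series_indices` is a hypothetical helper, not the logic currently in setup().

```python
import random


def split_series_indices(n_series, n_train, seed):
    """Shuffle series indices with a local, seeded RNG and split them."""
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    indices = list(range(n_series))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


# The same seed gives the same split on every call.
train_a, val_a = split_series_indices(10, n_train=7, seed=42)
train_b, val_b = split_series_indices(10, n_train=7, seed=42)
```

Using a local generator rather than seeding the global one keeps the split stable even if other code draws random numbers in between.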

fkiraly

More detailed review.

please remove the install from the start of the notebook
we should test that this is running while we are working on v2. One way is to move the content to docs/examples/tutorials, the contents of which are automatically run and tested.
the data generation cell is useful, but not too illustrative. Can you move the code to a function load_toydata or similar, in pytorch_forecasting.data, new module, e.g., toydata? Then we can also use this in testing later!
can you add basic markdown cells that explain what the notebook is showing, and what each step does? E.g., a summary of the steps at the top, and then small headers for each step with minimal explanations.
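To make the toy data suggestion concrete, here is a minimal sketch of what such a generator function could look like. The name `make_toy_panel`, the column names, and the sine-plus-noise signal are all assumptions for illustration, not the function that ends up in the package:

```python
import math
import random


def make_toy_panel(n_series=3, n_timesteps=20, seed=0):
    """Return synthetic long-format rows: one dict per (series, time step)."""
    rng = random.Random(seed)
    rows = []
    for series_id in range(n_series):
        for time_idx in range(n_timesteps):
            # Series-specific sine wave plus small Gaussian noise.
            y = math.sin(0.3 * time_idx + series_id) + rng.gauss(0.0, 0.1)
            rows.append({"series_id": series_id, "time_idx": time_idx, "y": y})
    return rows


rows = make_toy_panel(n_series=2, n_timesteps=5)
```

The list of dicts can be passed directly to `pandas.DataFrame(rows)` if a data frame is needed, and the fixed seed makes the output usable in tests.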

phoeenniixx · 2025-05-30T05:30:27Z

Thanks! I will make the changes accordingly. Just one question:

the data generation cell is useful, but not too illustrative. Can you move the code to a function load_toydata or similar, in pytorch_forecasting.data, new module, e.g., toydata? Then we can also use this in testing later!

I think we can add it to pytorch_forecasting.data.examples? Like right now people import get_stallion_data from there, so they can import toydata from there as well. I just think this would follow the existing convention of keeping "test data" loaders in examples.

fkiraly · 2025-05-30T05:42:58Z

Like right now people import get_stallion_data from there, so they can import toydata from there as well.

Makes sense to add it to the established location with the data loaders.

Would it make sense to split the file up and have one loader per file? Need not be done in this PR.

phoeenniixx · 2025-05-30T05:47:24Z

Would it make sense to split the file up and have one loader per file?

Then I think we should create a new folder called loaders or datasets and put these files there, and we can add more loaders to that folder in the future.

fkiraly · 2025-05-31T16:45:13Z

Then I think we should create a new folder called loaders or datasets and put these files there, and we can add more loaders to that folder in the future.

Maybe a separate PR though.

fkiraly

Great! Some minor change requests only.

in the header "data pipeline" - should it not be training and inference?
please remove unused imports
I would also suggest moving imports to the cells where they are used
at the start, can you summarize what is shown over the entire notebook in a few bullet points? Mostly just a list of the headers (table of contents and signposting)
can you explain usage of the important objects? Use markdown or in-line comments
- most important arguments of objects such as of TimeSeries
- explain and show the types and structure of important returns such as y_pred
I would split the last cell into multiple parts, too much is happening there
rule of thumb, cells should be max 15 lines, and printouts max 10 lines. There should be descriptive content, even if very minimal, at the start or before the cell in a markdown.
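The 15-line rule of thumb can be checked mechanically. Below is a small sketch, assuming the notebook has been loaded into a dict (for example with `json.load`); `long_code_cells` is a hypothetical helper, not an existing tool in the repo:

```python
def long_code_cells(notebook, max_lines=15):
    """Return indices of code cells whose source exceeds `max_lines` lines."""
    flagged = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # The notebook format stores source as a list of lines or one string.
        text = source if isinstance(source, str) else "".join(source)
        if len(text.splitlines()) > max_lines:
            flagged.append(index)
    return flagged


# Minimal in-memory notebook; a real one would come from json.load(file).
demo_nb = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["x = 1\n"] * 20},
        {"cell_type": "code", "source": "y = 2\n"},
    ]
}
flagged = long_code_cells(demo_nb)
```

Here only the second cell (index 1) has more than 15 lines, so it is the only one flagged.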

phoeenniixx · 2025-06-05T15:57:55Z

in the header "data pipeline" - should it not be training and inference?

I mean, it is a basic vignette of the "data pipeline": what the data flow might look like in v2. Should I add the words "training" and "inference" there as well?

Model training is just to "complete" the process

fkiraly · 2025-06-05T17:09:37Z

I mean, it is a basic vignette of the "data pipeline": what the data flow might look like in v2. Should I add the words "training" and "inference" there as well?

Data pipeline is not accurate imo - people expect pre-processing or ETL if they hear that.

But in fact this is the full basic workflow for actually using the neural networks for forecasting.

phoeenniixx · 2025-06-06T20:52:17Z

Hi @fkiraly, will this work?

fkiraly

yes, great!

Made some minor changes to the header to make this clearer.

phoeenniixx added 4 commits April 6, 2025 18:43

D1, D2 layer commit

252598d

remove one comment

d0d1c3e

model layer commit

80e64d2

Example notebook

0319c29

phoeenniixx requested review from benHeid, fkiraly, fnhirwa, jdb78 and yarnabrina as code owners April 6, 2025 13:51

phoeenniixx added 3 commits April 6, 2025 19:34

update docstring

6364780

Merge branch 'refactor-d1-d2' into refactor-model

82b3dc7

Merge branch 'refactor-d1-d2' into refactor-notebook

5d80532


PranavBhatP added this to Dec 2024 - Mar 2025 mentee projects Apr 7, 2025

PranavBhatP moved this to PR in progress in Dec 2024 - Mar 2025 mentee projects Apr 7, 2025

phoeenniixx added 8 commits April 11, 2025 01:54

update data_module.py

257183c

update data_module.py

9cdcb19

Merge branch 'refactor-d1-d2' into refactor-model

a83bf32

Merge branch 'refactor-d1-d2' into refactor-notebook

6290dc2

Add disclaimer

ac56d4f

Merge branch 'refactor-d1-d2' into refactor-model

0e7e36f

Merge branch 'refactor-d1-d2' into refactor-notebook

a23ad8a

update notebook as well

25bc7ee

phoeenniixx added 5 commits April 11, 2025 12:44

update docstring

4bfff21

Merge branch 'refactor-d1-d2' into refactor-model

ef98273

Merge branch 'refactor-d1-d2' into refactor-notebook

7a175e9

update comments in nb

8dfcac1

Add tests for D1,D2 layer

8a53ed6

Merge branch 'main' into pr/1813

b4b4e4a

fkiraly requested changes May 29, 2025


fkiraly added the documentation Improvements or additions to documentation label May 29, 2025

fkiraly moved this from PR under review to PR in progress in May - Sep 2025 mentee projects May 30, 2025

phoeenniixx and others added 3 commits June 2, 2025 00:51

move the nb to tutorials

469ddc7

update notebook

2c1fd59

Merge branch 'main' into refactor-notebook

7e39493

fkiraly mentioned this pull request Jun 3, 2025

[ENH] Implementing D2 data module, tests and TimeXer model from tslib for v2 #1836

Merged

4 tasks

add markdown

c15854c

fkiraly moved this from PR in progress to PR under review in May - Sep 2025 mentee projects Jun 4, 2025

Merge branch 'main' into refactor-notebook

31d01db

fkiraly requested changes Jun 5, 2025


fkiraly moved this from PR under review to PR in progress in May - Sep 2025 mentee projects Jun 6, 2025

update notebook

396d2a4

phoeenniixx requested a review from fkiraly June 6, 2025 20:47

Merge branch 'main' into refactor-notebook

c6d21f8

Update ptf_V2_example.ipynb

5e71cf6

fkiraly approved these changes Jun 7, 2025


fkiraly merged commit 4ba4ff5 into sktime:main Jun 7, 2025
18 checks passed

github-project-automation bot moved this from PR in progress to Done in May - Sep 2025 mentee projects Jun 7, 2025

github-project-automation bot moved this from PR in progress to Done in Dec 2024 - Mar 2025 mentee projects Jun 7, 2025

phoeenniixx deleted the refactor-notebook branch June 8, 2025 20:43
