Skip to content

Conversation

phoeenniixx
Copy link
Member

Description

This PR adds TimeXer model implementation by @PranavBhatP from #1830 to test framework
Stacks on #1841 and #1780

@phoeenniixx
Copy link
Member Author

Thanks a lot, this will for sure be helpful!

@PranavBhatP
Copy link
Contributor

PranavBhatP commented Jun 2, 2025

Another thing that has been notified to @phoeenniixx about this PR, is the fact that the wrong version of the data loader is used. Considering that the PR #1830 was an experimental PR, I had made changes to the EncoderDecoderDataModule to pass the past "target" values within the x values. This was required to pass in endogenous embeddings into TimeXer and was necessary to make it compatible with EncoderDecoderDataModule. Because of this, an error is triggered whenever you try to run the model's tests on this PR, because the data module used in this branch does not contain these changes. Going forward, I think we must be careful with the changes and understand what version we must go ahead with.

Link to diff : https://github.com/sktime/pytorch-forecasting/pull/1830/files#diff-1c57ccc8f99ccfd8e7207bca7e8dd9e1c99a311d921173df4535df8274faa040R702

If my version of the data loader is used then the model can be tested successfully, but again, brings in the possibility of breaking changes and the need for separate tests within data_module.py to handle these changes (i.e the changes mentioned above).

@phoeenniixx
Copy link
Member Author

This means that we need to make some changes to the existing test framework of ptf-v2 and add one more method that can provide the model with the data_module it is compatible with, i.e, if the model is compatible with EncoderDecoderDataModule (optimally the original models in ptf-v2) then the method returns EncoderDecoderDataModule otherwise if the model is a part of tslib then return the data module from #1836.

For this I have two ideas:

  • we add a tag to metadata class like info:lib(or any other better name?) that can have any one of the two values: ptf or tslib and we create a "global" get_data_module that will return the data_module based on this tag.
  • We can try a similar approach as used by @fkiraly in [ENH] DecoderMLP metadata container for v1 tests #1859 and create a function like _get_test_datamodules_from (the PR by @fkiraly creates a similar function: _get_test_dataloaders_from to get a "personalised" dataloader for the model). This function will be inside the metadata class.

I would really appreciate your thoughts on this!
@fkiraly, @agobbifbk, @PranavBhatP , @xandie985

@PranavBhatP
Copy link
Contributor

PranavBhatP commented Jun 3, 2025

We can try a similar approach as used by @fkiraly in #1859 and create a function like _get_test_datamodules_from (the PR by @fkiraly creates a similar function: _get_test_dataloaders_from to get a "personalised" dataloader for the model). This function will be inside the metadata class.

This is actually a good approach, if we do think about the long run, we will have to implement this to accomodate other dataloaders (not only tslib) implemented in their separate data modules. Do we need to merge tslib datamodule for testing this? Am I right in this?

@phoeenniixx
Copy link
Member Author

phoeenniixx commented Jun 3, 2025

This is actually a good approach, if we do think about the long run, we will have to implement this to accomodate other dataloaders (not only tslib) implemented in their separate data modules.

Yes, same can be done for approach 1 (add one more value for to the tag: ptf, tslib, xyz), but yes approach 1 is just a thought that struck me so I felt like sharing it :)

Do we need to merge tslib datamodule for testing this?

That depends if work (and reviews from all) is still left for the data module or not , for now I can branch off of your PR

@PranavBhatP
Copy link
Contributor

there is still some work pending, with respect to reviews and discussion on small parts of the design for tslib. But the pipeline as a whole is fully functional, if that is enough?

@phoeenniixx
Copy link
Member Author

Yeah, I'll branch off from your pr and will try to bring timexer to test framework, if I face any issue, you can help :)

@agobbifbk
Copy link

This means that we need to make some changes to the existing test framework of ptf-v2 and add one more method that can provide the model with the data_module it is compatible with, i.e, if the model is compatible with EncoderDecoderDataModule (optimally the original models in ptf-v2) then the method returns EncoderDecoderDataModule otherwise if the model is a part of tslib then return the data module from

I agree. There is a link between the d2 layer and the model layer. We need to decide which is the best way to link them: the easiest way is a 1:1 match, it solves a lot of problem. But, if we work a little on the model layer, you can make the relationship 1:N, meaning that I can use the same DLinear model with an EconderDecoder d2 layer OR with an Encoder d2 layer, but I think this will overcomplicated the codebase :-)

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 4, 2025

Perhaps not very surprisingly, I am partial towards this approach.

Compared to the approach via tag, I think it is more extensible, since adding a tag requires three points of extension (register of tags, logic for dispatching package to data module generation, and the logic for data module generation), the "method" approach has only one point of extension.

The tag approach is more inspectable, but I think this might not be that useful to the user? At least I think the extensibility reason outweights this, and we can always also add a tag that is not tied to logic.

@phoeenniixx
Copy link
Member Author

phoeenniixx commented Jun 4, 2025

Compared to the approach via tag, I think it is more extensible, since adding a tag requires three points of extension (register of tags, logic for dispatching package to data module generation, and the logic for data module generation), the "method" approach has only one point of extension.

Yes that is also a point to consider!

The tag approach is more inspectable, but I think this might not be that useful to the user?

To what I've noticed till now, the terminology and the way handling data and models are a little bit different for ptf and tslib, so a tag having this info I think can help the user to better adapt their approach accordingly.

At least I think the extensibility reason outweights this, and we can always also add a tag that is not tied to logic.

Umm, Yes we can do this as well, this will provide enough info to the user about the inheritance of the model ( if the model is from tslib or ptf) and will also solve the problem of extensibility as you mentioned.

So I think we should add tags (not necessarily now, maybe in future iterations?) and for now I just add the private method _get_test_datamodules_from to the metadata class

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 4, 2025

so a tag having this info I think can help the user to better adapt their approach accordingly.

Perhaps, but would it be informative? The user would not know in which respect estimators are different, and the meaning would be obscure?

So I think we should add tags (not necessarily now, maybe in future iterations?) and for now I just add the private method _get_test_datamodules_from to the metadata class

That would be my current preference, but it does not mean it is the best solution.

@phoeenniixx
Copy link
Member Author

Perhaps, but would it be informative? The user would not know in which respect estimators are different, and the meaning would be obscure?

I mean we could document it under "different classes of estimators" or something like that..

But I think for now we can leave this discussion for the future, as we might not even need such distinction, if we could map these differences between the 2 types of estimators (which I think is possible)

@PranavBhatP
Copy link
Contributor

Just a question. Is this PR going to be merged? I would advice against it, since some parts of the code is outdated and I will me implementing the tests from my side as well, on a separate data module and might lead to conflicts. Since you are also not replacing _timexer_metadata.py with _timexer_v2_metadata.py, I will probably add v2 metadata in a separate PR, once the v2 test framework is merged.

@phoeenniixx
Copy link
Member Author

Just a question. Is this PR going to be merged? I would advice against it, since some parts of the code is outdated and I will me implementing the tests from my side as well, on a separate data module and might lead to conflicts. Since you are also not replacing _timexer_metadata.py with _timexer_v2_metadata.py, I will probably add v2 metadata in a separate PR, once the v2 test framework is merged.

Well rn I'm working on making some changes to the metadata classes to get different data modules, so we need to add that functionality somehow... It is not committed here rn, as I was busy this weekend and didn't had much time to work on this pr, but I think I can make some changes here, and once they coincide with your mental model, you can make further changes?

@PranavBhatP
Copy link
Contributor

PranavBhatP commented Jun 8, 2025

My main concern was with the version of timexer, you can try running the tests on this version of course, but since there are updated versions in #1836, it would be a hassle for you to change the model, tests and params to match with implementations in other PRs, and those will again take some time to get merged.

think I can make some changes here, and once they coincide with your mental model, you can make further changes?

Yeah that makes sense. Just giving a heads-up with the changes you might have to make if you are actually planning to merge this PR.

@phoeenniixx
Copy link
Member Author

Sure! I'll then merge those changes from #1836

@PranavBhatP
Copy link
Contributor

PranavBhatP commented Jun 9, 2025

@phoeenniixx , if you want a meet to coordinate over work on this PR, I am open for meeting at 13UTC today if there is no meet-up at that time.

@phoeenniixx
Copy link
Member Author

Sure! I have also pinged you in pytorch-forecasting channel in discord...

@phoeenniixx
Copy link
Member Author

This PR is superseded by #1836 by @PranavBhatP

@github-project-automation github-project-automation bot moved this from PR in progress to Done in May - Sep 2025 mentee projects Jun 11, 2025
@phoeenniixx phoeenniixx deleted the timexer branch July 29, 2025 12:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

4 participants