fkiraly (Collaborator) commented Jan 22, 2025

This PR contains a speculative design for a data container TimeSeries.

This follows the idea of splitting the current TimeSeriesDataSet into two parts:

  • a "raw data set" object, representing a collection of time series or a single time series - contained in this PR as docstring
  • a "resampled data set" which is based on a raw data set and additional resampling instructions - not contained in this PR

The idea of a split was discussed in #1736, but also already occurs in the comments of @jdb78 in the current TimeSeriesDataSet.

This iterates on #1755, following feedback from @thawn that the output should be simpler, with as few tensors as possible.

This design returns a near-minimal set, making the choice to:

  • separate endogenous from exogenous
  • separate future from past data
  • otherwise store metadata as list of str or int

This is near-minimal: at least two tensors are needed, and this design results in three. Two tensors are needed because the data is the union of two rectangular arrays but is not itself a rectangular array (some features are not seen in the future). The "future features" are optional, though.
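
To illustrate the three-tensor layout described above, here is a minimal sketch, not the actual docstring or API from this PR. The class name `SeriesSample`, the field names `y`, `x`, `x_future`, and the `*_cols` metadata lists are all hypothetical, and nested lists stand in for tensors so the example runs without extra dependencies. The split into past endogenous, past exogenous, and future exogenous data is an assumption about which three tensors result.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Stand-in for a 2D tensor of shape (time, features); a real
# implementation would presumably use torch.Tensor instead.
Matrix = List[List[float]]


@dataclass
class SeriesSample:
    """Hypothetical container for one time series (names are assumptions)."""

    y: Matrix         # endogenous variables, past only: (n_past, n_endo)
    x: Matrix         # exogenous variables, past: (n_past, n_exo)
    x_future: Matrix  # exogenous variables known ahead: (n_future, n_exo_future)
    # Metadata kept as plain lists of str rather than as tensors.
    y_cols: List[str] = field(default_factory=list)
    x_cols: List[str] = field(default_factory=list)
    x_future_cols: List[str] = field(default_factory=list)


def shape(m: Matrix) -> Tuple[int, int]:
    """Return (rows, columns) of a nested-list matrix."""
    return (len(m), len(m[0]) if m else 0)


sample = SeriesSample(
    y=[[10.0], [12.0], [11.0]],
    x=[[1.0, 20.5], [2.0, 21.0], [3.0, 19.5]],
    # Only day_of_week is known in the future; temperature is not,
    # so past and future blocks cannot share one rectangular array.
    x_future=[[4.0], [5.0]],
    y_cols=["sales"],
    x_cols=["day_of_week", "temperature"],
    x_future_cols=["day_of_week"],
)
```

In this sketch the past block has three time steps and the future block has two, with fewer feature columns in the future, which is why a single rectangular array cannot hold both.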

fkiraly added the "enhancement" (New feature or request) and "API design" (API design & software architecture) labels Jan 22, 2025
fkiraly (Collaborator, Author) commented Aug 26, 2025

@agobbifbk, @phoeenniixx, @PranavBhatP, we can close this, right?
