Variable roles in tidymodels recipe and workflow... are they respected by rSAFE? #10

@jacekkotowski

Example (I am playing with the bicycle demand data from Kaggle):

bike_recipe <- recipe(count ~ ., data = bike_training) %>%
  step_date(datetime, features = c("doy", "dow", "month", "year"), abbr = TRUE) %>%
  update_role("datetime", new_role = "id_variable") %>%
  step_rm("atemp")

This creates date features from the datetime index; datetime itself then gets the id_variable role and takes no part in modelling.
I also removed the atemp variable altogether (temp and atemp were strongly correlated), so it does not take part in modelling either.
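
To double-check which columns actually reach the model, the roles can be read back from the prepped recipe with summary(). Below is a minimal sketch, assuming a small made-up data frame (bike_toy and toy_recipe are hypothetical stand-ins for bike_training and bike_recipe, with invented values); it only inspects the recipe and says nothing about what rSAFE does with the roles.

```r
library(recipes)

# Hypothetical toy data standing in for the Kaggle bike data (values invented)
bike_toy <- data.frame(
  datetime = as.POSIXct("2011-01-01 00:00:00", tz = "UTC") + 3600 * 0:9,
  temp     = c(9.8, 9.0, 9.0, 9.8, 9.8, 9.8, 9.0, 8.2, 9.8, 13.1),
  atemp    = c(14.4, 13.6, 13.6, 14.4, 14.4, 12.9, 13.6, 12.9, 14.4, 17.4),
  count    = c(16, 40, 32, 13, 1, 1, 2, 3, 8, 14)
)

# Same steps as the recipe above, applied to the toy data
toy_recipe <- recipe(count ~ ., data = bike_toy) %>%
  step_date(datetime, features = c("doy", "dow", "month", "year"), abbr = TRUE) %>%
  update_role(datetime, new_role = "id_variable") %>%
  step_rm(atemp)

# After prep(), summary() lists the variables that remain, with their roles
prepped <- prep(toy_recipe, training = bike_toy)
roles <- summary(prepped)

# Keep only the columns whose role is "predictor"
predictors <- roles$variable[roles$role == "predictor"]
print(predictors)
```

In this sketch, datetime (role id_variable) and atemp (removed) are not among the predictors, while temp and the derived datetime_* columns are.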

Next I run the explainer:

explainer <- explain_tidymodels(bike_final_fit, data = bike_all %>% select(-count), y = bike_all$count)
safe_extractor <- safe_extraction(explainer)

The SAFE extractor seems to ignore that datetime and atemp are absent from the modelling process and proposes:

 Variable 'datetime' - selected intervals:
	(-Inf, 2011-02-16 23:00:00]
 	(2011-02-16 23:00:00, 2011-06-17 23:00:00]
 	(2011-06-17 23:00:00, 2012-04-15 23:00:00]
 	(2012-04-15 23:00:00, 2012-07-08 23:00:00]
 	(2012-07-08 23:00:00, Inf)
Variable 'season' - selected intervals:
	(-Inf, 3]
 	(3, Inf)
Variable 'holiday' - no transformation suggested.
Variable 'workingday' - no transformation suggested.
Variable 'weather' - selected intervals:
	(-Inf, 1]
 	(1, Inf)
Variable 'temp' - selected intervals:
	(-Inf, 12.3]
 	(12.3, 22.96]
 	(22.96, Inf)
Variable 'atemp' - selected intervals:
	(-Inf, 24.24]
 	(24.24, Inf)
Variable 'humidity' - selected intervals:
	(-Inf, 30]
 	(30, 48]
 	(48, 67]
 	(67, 84]
 	(84, Inf)
Variable 'windspeed' - selected intervals:
	(-Inf, 7.0015]
 	(7.0015, Inf)

How can I tell rSAFE that these two variables (one is a time index, the other is removed in the bake) do not take part in modelling?
I am attaching my quick-and-dirty workflow:

timeseries_modelling_xgboost_short.zip
@agosiewska
