❓ Question
Hi guys.
With a custom FeatureExtractor, is it possible to train some parts of it in a controlled (supervised) way while staying within the general algorithm flow? That is, the feature extractor is expected to perform a specific, known behavior, so it should be trained as a white box rather than as a black box along the input - extractor - actor flow.
For example, suppose the environment returns extra data in the `info` result of a `step()` call (data that is not part of the observation itself):
obs, reward, terminated, truncated, info = env.step(action)
Is it possible to organize the feature extractor so that it uses specific data from `info` as a target during the backpropagation phase?
Or should such logic be trained separately and then used only in inference mode inside the feature extractor?
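To make the first option concrete, here is a minimal sketch of what I have in mind, written in plain PyTorch rather than against the library's training loop. Everything in it is an assumption for illustration: `WhiteBoxExtractor`, `aux_head`, `aux_coef`, the tensor shapes, and the random tensors standing in for a batch of observations, `info` targets, actions, and advantages are all hypothetical. In Stable-Baselines3 the extractor would presumably subclass `BaseFeaturesExtractor`, and the `info` data would somehow have to reach the loss (for instance by exposing it through a Dict observation), which this sketch does not attempt.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WhiteBoxExtractor(nn.Module):
    """Hypothetical extractor: an encoder plus a directly supervised auxiliary head."""

    def __init__(self, obs_dim: int, features_dim: int, target_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, features_dim), nn.ReLU())
        # Auxiliary head: predicts the extra quantity that the env reports in `info`.
        self.aux_head = nn.Linear(features_dim, target_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Features consumed by the actor (the usual black-box path).
        return self.encoder(obs)

    def aux_loss(self, obs: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Supervised (white-box) term: regress the info-derived target.
        return F.mse_loss(self.aux_head(self.encoder(obs)), target)


torch.manual_seed(0)
extractor = WhiteBoxExtractor(obs_dim=4, features_dim=8, target_dim=2)
policy_head = nn.Linear(8, 3)  # stand-in for the actor network
params = list(extractor.parameters()) + list(policy_head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-2)

# Random stand-ins for one training batch (assumed shapes).
obs = torch.randn(16, 4)
info_target = torch.randn(16, 2)      # values that would come from `info`
actions = torch.randint(0, 3, (16,))
advantages = torch.randn(16)

# Simplified policy-gradient term, only to show that both losses share the encoder.
log_probs = F.log_softmax(policy_head(extractor(obs)), dim=-1)
chosen_log_probs = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
policy_loss = -(chosen_log_probs * advantages).mean()

aux_coef = 0.5  # hypothetical weight of the supervised term
total_loss = policy_loss + aux_coef * extractor.aux_loss(obs, info_target)

optimizer.zero_grad()
total_loss.backward()  # gradients reach the encoder from both terms
optimizer.step()
```

The idea is that the encoder receives gradients from both the policy loss and the supervised term, so it is still trained inside the general flow but is also pushed toward the known target. The alternative from my question would be to pretrain the encoder with only the supervised term and freeze it afterwards.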
Thank you!
Checklist