Description
The type constraints on built-in policies and algorithms such as QBasedPolicy and TDLearner are overly specific and prevent users from reusing the existing code when extending the package with new algorithms. Instead, they force users to rewrite large chunks of code.
For example, QBasedPolicy is defined as struct QBasedPolicy{L<:TDLearner,E<:AbstractExplorer} <: AbstractPolicy, and all of its methods are constrained in the same way. Therefore, I cannot write a new learner and use it in a QBasedPolicy, even though the methods themselves appear to be fully general.
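To make the request concrete, here is a minimal sketch of what a relaxed QBasedPolicy could look like. It does not use the package: the abstract types are local stand-ins, and MyLearner, GreedyExplorer, action_values, select_action, and act are hypothetical names invented for this illustration, not the package's actual interface.

```julia
# Local stand-ins for the package's abstract types, so the sketch runs on its own.
abstract type AbstractPolicy end
abstract type AbstractLearner end
abstract type AbstractExplorer end

# Relaxed definition: any AbstractLearner is accepted, not only TDLearner.
struct QBasedPolicy{L<:AbstractLearner,E<:AbstractExplorer} <: AbstractPolicy
    learner::L
    explorer::E
end

# A user-defined learner (hypothetical) holding a table of action values.
struct MyLearner <: AbstractLearner
    q::Matrix{Float64}  # q[action, state]
end

# Hypothetical minimal interface: the action values for a given state.
action_values(l::MyLearner, s::Int) = @view l.q[:, s]

# A greedy explorer (hypothetical) that picks the highest-valued action.
struct GreedyExplorer <: AbstractExplorer end
select_action(::GreedyExplorer, values) = argmax(values)

# The policy method relies only on the interface, not on a concrete learner type.
act(p::QBasedPolicy, s::Int) = select_action(p.explorer, action_values(p.learner, s))

policy = QBasedPolicy(MyLearner([0.1 0.9; 0.5 0.2]), GreedyExplorer())
```

With the bound loosened to AbstractLearner, a new learner only has to implement the small interface the policy calls, and the policy code itself stays untouched.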
Another example is TDLearner, which is defined as Base.@kwdef mutable struct TDLearner{M,A} <: AbstractLearner where {A<:TabularApproximator,M<:Symbol}. However, its constructor only allows M=:SARS. This forces me to rewrite the whole struct if I want to implement a new TD learning algorithm or use a different kind of approximation (e.g., linear).
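As a rough sketch of the alternative, the learner could be parameterized on an abstract approximator type and accept any method symbol. Everything below is an assumption made for illustration: GeneralTDLearner, AbstractApproximator, LinearApproximator, estimate, update!, and the :TD0 tag are hypothetical names, not part of the package.

```julia
using LinearAlgebra: dot

# Local stand-ins; AbstractApproximator is a hypothetical common supertype.
abstract type AbstractLearner end
abstract type AbstractApproximator end

# The method tag M and the approximator A are both left open.
struct GeneralTDLearner{M,A<:AbstractApproximator} <: AbstractLearner
    approximator::A
    γ::Float64  # discount factor
    α::Float64  # learning rate
end

# Lift the method symbol into the type domain without restricting it to :SARS.
GeneralTDLearner(approx::A, method::Symbol; γ=1.0, α=0.1) where {A<:AbstractApproximator} =
    GeneralTDLearner{method,A}(approx, γ, α)

# A linear state-value approximator (hypothetical): v(x) = w ⋅ x.
struct LinearApproximator <: AbstractApproximator
    w::Vector{Float64}
end
estimate(a::LinearApproximator, x::AbstractVector) = dot(a.w, x)

# Semi-gradient TD(0) update, selected by dispatch on the method tag and approximator.
function update!(l::GeneralTDLearner{:TD0,LinearApproximator}, x, r, x_next)
    a = l.approximator
    δ = r + l.γ * estimate(a, x_next) - estimate(a, x)  # TD error
    a.w .+= (l.α * δ) .* x                              # in-place weight update
    return δ
end

learner = GeneralTDLearner(LinearApproximator(zeros(2)), :TD0; γ=0.9, α=0.5)
δ = update!(learner, [1.0, 0.0], 1.0, [0.0, 1.0])
```

In this layout, adding a new algorithm or approximator means adding a new update! method rather than copying the struct.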
In my opinion, these restrictions should be removed and replaced with general types such as AbstractLearner.