diff --git a/Project.toml b/Project.toml
index 2172f4e7..d791a46e 100644
--- a/Project.toml
+++ b/Project.toml
@@ -1,7 +1,7 @@
 name = "LearnAPI"
 uuid = "92ad9a40-7767-427a-9ee6-6e577f1266cb"
 authors = ["Anthony D. Blaom "]
-version = "0.1.0"
+version = "0.2.0"
 
 [compat]
 julia = "1.10"
diff --git a/docs/make.jl b/docs/make.jl
index 158117cd..95e0480a 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -20,7 +20,7 @@ makedocs(
         "predict/transform" => "predict_transform.md",
         "Kinds of Target Proxy" => "kinds_of_target_proxy.md",
         "obs and Data Interfaces" => "obs.md",
-        "target/weights/features" => "target_weights_features.md",
+        "features/target/weights" => "features_target_weights.md",
         "Accessor Functions" => "accessor_functions.md",
         "Learner Traits" => "traits.md",
     ],
diff --git a/docs/src/anatomy_of_an_implementation.md b/docs/src/anatomy_of_an_implementation.md
index c977cf50..e6dba45a 100644
--- a/docs/src/anatomy_of_an_implementation.md
+++ b/docs/src/anatomy_of_an_implementation.md
@@ -7,55 +7,53 @@ model = fit(learner, data)
 predict(model, newdata)
 ```
 
-Here `learner` specifies hyperparameters, while `model` stores learned parameters and any byproducts of algorithm execution.
+Here `learner` specifies [hyperparameters](@ref hyperparameters), while `model` stores
+learned parameters and any byproducts of algorithm execution.
 
-[Transformers](@ref) ordinarily implement `transform` instead of `predict`. For more on
-`predict` versus `transform`, see [Predict or transform?](@ref)
+Variations on this pattern:
 
-["Static" algorithms](@ref static_algorithms) have a `fit` that consumes no `data`
-(instead `predict` or `transform` does the heavy lifting). In [density
-estimation](@ref density_estimation), `predict` consumes no data.
+- [Transformers](@ref) ordinarily implement `transform` instead of `predict`. For more on
+  `predict` versus `transform`, see [Predict or transform?](@ref)
+
+- ["Static" (non-generalizing) algorithms](@ref static_algorithms), which include some
+  simple transformers and some clustering algorithms, have a `fit` that consumes no
+  `data`. Instead `predict` or `transform` does the heavy lifting.
+
+- In [density estimation](@ref density_estimation), the `newdata` argument in `predict` is
+  missing.
 
 These are the basic possibilities.
 
-Elaborating on the core pattern above, we detail in this tutorial an implementation of the
+Elaborating on the core pattern above, this tutorial details an implementation of
 LearnAPI.jl for naive [ridge regression](https://en.wikipedia.org/wiki/Ridge_regression)
 with no intercept. The kind of workflow we want to enable has been previewed in [Sample
 workflow](@ref). Readers can also refer to the [demonstration](@ref workflow) of the
 implementation given later.
 
-!!! note
+## A basic implementation
 
-    New implementations of `fit`, `predict`, etc,
-    always have a *single* `data` argument as above.
-    For convenience, a signature such as `fit(learner, X, y)`, calling
-    `fit(learner, (X, y))`, can be added, but the LearnAPI.jl specification is
-    silent on the meaning or existence of signatures with extra arguments.
+See [here](@ref code) for code without explanations.
 
-!!! note
+We suppose our algorithm's `fit` method consumes data in the form `(X, y)`, where
+`X` is a suitable table¹ (the features) and `y` a vector (the target).
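+
+For example, such data might be constructed as follows (a sketch only; any
+Tables.jl-compatible table works for `X`, and the column names here are invented):
+
+```julia
+X = (height = rand(10), weight = rand(10)) # a named tuple of vectors is a table
+y = rand(10)                               # the target vector
+```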
- If the `data` object consumed by `fit`, `predict`, or `transform` is not - not a suitable table¹, array³, tuple of tables and arrays, or some - other object implementing - the [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) - `getobs`/`numobs` interface, - then an implementation must: (i) overload [`obs`](@ref) to articulate how - provided data can be transformed into a form that does support - this interface, as illustrated below under - [Providing a separate data front end](@ref); or (ii) overload the trait - [`LearnAPI.data_interface`](@ref) to specify a more relaxed data - API. +!!! important + + Implementations wishing to support other data + patterns may need to take additional steps explained under + [Other data patterns](@ref di) below. The first line below imports the lightweight package LearnAPI.jl whose methods we will be extending. The second imports libraries needed for the core algorithm. + ```@example anatomy using LearnAPI using LinearAlgebra, Tables nothing # hide ``` -## Defining learners +### Defining learners Here's a new type whose instances specify the single ridge regression hyperparameter: @@ -89,7 +87,7 @@ For example, in this case, if `learner = Ridge(0.2)`, then the docstring to the *constructor*, not the struct. -## Implementing `fit` +### Implementing `fit` A ridge regressor requires two types of data for training: input features `X`, which here we suppose are tabular¹, and a [target](@ref proxy) `y`, which we suppose is a vector.⁴ @@ -112,8 +110,7 @@ Note that we also include `learner` in the struct, for it must be possible to re The implementation of `fit` looks like this: ```@example anatomy -function LearnAPI.fit(learner::Ridge, data; verbosity=LearnAPI.default_verbosity()) - +function LearnAPI.fit(learner::Ridge, data; verbosity=1) X, y = data # data preprocessing: @@ -136,7 +133,7 @@ function LearnAPI.fit(learner::Ridge, data; verbosity=LearnAPI.default_verbosity end ``` -## Implementing `predict` +### Implementing `predict` One way users will be able to call `predict` is like this: @@ -162,23 +159,7 @@ first element of the tuple returned by [`LearnAPI.kinds_of_proxy(learner)`](@ref we overload appropriately below. -## Extracting the target from training data - -The `fit` method consumes data which includes a [target variable](@ref proxy), i.e., the -learner is a supervised learner. We must therefore declare how the target variable can be extracted -from training data, by implementing [`LearnAPI.target`](@ref): - -```@example anatomy -LearnAPI.target(learner, data) = last(data) -``` - -There is a similar method, [`LearnAPI.features`](@ref) for declaring how training features -can be extracted (something that can be passed to `predict`) but this method has a -fallback which suffices here: it returns `first(data)` if `data` is a tuple, and `data` -otherwise. - - -## Accessor functions +### Accessor functions An [accessor function](@ref accessor_functions) has the output of [`fit`](@ref) as it's sole argument. Every new implementation must implement the accessor function @@ -211,7 +192,7 @@ Crucially, we can still use `LearnAPI.strip(model)` in place of `model` to make predictions. -## Learner traits +### Learner traits Learner [traits](@ref traits) record extra generic information about a learner, or make specific promises of behavior. 
They are methods that have a learner as the sole
@@ -232,7 +213,7 @@ A macro provides a shortcut, convenient when multiple traits are to be defined:
     Ridge,
     constructor = Ridge,
     kinds_of_proxy=(Point(),),
-    tags = (:regression,),
+    tags = ("regression",),
     functions = (
         :(LearnAPI.fit),
         :(LearnAPI.learner),
@@ -248,25 +229,32 @@ A macro provides a shortcut, convenient when multiple traits are to be defined:
 nothing # hide
 ```
 
-The last trait, `functions`, returns a list of all LearnAPI.jl methods that can be
-meaningfully applied to the learner or associated model. You always include the first five
-you see here: `fit`, `learner`, `clone` ,`strip`, `obs`. Here [`clone`](@ref) is a utility
-function provided by LearnAPI that you never overload; overloading [`obs`](@ref) is
-optional (see [Providing a separate data front end](@ref)) but it is always included
-because it has a fallback. See [`LearnAPI.functions`](@ref) for a checklist.
+[`LearnAPI.functions`](@ref) (discussed further below) and [`LearnAPI.constructor`](@ref)
+are the only universally compulsory traits. However, it is worthwhile studying the [list
+of all traits](@ref traits_list) to see which might apply to a new implementation, to
+enable maximum buy-in to functionality provided by third party packages, and to assist
+third party algorithms that match machine learning algorithms to user-defined tasks.
 
-[`LearnAPI.functions`](@ref) and [`LearnAPI.constructor`](@ref), are the only universally
-compulsory traits. However, it is worthwhile studying the [list of all traits](@ref
-traits_list) to see which might apply to a new implementation, to enable maximum buy into
-functionality provided by third party packages, and to assist third party algorithms that
-match machine learning algorithms to user-defined tasks.
+With [some exceptions](@ref trait_contract), the value of a trait should depend only on
+the *type* of the argument.
 
-Note that we know `Ridge` instances are supervised learners because `:(LearnAPI.target)
-in LearnAPI.functions(learner)`, for every instance `learner`. With [some
-exceptions](@ref trait_contract), the value of a trait should depend only on the *type* of
-the argument.
+### The `functions` trait
 
-## Signatures added for convenience
+The last trait, `functions`, above returns a list of all LearnAPI.jl methods that can be
+meaningfully applied to the learner or associated model, with the exception of traits. You
+always include the first five you see here: `fit`, `learner`, `clone`, `strip`,
+`obs`. Here [`clone`](@ref) is a utility function provided by LearnAPI that you never
+overload, while [`obs`](@ref) is discussed under [Providing a separate data front
+end](@ref) below and is always included because it has a meaningful fallback. The
+`features` method, here provided by a fallback, articulates how the features `X` can be
+extracted from the training data `(X, y)`. We must also include `target` here to flag our
+model as supervised; again the method itself is provided by a fallback valid in the
+present case.
+
+See [`LearnAPI.functions`](@ref) for a checklist of what the `functions` trait needs to
+return.
+
+### Signatures added for convenience
 
 We add one `fit` signature for user-convenience only. The LearnAPI.jl specification has
 nothing to say about `fit` signatures with more than two positional arguments.
@@ -295,6 +283,7 @@ nothing # hide
 
 learner = Ridge(lambda=0.5)
 @functions learner
 ```
+(Exact output may differ here because of the way documentation is generated.)
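+For reference, the output is a tuple reflecting the `functions` trait declared earlier,
+along these lines (illustrative only; the exact qualification of the names can vary):
+
+```julia
+(fit, LearnAPI.learner, clone, strip, obs, LearnAPI.features, LearnAPI.target, predict, LearnAPI.coefficients)
+```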
 
 Training and predicting:
 
@@ -326,8 +315,41 @@ recovered_model = deserialize(filename)
 @assert predict(recovered_model, X) == predict(model, X)
 ```
 
+### Testing an implementation
+
+```julia
+using LearnTestAPI
+@testapi learner (X, y) verbosity=0
+```
+
+## [Other data patterns](@id di)
+
+Here are some important remarks for implementations deviating in their
+assumptions about data from those made above.
+
+- New implementations of `fit`, `predict`, etc, always have a *single* `data` argument as
+  above. For convenience, a signature such as `fit(learner, table, formula)`, calling `fit(learner,
+  (table, formula))`, can be added, but the LearnAPI.jl specification is silent on the meaning or
+  existence of signatures with extra arguments.
+
+- If the `data` object consumed by `fit`, `predict`, or `transform` is not a suitable
+  table¹, array³, tuple of tables and arrays, or some other object implementing the
+  [MLUtils.jl](https://juliaml.github.io/MLUtils.jl/dev/) `getobs`/`numobs` interface,
+  then an implementation must: (i) overload [`obs`](@ref) to articulate how provided data
+  can be transformed into a form that does support this interface, as illustrated
+  under [Providing a separate data front end](@ref) below; or (ii) overload the trait
+  [`LearnAPI.data_interface`](@ref) to specify a more relaxed data API.
+
+- Where the form of data consumed by `fit` is different from that consumed by
+  `predict/transform` (as in classical supervised learning), it may be necessary to
+  explicitly overload the functions [`LearnAPI.features`](@ref) and (if supervised)
+  [`LearnAPI.target`](@ref). The same holds if overloading [`obs`](@ref); see below.
+
+
 ## Providing a separate data front end
 
+See [here](@ref code) for code without explanations.
+
 ```@setup anatomy2
 using LearnAPI
 using LinearAlgebra, Tables
@@ -353,7 +375,7 @@ LearnAPI.strip(model::RidgeFitted) =
     Ridge,
     constructor = Ridge,
     kinds_of_proxy=(Point(),),
-    tags = (:regression,),
+    tags = ("regression",),
     functions = (
         :(LearnAPI.fit),
         :(LearnAPI.learner),
@@ -378,13 +400,28 @@ y = 2a - b + 3c + 0.05*rand(n)
 
 An implementation may optionally implement [`obs`](@ref), to expose to the user (or some
 meta-algorithm like cross-validation) the representation of input data internal to `fit`
 or `predict`, such as the matrix version `A` of `X` in the ridge example. That is, we may
-factor out of `fit` (and also `predict`) a data pre-processing step, `obs`, to expose
+factor out of `fit` (and also `predict`) a data preprocessing step, `obs`, to expose
 its outcomes. These outcomes become alternative user inputs to `fit`/`predict`.
 
-In the default case, the alternative data representations will implement the MLUtils.jl
-`getobs/numobs` interface for observation subsampling, which is generally all a user or
-meta-algorithm will need, before passing the data on to `fit`/`predict` as you would the
-original data.
+The [`obs`](@ref) methods exist to:
+
+- Enable meta-algorithms to avoid redundant conversions of user-provided data into the form
+  ultimately used by the core training algorithms.
+
+- Through the provision of canned data front ends, enable users to provide data in a
+  variety of formats, while allowing new implementations to focus on core algorithms that
+  consume a standardized, preprocessed, representation of that data.
+
+!!! important
+
+    While many new learner implementations will want to adopt a canned data front end, such as those provided by [LearnDataFrontEnds.jl](https://juliaai.github.io/LearnAPI.jl/dev/), we
+    focus here on a self-contained implementation of `obs` for the ridge example above, to show
+    how it works.
+
+In the typical case, where [`LearnAPI.data_interface`](@ref) is not overloaded, the
+alternative data representations must implement the MLUtils.jl `getobs/numobs` interface
+for observation subsampling, which is generally all a user or meta-algorithm will need,
+before passing the data on to `fit`/`predict`, as you would the original data.
 
 So, instead of the pattern
 
 ```julia
 model = fit(learner, data)
 predict(model, newdata)
 ```
 
-one enables the following alternative (which in any case will still work, because of a
-no-op `obs` fallback provided by LearnAPI.jl):
+one enables the following alternative:
 
 ```julia
-observations = obs(learner, data) # pre-processed training data
+observations = obs(learner, data) # preprocessed training data
 
 # optional subsampling:
 observations = MLUtils.getobs(observations, train_indices)
@@ -412,9 +448,13 @@ newobservations = MLUtils.getobs(observations, test_indices)
 predict(model, newobservations)
 ```
 
-See also the demonstration [below](@ref advanced_demo).
+which works for any non-static learner implementing `predict`, no matter how one is
+supposed to access the individual observations of `data` or `newdata`. See also the
+demonstration [below](@ref advanced_demo). Furthermore, fallbacks ensure the above pattern
+still works if we choose not to implement a front end at all, which is allowed provided the
+supported `data` and `newdata` already implement `getobs`/`numobs`.
 
-Here we specifically wrap all the pre-processed data into single object, for which we
+Here we specifically wrap all the preprocessed data into a single object, for which we
 introduce a new type:
 
 ```@example anatomy2
 struct RidgeFitObs{T,M<:AbstractMatrix{T}}
     A::M                  # `p` x `n` matrix
     names::Vector{Symbol} # features
     y::Vector{T}          # target
 end
 ```
 
-Now we overload `obs` to carry out the data pre-processing previously in `fit`, like this:
+Now we overload `obs` to carry out the data preprocessing previously in `fit`, like this:
 
 ```@example anatomy2
 function LearnAPI.obs(::Ridge, data)
@@ -442,7 +482,7 @@ methods - one to handle "regular" input, and one to handle the pre-processed dat
 (observations) which appears first below:
 
 ```@example anatomy2
-function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=LearnAPI.default_verbosity())
+function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1)
 
     lambda = learner.lambda
@@ -472,7 +512,7 @@ LearnAPI.fit(learner::Ridge, data; kwargs...) =
 
 Providing `fit` signatures matching the output of [`obs`](@ref), is the first part of the
 `obs` contract. Since `obs(learner, data)` should evidently support all `data` that
 `fit(learner, data)` supports, we must be able to apply `obs(learner, _)` to it's own
-output (`observations` below). This leads to the additional "no-op" declaration
+output (`observations` below). This leads to the additional declaration
 
 ```@example anatomy2
 LearnAPI.obs(::Ridge, observations::RidgeFitObs) = observations
 ```
@@ -505,15 +545,19 @@ LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
     predict(model, Point(), obs(model, Xnew))
 ```
 
-### `target` and `features` methods
+### `features` and `target` methods
 
-In the general case, we only need to implement [`LearnAPI.target`](@ref) and
-[`LearnAPI.features`](@ref) to handle all possible output of `obs(learner, data)`, and now
-the fallback for `LearnAPI.features` mentioned before is inadequate.
+Two methods [`LearnAPI.features`](@ref) and [`LearnAPI.target`](@ref) articulate how
+features and target can be extracted from `data` consumed by LearnAPI.jl
+methods. Fallbacks provided by LearnAPI.jl sufficed in our basic implementation
+above. Here we must explicitly overload them, so that they also handle the output of
+`obs(learner, data)`:
 
 ```@example anatomy2
-LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y
 LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A
+LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y
+LearnAPI.features(learner::Ridge, data) = LearnAPI.features(learner, obs(learner, data))
+LearnAPI.target(learner::Ridge, data) = LearnAPI.target(learner, obs(learner, data))
 ```
 
 ### Important notes:
 
@@ -529,7 +573,7 @@ LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A
 
 Since LearnAPI.jl provides fallbacks for `obs` that simply return the unadulterated data
 argument, overloading `obs` is optional. This is provided data in publicized
-`fit`/`predict` signatures consists only of objects implement the
+`fit`/`predict` signatures already consists only of objects implementing the
 [`LearnAPI.RandomAccess`](@ref) interface (most tables¹, arrays³, and tuples thereof).
 
 To opt out of supporting the MLUtils.jl interface altogether, an implementation must
diff --git a/docs/src/common_implementation_patterns.md b/docs/src/common_implementation_patterns.md
index 85ebe507..0c57ff50 100644
--- a/docs/src/common_implementation_patterns.md
+++ b/docs/src/common_implementation_patterns.md
@@ -10,7 +10,7 @@ which introduces the main interface objects and terminology.
 
 Although an implementation is defined purely by the methods and traits it implements, many
 implementations fall into one (or more) of the following informally understood patterns or
-"tasks":
+tasks:
 
 - [Regression](@ref): Supervised learners for continuous targets
 
diff --git a/docs/src/examples.md b/docs/src/examples.md
new file mode 100644
index 00000000..dea9bc56
--- /dev/null
+++ b/docs/src/examples.md
@@ -0,0 +1,192 @@
+# [Code for ridge example](@id code)
+
+Below is the complete source code for the ridge implementations described in the tutorial,
+[Anatomy of an Implementation](@ref).
+
+- [Basic implementation](@ref)
+- [Implementation with data front end](@ref)
+
+
+## Basic implementation
+
+```julia
+using LearnAPI
+using LinearAlgebra, Tables
+
+struct Ridge{T<:Real}
+    lambda::T
+end
+
+"""
+    Ridge(; lambda=0.1)
+
+Instantiate a ridge regression learner, with regularization of `lambda`.
+""" +Ridge(; lambda=0.1) = Ridge(lambda) +LearnAPI.constructor(::Ridge) = Ridge + +# struct for output of `fit` +struct RidgeFitted{T,F} + learner::Ridge + coefficients::Vector{T} + named_coefficients::F +end + +function LearnAPI.fit(learner::Ridge, data; verbosity=1) + X, y = data + + # data preprocessing: + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + A = Tables.matrix(table, transpose=true) + + lambda = learner.lambda + + # apply core algorithm: + coefficients = (A*A' + learner.lambda*I)\(A*y) # vector + + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" + + return RidgeFitted(learner, coefficients, named_coefficients) +end + +LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) = + Tables.matrix(Xnew)*model.coefficients + +# accessor functions: +LearnAPI.learner(model::RidgeFitted) = model.learner +LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients +LearnAPI.strip(model::RidgeFitted) = + RidgeFitted(model.learner, model.coefficients, nothing) + +@trait( + Ridge, + constructor = Ridge, + kinds_of_proxy=(Point(),), + tags = ("regression",), + functions = ( + :(LearnAPI.fit), + :(LearnAPI.learner), + :(LearnAPI.clone), + :(LearnAPI.strip), + :(LearnAPI.obs), + :(LearnAPI.features), + :(LearnAPI.target), + :(LearnAPI.predict), + :(LearnAPI.coefficients), + ) +) + +# convenience method: +LearnAPI.fit(learner::Ridge, X, y; kwargs...) = fit(learner, (X, y); kwargs...) +``` + +# Implementation with data front end + +```julia +using LearnAPI +using LinearAlgebra, Tables + +struct Ridge{T<:Real} + lambda::T +end + +Ridge(; lambda=0.1) = Ridge(lambda) + +# struct for output of `fit`: +struct RidgeFitted{T,F} + learner::Ridge + coefficients::Vector{T} + named_coefficients::F +end + +# struct for internal representation of training data: +struct RidgeFitObs{T,M<:AbstractMatrix{T}} + A::M # `p` x `n` matrix + names::Vector{Symbol} # features + y::Vector{T} # target +end + +# implementation of `RandomAccess()` data interface for such representation: +Base.getindex(data::RidgeFitObs, I) = + RidgeFitObs(data.A[:,I], data.names, y[I]) +Base.length(data::RidgeFitObs) = length(data.y) + +# data front end for `fit`: +function LearnAPI.obs(::Ridge, data) + X, y = data + table = Tables.columntable(X) + names = Tables.columnnames(table) |> collect + return RidgeFitObs(Tables.matrix(table)', names, y) +end +LearnAPI.obs(::Ridge, observations::RidgeFitObs) = observations + +function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1) + + lambda = learner.lambda + + A = observations.A + names = observations.names + y = observations.y + + # apply core learner: + coefficients = (A*A' + learner.lambda*I)\(A*y) # 1 x p matrix + + # determine named coefficients: + named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)] + + # make some noise, if allowed: + verbosity > 0 && @info "Coefficients: $named_coefficients" + + return RidgeFitted(learner, coefficients, named_coefficients) + +end + +LearnAPI.fit(learner::Ridge, data; kwargs...) = + fit(learner, obs(learner, data); kwargs...) 
+
+# data front end for `predict`:
+LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)'
+LearnAPI.obs(::RidgeFitted, observations::AbstractArray) = observations # involutivity
+
+LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) =
+    observations'*model.coefficients
+
+LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
+    predict(model, Point(), obs(model, Xnew))
+
+# methods to deconstruct training data:
+LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A
+LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y
+LearnAPI.features(learner::Ridge, data) = LearnAPI.features(learner, obs(learner, data))
+LearnAPI.target(learner::Ridge, data) = LearnAPI.target(learner, obs(learner, data))
+
+# accessor functions:
+LearnAPI.learner(model::RidgeFitted) = model.learner
+LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients
+LearnAPI.strip(model::RidgeFitted) =
+    RidgeFitted(model.learner, model.coefficients, nothing)
+
+@trait(
+    Ridge,
+    constructor = Ridge,
+    kinds_of_proxy=(Point(),),
+    tags = ("regression",),
+    functions = (
+        :(LearnAPI.fit),
+        :(LearnAPI.learner),
+        :(LearnAPI.clone),
+        :(LearnAPI.strip),
+        :(LearnAPI.obs),
+        :(LearnAPI.features),
+        :(LearnAPI.target),
+        :(LearnAPI.predict),
+        :(LearnAPI.coefficients),
+    )
+)
+
+```
diff --git a/docs/src/features_target_weights.md b/docs/src/features_target_weights.md
new file mode 100644
index 00000000..e2878672
--- /dev/null
+++ b/docs/src/features_target_weights.md
@@ -0,0 +1,45 @@
+# [`features`, `target`, and `weights`](@id input)
+
+Methods for extracting certain parts of `data` for all supported calls of the form
+[`fit(learner, data)`](@ref).
+
+```julia
+LearnAPI.features(learner, data) ->
+LearnAPI.target(learner, data) ->
+LearnAPI.weights(learner, data) ->
+```
+
+Here `data` is something supported in a call of the form `fit(learner, data)`.
+
+# Typical workflow
+
+Not typically appearing in a general user's workflow but useful in meta-algorithms, such
+as cross-validation (see the example in [`obs` and Data Interfaces](@ref data_interface)).
+
+Supposing `learner` is a supervised classifier predicting a vector
+target:
+
+```julia
+model = fit(learner, data)
+X = LearnAPI.features(learner, data)
+y = LearnAPI.target(learner, data)
+ŷ = predict(model, Point(), X)
+training_loss = sum(ŷ .!= y)
+```
+
+# Implementation guide
+
+| method                                      | fallback return value                         | compulsory?              |
| +|:-------------------------------------------|:---------------------------------------------:|--------------------------| +| [`LearnAPI.features(learner, data)`](@ref) | `first(data)` if `data` is tuple, else `data` | if fallback insufficient | +| [`LearnAPI.target(learner, data)`](@ref) | `last(data)` | if fallback insufficient | +| [`LearnAPI.weights(learner, data)`](@ref) | `nothing` | no | + + +# Reference + +```@docs +LearnAPI.features +LearnAPI.target +LearnAPI.weights +``` diff --git a/docs/src/fit_update.md b/docs/src/fit_update.md index c649e6dd..8e27126c 100644 --- a/docs/src/fit_update.md +++ b/docs/src/fit_update.md @@ -3,12 +3,12 @@ ### Training ```julia -fit(learner, data; verbosity=LearnAPI.default_verbosity()) -> model -fit(learner; verbosity=LearnAPI.default_verbosity()) -> static_model +fit(learner, data; verbosity=1) -> model +fit(learner; verbosity=1) -> static_model ``` A "static" algorithm is one that does not generalize to new observations (e.g., some -clustering algorithms); there is no training data and the algorithm is executed by +clustering algorithms); there is no training data and heavy lifting is carried out by `predict` or `transform` which receive the data. See example below. @@ -101,18 +101,18 @@ See also [Density Estimation](@ref). Exactly one of the following must be implemented: -| method | fallback | -|:-----------------------------------------------------------------------|:---------| -| [`fit`](@ref)`(learner, data; verbosity=LearnAPI.default_verbosity())` | none | -| [`fit`](@ref)`(learner; verbosity=LearnAPI.default_verbosity())` | none | +| method | fallback | +|:--------------------------------------------|:---------| +| [`fit`](@ref)`(learner, data; verbosity=1)` | none | +| [`fit`](@ref)`(learner; verbosity=1)` | none | ### Updating | method | fallback | compulsory? | |:-------------------------------------------------------------------------------------|:---------|-------------| -| [`update`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)` | none | no | -| [`update_observations`](@ref)`(model, new_data; verbosity=..., hyperparameter_updates...)` | none | no | -| [`update_features`](@ref)`(model, new_data; verbosity=..., hyperparameter_updates...)` | none | no | +| [`update`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)` | none | no | +| [`update_observations`](@ref)`(model, new_data; verbosity=1, hyperparameter_updates...)` | none | no | +| [`update_features`](@ref)`(model, new_data; verbosity=1, hyperparameter_updates...)` | none | no | There are some contracts governing the behaviour of the update methods, as they relate to a previous `fit` call. Consult the document strings for details. @@ -124,5 +124,4 @@ fit update update_observations update_features -LearnAPI.default_verbosity ``` diff --git a/docs/src/index.md b/docs/src/index.md index 55c18898..0d10db0f 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -47,7 +47,7 @@ Suppose `forest` is some object encapsulating the hyperparameters of the [random algorithm](https://en.wikipedia.org/wiki/Random_forest) (the number of trees, etc.). Then, a LearnAPI.jl interface can be implemented, for objects with the type of `forest`, to enable the basic workflow below. In this case data is presented following the -"scikit-learn" `X, y` pattern, although LearnAPI.jl supports other patterns as well. +"scikit-learn" `X, y` pattern, although LearnAPI.jl supports other data pattern. 
 
 ```julia
 # `X` is some training features
diff --git a/docs/src/obs.md b/docs/src/obs.md
index a583f27d..70b6eb46 100644
--- a/docs/src/obs.md
+++ b/docs/src/obs.md
@@ -12,6 +12,9 @@ obs(learner, data) # can be passed to `fit` instead of `data`
 obs(model, data)   # can be passed to `predict` or `transform` instead of `data`
 ```
 
+- [Data interfaces](@ref data_interfaces)
+
+
 ## [Typical workflows](@id obs_workflows)
 
 LearnAPI.jl makes no universal assumptions about the form of `data` in a call
@@ -93,18 +96,11 @@ A sample implementation is given in [Providing a separate data front end](@ref).
 obs
 ```
 
-### [Data interfaces](@id data_interfaces)
-
-New implementations must overload [`LearnAPI.data_interface(learner)`](@ref) if the
-output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess()`](@ref). Arrays, most
-tables, and all tuples thereof, implement `RandomAccess()`.
-
-- [`LearnAPI.RandomAccess`](@ref) (default)
-- [`LearnAPI.FiniteIterable`](@ref)
-- [`LearnAPI.Iterable`](@ref)
+### [Available data interfaces](@id data_interfaces)
 
 ```@docs
+LearnAPI.DataInterface
 LearnAPI.RandomAccess
 LearnAPI.FiniteIterable
 LearnAPI.Iterable
diff --git a/docs/src/patterns/transformers.md b/docs/src/patterns/transformers.md
index f085f928..c27f9682 100644
--- a/docs/src/patterns/transformers.md
+++ b/docs/src/patterns/transformers.md
@@ -1,7 +1,5 @@
 # [Transformers](@id transformers)
 
-Check out the following examples:
+Check out the following examples from the LearnTestAPI.jl test suite:
 
-- [Truncated
-  SVD]((https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl
-  (from the TestLearnAPI.jl test suite)
+- [Truncated SVD](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl)
diff --git a/docs/src/reference.md b/docs/src/reference.md
index f068afc7..18fb92df 100644
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
@@ -12,7 +12,7 @@ The LearnAPI.jl specification is predicated on a few basic, informally defined n
 
 ### Data and observations
 
-ML/statistical algorithms are typically applied in conjunction with resampling of
+ML/statistical algorithms are frequently applied in conjunction with resampling of
 *observations*, as in
 [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)). In this
 document *data* will always refer to objects encapsulating an ordered sequence of
@@ -35,9 +35,14 @@ see [`obs`](@ref) and [`LearnAPI.data_interface`](@ref) for details.
 
 Besides the data it consumes, a machine learning algorithm's behavior is governed by a
 number of user-specified *hyperparameters*, such as the number of trees in a random
-forest. In LearnAPI.jl, one is allowed to have hyperparameters that are not data-generic.
-For example, a class weight dictionary, which will only make sense for a target taking
-values in the set of dictionary keys, can be specified as a hyperparameter.
+forest. Hyperparameters are understood in a rather broad sense. For example, one is
+allowed to have hyperparameters that are not data-generic: a class weight
+dictionary, which will only make sense for a target taking values in the set of specified
+dictionary keys, should be given as a hyperparameter. For simplicity and composability,
+LearnAPI.jl discourages "run time" parameters (extra arguments to `fit`) such as
+acceleration options (cpu/gpu/multithreading/multiprocessing). These should be included as
+hyperparameters as far as possible. An exception is the compulsory `verbosity` keyword
+argument of `fit`.
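+
+For example, a learner wanting to expose a compute-resource option would make it a field
+of the learner rather than an extra `fit` argument (a sketch only; the `MyRidge` name and
+`acceleration` field are hypothetical):
+
+```julia
+struct MyRidge
+    lambda::Float64
+    acceleration::Symbol  # e.g., :cpu or :threads; a hyperparameter, not a `fit` option
+end
+MyRidge(; lambda=0.1, acceleration=:cpu) = MyRidge(lambda, acceleration)
+LearnAPI.constructor(::MyRidge) = MyRidge
+```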
### [Targets and target proxies](@id proxy) @@ -56,16 +61,16 @@ compared with censored ground truth survival times. And so on ... #### Definitions -More generally, whenever we have a variable (e.g., a class label) that can, at least in -principle, be paired with a predicted value, or some predicted "proxy" for that variable -(such as a class probability), then we call the variable a *target* variable, and the -predicted output a *target proxy*. In this definition, it is immaterial whether or not the -target appears in training (the algorithm is supervised) or whether or not predictions -generalize to new input observations (the algorithm "learns"). +More generally, whenever we have a variable that can, at least in principle, be paired +with a predicted value, or some predicted "proxy" for that variable (such as a class +probability), then we call the variable a *target* variable, and the predicted output a +*target proxy*. In this definition, it is immaterial whether or not the target appears in +training (the algorithm is supervised) or whether or not predictions generalize to new +input observations (the algorithm "learns"). LearnAPI.jl provides singleton [target proxy types](@ref proxy_types) for prediction -dispatch. These are also used to distinguish performance metrics provided by the package -[StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/). +dispatch. These are the same types used to distinguish performance metrics provided by the +package [StatisticalMeasures.jl](https://juliaai.github.io/StatisticalMeasures.jl/dev/). ### [Learners](@id learners) @@ -97,7 +102,7 @@ generally requires overloading `Base.==` for the struct. !!! important No LearnAPI.jl method is permitted to mutate a learner. In particular, one should make - deep copies of RNG hyperparameters before using them in a new implementation of + deep copies of RNG hyperparameters before using them in an implementation of [`fit`](@ref). #### Composite learners (wrappers) @@ -109,9 +114,6 @@ properties that are not in [`LearnAPI.learners(learner)`](@ref). Instead, these learner-valued properties can have a `nothing` default, with the constructor throwing an error if the constructor call does not explicitly specify a new value. -Any object `learner` for which [`LearnAPI.functions(learner)`](@ref) is non-empty is -understood to have a valid implementation of the LearnAPI.jl interface. - #### Example Below is an example of a learner type with a valid constructor: @@ -134,6 +136,14 @@ GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) LearnAPI.constructor(::GradientRidgeRegressor) = GradientRidgeRegressor ``` +#### Testing something is a learner + +Any object `object` for which [`LearnAPI.functions(object)`](@ref) is non-empty is +understood to have a valid implementation of the LearnAPI.jl interface. You can test this +with the convenience method [`LearnAPI.is_learner(object)`](@ref) but this is never explicitly +overloaded. + + ## Documentation Attach public LearnAPI.jl-related documentation for a learner to it's *constructor*, @@ -149,9 +159,7 @@ interface.) [`LearnAPI.learner`](@ref), [`LearnAPI.constructor`](@ref) and [`LearnAPI.functions`](@ref). -Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). For a -minimal (but useless) implementation, see the implementation of `SmallLearner` -[here](https://github.com/JuliaAI/LearnAPI.jl/blob/dev/test/traits.jl). +Most learners will also implement [`predict`](@ref) and/or [`transform`](@ref). 
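+
+For orientation, here is a minimal (but useless) implementation meeting just the
+compulsory requirements (a sketch only; the `SmallLearner` name is hypothetical):
+
+```julia
+struct SmallLearner end
+
+# this `fit` learns nothing and returns the learner itself as the "model":
+LearnAPI.fit(learner::SmallLearner, data; verbosity=1) = learner
+LearnAPI.learner(model::SmallLearner) = model
+
+@trait(
+    SmallLearner,
+    constructor = SmallLearner,
+    functions = (
+        :(LearnAPI.fit),
+        :(LearnAPI.learner),
+        :(LearnAPI.clone),
+        :(LearnAPI.strip),
+        :(LearnAPI.obs),
+        :(LearnAPI.features),
+    ),
+)
+```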
### List of methods @@ -180,14 +188,14 @@ minimal (but useless) implementation, see the implementation of `SmallLearner` implement the observation access API specified by [`LearnAPI.data_interface(learner)`](@ref). -- [`LearnAPI.target`](@ref input), [`LearnAPI.weights`](@ref input), - [`LearnAPI.features`](@ref): for extracting relevant parts of training data, where +- [`LearnAPI.features`](@ref input), [`LearnAPI.target`](@ref input), + [`LearnAPI.weights`](@ref input): for extracting relevant parts of training data, where defined. - [Accessor functions](@ref accessor_functions): these include functions like `LearnAPI.feature_importances` and `LearnAPI.training_losses`, for extracting, from training outcomes, information common to many learners. This includes - [`LearnAPI.strip(model)`](@ref) for replacing a learning outcome `model` with a + [`LearnAPI.strip(model)`](@ref) for replacing a learning outcome, `model`, with a serializable version that can still `predict` or `transform`. - [Learner traits](@ref traits): methods that promise specific learner behavior or @@ -197,11 +205,14 @@ minimal (but useless) implementation, see the implementation of `SmallLearner` ## Utilities + +- [`LearnAPI.is_learner`](@ref) - [`clone`](@ref): for cloning a learner with specified hyperparameter replacements. - [`@trait`](@ref): for simultaneously declaring multiple traits - [`@functions`](@ref): for listing functions available for use with a learner ```@docs +LearnAPI.is_learner clone @trait @functions diff --git a/docs/src/target_weights_features.md b/docs/src/target_weights_features.md deleted file mode 100644 index 925bae67..00000000 --- a/docs/src/target_weights_features.md +++ /dev/null @@ -1,47 +0,0 @@ -# [`target`, `weights`, and `features`](@id input) - -Methods for extracting parts of training observations. Here "observations" means the -output of [`obs(learner, data)`](@ref); if `obs` is not overloaded for `learner`, then -"observations" is any `data` supported in calls of the form [`fit(learner, data)`](@ref) - -```julia -LearnAPI.target(learner, observations) -> -LearnAPI.weights(learner, observations) -> -LearnAPI.features(learner, observations) -> -``` - -Here `data` is something supported in a call of the form `fit(learner, data)`. - -# Typical workflow - -Not typically appearing in a general user's workflow but useful in meta-alagorithms, such -as cross-validation (see the example in [`obs` and Data Interfaces](@ref data_interface)). - -Supposing `learner` is a supervised classifier predicting a one-dimensional vector -target: - -```julia -observations = obs(learner, data) -model = fit(learner, observations) -X = LearnAPI.features(learner, data) -y = LearnAPI.target(learner, data) -ŷ = predict(model, Point(), X) -training_loss = sum(ŷ .!= y) -``` - -# Implementation guide - -| method | fallback | compulsory? 
| -|:----------------------------|:-----------------:|--------------------------| -| [`LearnAPI.target`](@ref) | returns `nothing` | no | -| [`LearnAPI.weights`](@ref) | returns `nothing` | no | -| [`LearnAPI.features`](@ref) | see docstring | if fallback insufficient | - - -# Reference - -```@docs -LearnAPI.target -LearnAPI.weights -LearnAPI.features -``` diff --git a/docs/src/traits.md b/docs/src/traits.md index a95404bf..eebb9e8e 100644 --- a/docs/src/traits.md +++ b/docs/src/traits.md @@ -28,7 +28,7 @@ In the examples column of the table below, `Continuous` is a name owned the pack | [`LearnAPI.human_name`](@ref)`(learner)` | human name for the learner; should be a noun | type name with spaces | "elastic net regressor" | | [`LearnAPI.iteration_parameter`](@ref)`(learner)` | symbolic name of an iteration parameter | `nothing` | :epochs | | [`LearnAPI.data_interface`](@ref)`(learner)` | Interface implemented by objects returned by [`obs`](@ref) | `Base.HasLength()` (supports `MLUtils.getobs/numobs`) | `Base.SizeUnknown()` (supports `iterate`) | -| [`LearnAPI.fit_observation_scitype`](@ref)`(learner)` | upper bound on `scitype(observation)` for `observation` in `data` ensuring `fit(learner, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | +| [`LearnAPI.fit_scitype`](@ref)`(learner)` | upper bound on `scitype(data)` ensuring `fit(learner, data)` works | `Union{}` | `Tuple{AbstractVector{Continuous}, Continuous}` | | [`LearnAPI.target_observation_scitype`](@ref)`(learner)` | upper bound on the scitype of each observation of the targget | `Any` | `Continuous` | | [`LearnAPI.is_static`](@ref)`(learner)` | `true` if `fit` consumes no data | `false` | `true` | @@ -78,11 +78,11 @@ requires: 1. *Finiteness:* The value of a trait is the same for all `learner`s with same value of [`LearnAPI.constructor(learner)`](@ref). This typically means trait values do not - depend on type parameters! For composite models (`LearnAPI.learners(learner)` - non-empty) this requirement is dropped. + depend on type parameters! For composite models (non-empty + `LearnAPI.learners(learner)`) this requirement is dropped. 2. *Low level deserializability:* It should be possible to evaluate the trait *value* when - `LearnAPI` is the only imported module. + `LearnAPI` and `ScientificTypesBase` are the only imported modules. Because of 1, combining a lot of functionality into one learner (e.g. 
the learner can perform both classification or regression) can mean traits are necessarily less @@ -105,7 +105,7 @@ LearnAPI.nonlearners LearnAPI.human_name LearnAPI.data_interface LearnAPI.iteration_parameter -LearnAPI.fit_observation_scitype +LearnAPI.fit_scitype LearnAPI.target_observation_scitype LearnAPI.is_static ``` diff --git a/src/LearnAPI.jl b/src/LearnAPI.jl index 9687c2e9..c32ab3b8 100644 --- a/src/LearnAPI.jl +++ b/src/LearnAPI.jl @@ -1,11 +1,10 @@ module LearnAPI include("types.jl") -include("verbosity.jl") include("tools.jl") include("predict_transform.jl") include("fit_update.jl") -include("target_weights_features.jl") +include("features_target_weights.jl") include("obs.jl") include("accessor_functions.jl") include("traits.jl") diff --git a/src/features_target_weights.jl b/src/features_target_weights.jl new file mode 100644 index 00000000..578772fa --- /dev/null +++ b/src/features_target_weights.jl @@ -0,0 +1,132 @@ +""" + LearnAPI.target(learner, data) -> target + +Return, for each form of `data` supported by the call [`fit(learner, data)`](@ref), the +target part of `data`, in a form suitable for pairing with predictions. The return value +is only meaningful if `learner` is supervised, i.e., if `:(LearnAPI.target) in +LearnAPI.functions(learner)`. + +The returned object has the same number of observations +as `data` has and is guaranteed to implement the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). + +# Extended help + +## What is a target variable? + +Examples of target variables are house prices in real estate pricing estimates, the +"spam"/"not spam" labels in an email spam filtering task, "outlier"/"inlier" labels in +outlier detection, cluster labels in clustering problems, and censored survival times in +survival analysis. For more on targets and target proxies, see the "Reference" section of +the LearnAPI.jl documentation. + +## New implementations + +A fallback returns `last(data)`. The method must be overloaded if [`fit`](@ref) consumes +data that includes a target variable and this fallback fails to fulfill the contract stated +above. + +If `obs` is being overloaded, then typically it suffices to overload +`LearnAPI.target(learner, observations)` where `observations = obs(learner, data)` and +`data` is any documented supported `data` in calls of the form [`fit(learner, +data)`](@ref), and to add a declaration of the form + +```julia +LearnAPI.target(learner, data) = LearnAPI.target(learner, obs(learner, data)) +``` +to catch all other forms of supported input `data`. + +Remember to ensure the return value of `LearnAPI.target` implements the data +interface specified by [`LearnAPI.data_interface(learner)`](@ref). + +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.target)"; overloaded=true)) + +""" +target(::Any, data) = last(data) + +""" + LearnAPI.weights(learner, data) -> weights + +Return, for each form of `data` supported by the call [`fit(learner, data)`](@ref), the +per-observation weights part of `data`. + +The returned object has the same number of observations +as `data` has and is guaranteed to implement the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). + +Where `nothing` is returned, weighting is understood to be uniform. + +# Extended help + +# New implementations + +Overloading is optional. A fallback returns `nothing`. 
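+
+For example, for `data` of the form `(X, y, w)`, with `w` a vector of per-observation
+weights, an implementation might declare (a sketch; the three-tuple convention and the
+`MyLearner` name are assumptions):
+
+```julia
+LearnAPI.weights(::MyLearner, data::Tuple) = data[3]
+```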
+ +If `obs` is being overloaded, then typically it suffices to overload +`LearnAPI.weights(learner, observations)` where `observations = obs(learner, data)` and +`data` is any documented supported `data` in calls of the form [`fit(learner, +data)`](@ref), and to add a declaration of the form + +```julia +LearnAPI.weights(learner, data) = LearnAPI.weights(learner, obs(learner, data)) +``` +to catch all other forms of supported input `data`. + +Ensure the returned object, unless `nothing`, implements the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). + +$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.weights)"; overloaded=true)) + +""" +weights(::Any, data) = nothing + +""" + LearnAPI.features(learner, data) + +Return, for each form of `data` supported by the call [`fit(learner, data)`](@ref), the +features part `X` of `data`. + +While "features" will typically have the commonly understood meaning, the only +learner-generic guaranteed properties of `X` are: + +- `X` can be passed to [`predict`](@ref) or [`transform`](@ref) when these are supported + by `learner`, as in the call `predict(model, X)`, where `model = fit(learner, data)`. + +- `X` has the same number of observations as `data` has and is guaranteed to implement + the data interface specified by [`LearnAPI.data_interface(learner)`](@ref). + +Where `nothing` is returned, `predict` and `transform` consume no data. + +# Extended help + +# New implementations + +A fallback returns `first(data)` if `data` is a tuple, and otherwise returns `data`. The +method has no meaning for static learners (where `data` is not an argument of `fit`) and +otherwise an implementation needs to overload this method if the fallback is inadequate. + +For density estimators, whose `fit` typically consumes *only* a target variable, you +should overload this method to always return `nothing`. + +If `obs` is being overloaded, then typically it suffices to overload +`LearnAPI.features(learner, observations)` where `observations = obs(learner, data)` and +`data` is any documented supported `data` in calls of the form [`fit(learner, +data)`](@ref), and to add a declaration of the form + +```julia +LearnAPI.features(learner, data) = LearnAPI.features(learner, obs(learner, data)) +``` +to catch all other forms of supported input `data`. + +Ensure the returned object, unless `nothing`, implements the data interface specified by +[`LearnAPI.data_interface(learner)`](@ref). + +`:(LearnAPI.features)` must be included in the return value of +[`LearnAPI.functions(learner)`](@ref), unless the learner is static (`fit` consumes no +data). + +""" +features(learner, data) = _first(data) +_first(data) = data +_first(data::Tuple) = first(data) +# note the factoring above guards against method ambiguities diff --git a/src/fit_update.jl b/src/fit_update.jl index 015669e7..c33e40b8 100644 --- a/src/fit_update.jl +++ b/src/fit_update.jl @@ -1,8 +1,8 @@ # # FIT """ - fit(learner, data; verbosity=LearnAPI.default_verbosity()) - fit(learner; verbosity=LearnAPI.default_verbosity()) + fit(learner, data; verbosity=1) + fit(learner; verbosity=1) Execute the machine learning or statistical algorithm with configuration `learner` using the provided training `data`, returning an object, `model`, on which other methods, such @@ -26,7 +26,7 @@ by `fit`. Inspect the value of [`LearnAPI.is_static(learner)`](@ref) to determin Use `verbosity=0` for warnings only, and `-1` for silent training. 
-See also [`LearnAPI.default_verbosity`](@ref), [`predict`](@ref), [`transform`](@ref),
+See also [`predict`](@ref), [`transform`](@ref),
 [`inverse_transform`](@ref), [`LearnAPI.functions`](@ref), [`obs`](@ref).
 
 # Extended help
 
@@ -37,15 +37,12 @@
 Implementation of exactly one of the signatures is compulsory. If `fit(learner;
 verbosity=...)` is implemented, then the trait [`LearnAPI.is_static`](@ref) must be
 overloaded to return `true`.
 
-The signature must include `verbosity` with [`LearnAPI.default_verbosity()`](@ref) as
-default.
+The signature must include `verbosity` with `1` as default.
 
 If `data` encapsulates a *target* variable, as defined in LearnAPI.jl documentation, then
-[`LearnAPI.target(data)`](@ref) must be overloaded to return it. If [`predict`](@ref) or
-[`transform`](@ref) are implemented and consume data, then
-[`LearnAPI.features(data)`](@ref) must return something that can be passed as data to
-these methods. A fallback returns `first(data)` if `data` is a tuple, and `data`
-otherwise.
+[`LearnAPI.target`](@ref) must be implemented. If [`predict`](@ref) or [`transform`](@ref)
+are implemented and consume data, then you may need to overload
+[`LearnAPI.features`](@ref).
 
 The LearnAPI.jl specification has nothing to say regarding `fit` signatures with more than
 two arguments. For convenience, for example, an implementation is free to implement a
diff --git a/src/predict_transform.jl b/src/predict_transform.jl
index d4bfe0c8..0a92d3f5 100644
--- a/src/predict_transform.jl
+++ b/src/predict_transform.jl
@@ -98,8 +98,10 @@ implementation must be added to the list returned by
 [`LearnAPI.kinds_of_proxy(learner)`](@ref). List all available kinds of proxy by doing
 `LearnAPI.kinds_of_proxy()`.
 
-If `data` is not present in the implemented signature (eg., for density estimators) then
-[`LearnAPI.features(learner, data)`](@ref) must return `nothing`.
+When `predict` is implemented, it may be necessary to overload
+[`LearnAPI.features`](@ref). If `data` is not present in the implemented signature (e.g.,
+for density estimators) then [`LearnAPI.features(learner, data)`](@ref) must always return
+`nothing`.
 
 $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.predict)"))
 
@@ -161,7 +163,12 @@ See also [`fit`](@ref), [`predict`](@ref),
 
 # New implementations
 
 Implementation for new LearnAPI.jl learners is
-optional. $(DOC_IMPLEMENTED_METHODS(":(LearnAPI.transform)"))
+optional.
+
+When `transform` is implemented, it may be necessary to overload
+[`LearnAPI.features`](@ref).
+
+$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.transform)"))
 
 $(DOC_SLURPING(:transform))
 
diff --git a/src/target_weights_features.jl b/src/target_weights_features.jl
deleted file mode 100644
index c14f467b..00000000
--- a/src/target_weights_features.jl
+++ /dev/null
@@ -1,113 +0,0 @@
-"""
-    LearnAPI.target(learner, observations) -> target
-
-Return, for every conceivable `observations` returned by a call of the form [`obs(learner,
-data)`](@ref), the target variable part of `observations`. If `nothing` is returned, the
-`learner` does not see a target variable in training (is unsupervised).
-
-The returned object `y` has the same number of observations as `observations` does and is
-guaranteed to implement the data interface specified by
-[`LearnAPI.data_interface(learner)`](@ref). It's form should be suitable for pairing with
-the output of [`predict`](@ref), for example in a loss function.
-
-# Extended help
-
-## What is a target variable?
- -Examples of target variables are house prices in real estate pricing estimates, the -"spam"/"not spam" labels in an email spam filtering task, "outlier"/"inlier" labels in -outlier detection, cluster labels in clustering problems, and censored survival times in -survival analysis. For more on targets and target proxies, see the "Reference" section of -the LearnAPI.jl documentation. - -## New implementations - -A fallback returns `nothing`. The method must be overloaded if [`fit`](@ref) consumes data -that includes a target variable. If `obs` is not being overloaded, then `observations` -above is any `data` supported in calls of the form [`fit(learner, data)`](@ref). The form -of the output `y` should be suitable for pairing with the output of [`predict`](@ref), in -the evaluation of a loss function, for example. - -Ensure the object `y` returned by `LearnAPI.target`, unless `nothing`, implements the data -interface specified by [`LearnAPI.data_interface(learner)`](@ref). - -$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.target)"; overloaded=true)) - -""" -target(::Any, observations) = nothing - -""" - LearnAPI.weights(learner, observations) -> weights - -Return, for every conceivable `observations` returned by a call of the form [`obs(learner, -data)`](@ref), the weights part of `observations`. Where `nothing` is returned, no weights -are part of `data`, which is to be interpreted as uniform weighting. - -The returned object `w` has the same number of observations as `observations` does and is -guaranteed to implement the data interface specified by -[`LearnAPI.data_interface(learner)`](@ref). - -# Extended help - -# New implementations - -Overloading is optional. A fallback returns `nothing`. If `obs` is not being overloaded, -then `observations` above is any `data` supported in calls of the form [`fit(learner, -data)`](@ref). - -Ensure the returned object, unless `nothing`, implements the data interface specified by -[`LearnAPI.data_interface(learner)`](@ref). - -$(DOC_IMPLEMENTED_METHODS(":(LearnAPI.weights)"; overloaded=true)) - -""" -weights(::Any, observations) = nothing - -""" - LearnAPI.features(learner, observations) - -Return, for every conceivable `observations` returned by a call of the form [`obs(learner, -data)`](@ref), the "features" part of `observations` (as opposed to the target variable, -for example). - -It must always be possible to pass the returned object `X` to `predict` or `transform`, -where implemented, as in the following sample workflow: - -```julia -observations = obs(learner, data) -model = fit(learner, observations) -X = LearnAPI.features(learner, observations) -ŷ = predict(model, kind_of_proxy, X) # eg, `kind_of_proxy = Point()` -``` - -For supervised models (i.e., where `:(LearnAPI.target) in LearnAPI.functions(learner)`) -`ŷ` above is generally intended to be an approximate proxy for the target variable. - -The object `X` returned by `LearnAPI.features` has the same number of observations as -`observations` does and is guaranteed to implement the data interface specified by -[`LearnAPI.data_interface(learner)`](@ref). - -# Extended help - -# New implementations - -A fallback returns `first(observations)` if `observations` is a tuple, and otherwise -returns `observations`. New implementations may need to overload this method if this -fallback is inadequate. - -For density estimators, whose `fit` typically consumes *only* a target variable, you -should overload this method to return `nothing`. 
If `obs` is not being overloaded, then -`observations` above is any `data` supported in calls of the form [`fit(learner, -data)`](@ref). - -It must otherwise be possible to pass the return value `X` to `predict` and/or -`transform`, and `X` must have same number of observations as `data`. - -Ensure the returned object, unless `nothing`, implements the data interface specified by -[`LearnAPI.data_interface(learner)`](@ref). - -""" -features(learner, observations) = _first(observations) -_first(observations) = observations -_first(observations::Tuple) = first(observations) -# note the factoring above guards against method ambiguities diff --git a/src/traits.jl b/src/traits.jl index 46004d17..0a99aaff 100644 --- a/src/traits.jl +++ b/src/traits.jl @@ -6,15 +6,15 @@ const DOC_UNKNOWN = "not overloaded the trait. " const DOC_ON_TYPE = "The value of the trait must depend only on the type of `learner`. " -const DOC_EXPLAIN_EACHOBS = - """ +# const DOC_EXPLAIN_EACHOBS = +# """ - Here, "for each `o` in `observations`" is understood in the sense of - [`LearnAPI.data_interface(learner)`](@ref). For example, if - `LearnAPI.data_interface(learner) == Base.HasLength()`, then this means "for `o` in - `MLUtils.eachobs(observations)`". +# Here, "for each `o` in `observations`" is understood in the sense of the data +# interface specified for the learner, [`LearnAPI.data_interface(learner)`](@ref). For +# example, if this is `LearnAPI.RandomAccess()`, then this means "for `o` in +# `MLUtils.eachobs(observations)`". - """ +# """ # # OVERLOADABLE TRAITS @@ -74,8 +74,8 @@ reference functions not owned by LearnAPI.jl. The understanding is that `learner` is a LearnAPI-compliant object whenever the return value is non-empty. -Do `LearnAPI.functions()` to list all possible elements of the return value owned by -LearnAPI.jl. +Do `LearnAPI.functions()` to list all possible elements of the return value representing +functions owned by LearnAPI.jl. # Extended help @@ -84,23 +84,23 @@ LearnAPI.jl. All new implementations must implement this trait. Here's a checklist for elements in the return value: -| expression | implementation compulsory? | include in returned tuple? | -|:----------------------------------|:---------------------------|:-----------------------------------| -| `:(LearnAPI.fit)` | yes | yes | -| `:(LearnAPI.learner)` | yes | yes | -| `:(LearnAPI.clone)` | never overloaded | yes | -| `:(LearnAPI.strip)` | no | yes | -| `:(LearnAPI.obs)` | no | yes | -| `:(LearnAPI.features)` | no | yes, unless `fit` consumes no data | -| `:(LearnAPI.target)` | no | only if implemented | -| `:(LearnAPI.weights)` | no | only if implemented | -| `:(LearnAPI.update)` | no | only if implemented | -| `:(LearnAPI.update_observations)` | no | only if implemented | -| `:(LearnAPI.update_features)` | no | only if implemented | -| `:(LearnAPI.predict)` | no | only if implemented | -| `:(LearnAPI.transform)` | no | only if implemented | -| `:(LearnAPI.inverse_transform)` | no | only if implemented | -| < accessor functions> | no | only if implemented | +| expression | implementation compulsory? | include in returned tuple? 
|
+|:----------------------------------|:---------------------------|:---------------------------------|
+| `:(LearnAPI.fit)`                 | yes                        | yes                               |
+| `:(LearnAPI.learner)`             | yes                        | yes                               |
+| `:(LearnAPI.clone)`               | never overloaded           | yes                               |
+| `:(LearnAPI.strip)`               | no                         | yes                               |
+| `:(LearnAPI.obs)`                 | no                         | yes                               |
+| `:(LearnAPI.features)`            | no                         | yes, unless `learner` is static   |
+| `:(LearnAPI.target)`              | no                         | only if implemented               |
+| `:(LearnAPI.weights)`             | no                         | only if implemented               |
+| `:(LearnAPI.update)`              | no                         | only if implemented               |
+| `:(LearnAPI.update_observations)` | no                         | only if implemented               |
+| `:(LearnAPI.update_features)`     | no                         | only if implemented               |
+| `:(LearnAPI.predict)`             | no                         | only if implemented               |
+| `:(LearnAPI.transform)`           | no                         | only if implemented               |
+| `:(LearnAPI.inverse_transform)`   | no                         | only if implemented               |
+| <accessor functions>              | no                         | only if implemented               |

Also include any implemented accessor functions, both those owned by LearnAPI.jl, and any
learner-specific ones. The LearnAPI.jl accessor functions are: $ACCESSOR_FUNCTIONS_LIST
@@ -136,7 +136,7 @@ argument) are excluded.
```

julia> @functions my_feature_selector
-(fit, LearnAPI.learner, strip, obs, transform)
+(fit, LearnAPI.learner, clone, strip, obs, transform)
```

@@ -364,8 +364,7 @@ in representations of input data returned by [`obs(learner, data)`](@ref) or
[`obs(model, data)`](@ref), whenever `learner == LearnAPI.learner(model)`. Here `data`
is `fit`, `predict`, or `transform`-consumable data.

-Possible return values are [`LearnAPI.RandomAccess`](@ref),
-[`LearnAPI.FiniteIterable`](@ref), and [`LearnAPI.Iterable`](@ref).
+See [`LearnAPI.DataInterface`](@ref) for possible return values.

See also [`obs`](@ref).

@@ -416,16 +415,33 @@ Implement if algorithm is iterative. Returns a symbol or `nothing`.

"""
iteration_parameter(::Any) = nothing

+# """
+#     LearnAPI.fit_observation_scitype(learner)
+
+# Return an upper bound `S` on the scitype of individual observations guaranteed to work
+# when calling `fit`: if `observations = obs(learner, data)` and
+# `ScientificTypes.scitype(collect(o)) <:S` for each `o` in `observations`, then the call
+# `fit(learner, data)` is supported.
+
+# $DOC_EXPLAIN_EACHOBS
+
+# See also [`LearnAPI.target_observation_scitype`](@ref).
+
+# # New implementations
+
+# Optional. The fallback return value is `Union{}`.
+
+# """
+# fit_observation_scitype(::Any) = Union{}
"""
-    LearnAPI.fit_observation_scitype(learner)
+    LearnAPI.fit_scitype(learner)

-Return an upper bound `S` on the scitype of individual observations guaranteed to work
-when calling `fit`: if `observations = obs(learner, data)` and
-`ScientificTypes.scitype(collect(o)) <:S` for each `o` in `observations`, then the call
-`fit(learner, data)` is supported.
+Return an upper bound `S` on the `scitype` (scientific type) of `data` for which the call
+[`fit(learner, data)`](@ref) is supported. Specifically, if `ScientificTypes.scitype(data)
+<: S` then the call is guaranteed to succeed. If not, the call may or may not succeed.

-$DOC_EXPLAIN_EACHOBS
+See ScientificTypes.jl documentation for more on the `scitype` function.

See also [`LearnAPI.target_observation_scitype`](@ref).

@@ -434,22 +450,32 @@ See also [`LearnAPI.target_observation_scitype`](@ref).

Optional. The fallback return value is `Union{}`.

"""
-fit_observation_scitype(::Any) = Union{}
+fit_scitype(::Any) = Union{}

"""
    LearnAPI.target_observation_scitype(learner)

-Return an upper bound `S` on the scitype of each observation of an applicable target
-variable. 
Specifically:
+Return an upper bound `S` on the `scitype` (scientific type) of each observation of any
+target variable associated with the learner. See LearnAPI.jl documentation for the meaning
+of "target variable". See ScientificTypes.jl documentation for an explanation of the
+`scitype` function, which it provides.
+
+Specifically, both of the following are always true:

- If `:(LearnAPI.target) in LearnAPI.functions(learner)` (i.e., `fit` consumes target
-  variables) then "target" means anything returned by `LearnAPI.target(learner, data)`,
-  where `data` is an admissible argument in the call `fit(learner, data)`.
+  variables) then `ScientificTypes.scitype(o) <: S` for each `o` in `target_observations`,
+  where `target_observations = `[`LearnAPI.target(learner, observations)`](@ref),
+  `observations = `[`LearnAPI.obs(learner, data)`](@ref), and `data` is a supported
+  argument in the call [`fit(learner, data)`](@ref). Here, "for each `o` in
+  `target_observations`" is understood in the sense of the data interface specified for
+  the learner, [`LearnAPI.data_interface(learner)`](@ref). For example, if this is
+  `LearnAPI.RandomAccess()`, then this means "for each `o` in
+  `MLUtils.eachobs(target_observations)`".

-- `S` will always be an upper bound on the scitype of (point) observations that could be
-  conceivably extracted from the output of [`predict`](@ref).
+- `S` is an upper bound on the `scitype` of (point) observations that might normally be
+  extracted from the output of [`predict`](@ref).

-To illustate the second case, suppose we have
+To illustrate the second property, suppose we have

```julia
model = fit(learner, data)
@@ -457,9 +483,9 @@ ŷ = predict(model, Sampleable(), data_new)
```

Then each individual sample generated by each "observation" of `ŷ` (a vector of sampleable
-objects, say) will be bound in scitype by `S`.
+objects, say) will be bounded in `scitype` by `S`.

-See also See also [`LearnAPI.fit_observation_scitype`](@ref).
+See also [`LearnAPI.fit_scitype`](@ref).

# New implementations

@@ -487,6 +513,16 @@ This trait should not be overloaded. Instead overload [`LearnAPI.nonlearners`](@

"""
learners(learner) = setdiff(propertynames(learner), nonlearners(learner))
+
+"""
+    LearnAPI.is_learner(object)
+
+Return `true` if `object` has a valid implementation of the LearnAPI.jl
+interface. Equivalent to non-emptiness of [`LearnAPI.functions(object)`](@ref).
+
+This trait should never be overloaded explicitly.
+
+"""
is_learner(learner) = !isempty(functions(learner))
preferred_kind_of_proxy(learner) = first(kinds_of_proxy(learner))
target(learner) = :(LearnAPI.target) in functions(learner)
diff --git a/src/types.jl b/src/types.jl
index faa6d250..3212c3f2 100644
--- a/src/types.jl
+++ b/src/types.jl
@@ -7,7 +7,7 @@ abstract type KindOfProxy end
    LearnAPI.IID <: LearnAPI.KindOfProxy

Abstract subtype of [`LearnAPI.KindOfProxy`](@ref). If `kind_of_proxy` is an instance of
-`LearnAPI.IID` then, given `data` constisting of ``n`` observations, the
+`LearnAPI.IID` then, given `data` consisting of ``n`` observations, the
following must hold:

- `ŷ = LearnAPI.predict(model, kind_of_proxy, data)` is
@@ -20,9 +20,10 @@ See also [`LearnAPI.KindOfProxy`](@ref). 
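+For example, given a hypothetical probabilistic classifier `learner` (the name and the
+data pattern `(X, y)` are illustrative only, not part of LearnAPI.jl) supporting
+`Distribution()` predictions, the call
+
+```julia
+model = fit(learner, (X, y))
+ŷ = predict(model, Distribution(), Xnew)
+```
+
+returns one probability distribution per observation of `Xnew`, so that `ŷ` has exactly
+``n`` observations whenever `Xnew` does.
+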
# Extended help

-| type                                   | form of an observation                                                                                                                                                              |
-|:--------------------------------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `Point`                                | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode                                                                                   |
+| type                         | form of an observation                                                                                                                                                              |
+|:----------------------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Point`                      | same as target observations; may have the interpretation of a 50% quantile, 50% expectile or mode                                                                                   |
+| `Interpolated`               | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls)                                                                  |
| `Sampleable`                 | object that can be sampled to obtain object of the same form as target observation                                                                                                 |
| `Distribution`               | explicit probability density/mass function whose sample space is all possible target observations                                                                                  |
| `LogDistribution`            | explicit log-probability density/mass function whose sample space is possible target observations                                                                                  |
@@ -40,9 +41,8 @@ See also [`LearnAPI.KindOfProxy`](@ref).
| `ProbabilisticFuzzy`         | as for `Fuzzy` but labeled with probabilities (not necessarily summing to one)                                                                                                      |
| `SurvivalFunction`           | survival function                                                                                                                                                                   |
| `SurvivalDistribution`       | probability distribution for survival time                                                                                                                                         |
-| `SurvivalHazardFunction`               | hazard function for survival time                                                                                                                                                   |
+| `HazardFunction`             | hazard function for survival time                                                                                                                                                   |
| `OutlierScore`               | numerical score reflecting degree of outlierness (not necessarily normalized)                                                                                                      |
-| `Continuous`                           | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls)                                                                  |

¹Provided for completeness but discouraged to avoid [ambiguities in
representation](https://github.com/alan-turing-institute/MLJ.jl/blob/dev/paper/paper.md#a-unified-approach-to-probabilistic-predictions-and-their-evaluation).

@@ -72,7 +72,7 @@ const IID_SYMBOLS = [
    :SurvivalDistribution,
    :HazardFunction,
    :OutlierScore,
-    :Continuous,
+    :Interpolated,
    :Quantile,
    :Expectile,
]

@@ -186,6 +186,24 @@ KindOfProxy

# # DATA INTERFACES

+"""
+    LearnAPI.DataInterface
+
+Abstract supertype for singleton types designating an interface for accessing observations
+within a LearnAPI.jl data object.
+
+New learner implementations must overload [`LearnAPI.data_interface(learner)`](@ref) to
+return one of the instances below if the output of [`obs`](@ref) does not implement the
+default [`LearnAPI.RandomAccess()`](@ref) interface. Arrays, most tables, and all tuples
+thereof implement `RandomAccess()`.
+
+Available instances:
+
+- [`LearnAPI.RandomAccess()`](@ref) (default)
+- [`LearnAPI.FiniteIterable()`](@ref)
+- [`LearnAPI.Iterable()`](@ref)
+
+"""
abstract type DataInterface end
abstract type Finite <: DataInterface end

diff --git a/src/verbosity.jl b/src/verbosity.jl
deleted file mode 100644
index 3723bb77..00000000
--- a/src/verbosity.jl
+++ /dev/null
@@ -1,25 +0,0 @@
-const DEFAULT_VERBOSITY = Ref(1)
-
-"""
-    LearnAPI.default_verbosity()
-    LearnAPI.default_verbosity(verbosity::Int)
-
-Respectively return, or set, the default `verbosity` level for LearnAPI.jl methods that
-support it, which includes [`fit`](@ref), [`update`](@ref),
-[`update_observations`](@ref), and [`update_features`](@ref). 
The effect in a top-level -call is generally: - - - -| `verbosity` | behaviour | -|:------------|:--------------| -| 1 | informational | -| 0 | warnings only | - - -Methods consuming `verbosity` generally call other verbosity-supporting methods -at one level lower, so increasing `verbosity` beyond `1` may be useful. - -""" -default_verbosity() = DEFAULT_VERBOSITY[] -default_verbosity(level) = (DEFAULT_VERBOSITY[] = level) diff --git a/test/target_features.jl b/test/features_target_weights.jl similarity index 80% rename from test/target_features.jl rename to test/features_target_weights.jl index b84ded25..4809f5df 100644 --- a/test/target_features.jl +++ b/test/features_target_weights.jl @@ -3,7 +3,7 @@ using LearnAPI struct Avocado end -@test isnothing(LearnAPI.target(Avocado(), "salsa")) +@test LearnAPI.target(Avocado(), (1, 2, 3)) == 3 @test isnothing(LearnAPI.weights(Avocado(), "salsa")) @test LearnAPI.features(Avocado(), "salsa") == "salsa" @test LearnAPI.features(Avocado(), (:X, :y)) == :X diff --git a/test/runtests.jl b/test/runtests.jl index 056fa491..e8117976 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -2,13 +2,12 @@ using Test test_files = [ "tools.jl", - "verbosity.jl", "traits.jl", "clone.jl", "predict_transform.jl", "obs.jl", "accessor_functions.jl", - "target_features.jl", + "features_target_weights.jl", ] files = isempty(ARGS) ? test_files : ARGS diff --git a/test/traits.jl b/test/traits.jl index 8b0353f3..0a7023dd 100644 --- a/test/traits.jl +++ b/test/traits.jl @@ -43,7 +43,7 @@ small = SmallLearner() @test LearnAPI.human_name(small) == "small learner" @test isnothing(LearnAPI.iteration_parameter(small)) @test LearnAPI.data_interface(small) == LearnAPI.RandomAccess() -@test !(6 isa LearnAPI.fit_observation_scitype(small)) +@test !(6 isa LearnAPI.fit_scitype(small)) @test 6 isa LearnAPI.target_observation_scitype(small) @test !LearnAPI.is_static(small) diff --git a/test/verbosity.jl b/test/verbosity.jl deleted file mode 100644 index 72ce29c8..00000000 --- a/test/verbosity.jl +++ /dev/null @@ -1,7 +0,0 @@ -using Test - -@test LearnAPI.default_verbosity() ==1 -LearnAPI.default_verbosity(42) -@test LearnAPI.default_verbosity() == 42 - -true
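
For reference, a minimal sketch of the resulting workflow, with `verbosity` now passed
directly to `fit` rather than set globally (here `learner` stands for any
LearnAPI.jl-compliant learner whose `fit` consumes `(X, y)` data; the names are
illustrative):

```julia
using LearnAPI

model = fit(learner, (X, y); verbosity=0)  # 0: warnings only; 1 (the default): informational
ŷ = predict(model, Point(), X)             # eg, `Point()` requests ordinary point predictions
```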