Commit c5b7218

Merge pull request #45 from JuliaAI/dev
For a 0.2.0 release

2 parents 3b31884 + 017e61e · commit c5b7218

25 files changed: +685 −403 lines

Project.toml (1 addition, 1 deletion)

```diff
@@ -1,7 +1,7 @@
 name = "LearnAPI"
 uuid = "92ad9a40-7767-427a-9ee6-6e577f1266cb"
 authors = ["Anthony D. Blaom <[email protected]>"]
-version = "0.1.0"
+version = "0.2.0"

 [compat]
 julia = "1.10"
```

docs/make.jl (1 addition, 1 deletion)

```diff
@@ -20,7 +20,7 @@ makedocs(
 "predict/transform" => "predict_transform.md",
 "Kinds of Target Proxy" => "kinds_of_target_proxy.md",
 "obs and Data Interfaces" => "obs.md",
-"target/weights/features" => "target_weights_features.md",
+"features/target/weights" => "features_target_weights.md",
 "Accessor Functions" => "accessor_functions.md",
 "Learner Traits" => "traits.md",
 ],
```

docs/src/anatomy_of_an_implementation.md (129 additions, 85 deletions)

Large diffs are not rendered by default.

docs/src/common_implementation_patterns.md (1 addition, 1 deletion)

```diff
@@ -10,7 +10,7 @@ which introduces the main interface objects and terminology.

 Although an implementation is defined purely by the methods and traits it implements, many
 implementations fall into one (or more) of the following informally understood patterns or
-"tasks":
+tasks:

 - [Regression](@ref): Supervised learners for continuous targets
```

docs/src/examples.md (new file: 192 additions)

# [Code for ridge example](@id code)

Below is the complete source code for the ridge implementations described in the tutorial,
[Anatomy of an Implementation](@ref).

- [Basic implementation](@ref)
- [Implementation with data front end](@ref)


## Basic implementation

```julia
using LearnAPI
using LinearAlgebra, Tables

struct Ridge{T<:Real}
    lambda::T
end

"""
    Ridge(; lambda=0.1)

Instantiate a ridge regression learner, with regularization of `lambda`.
"""
Ridge(; lambda=0.1) = Ridge(lambda)
LearnAPI.constructor(::Ridge) = Ridge

# struct for output of `fit`
struct RidgeFitted{T,F}
    learner::Ridge
    coefficients::Vector{T}
    named_coefficients::F
end

function LearnAPI.fit(learner::Ridge, data; verbosity=1)
    X, y = data

    # data preprocessing:
    table = Tables.columntable(X)
    names = Tables.columnnames(table) |> collect
    A = Tables.matrix(table, transpose=true)

    lambda = learner.lambda

    # apply core algorithm:
    coefficients = (A*A' + lambda*I)\(A*y) # vector

    # determine named coefficients:
    named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]

    # make some noise, if allowed:
    verbosity > 0 && @info "Coefficients: $named_coefficients"

    return RidgeFitted(learner, coefficients, named_coefficients)
end

LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
    Tables.matrix(Xnew)*model.coefficients

# accessor functions:
LearnAPI.learner(model::RidgeFitted) = model.learner
LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients
LearnAPI.strip(model::RidgeFitted) =
    RidgeFitted(model.learner, model.coefficients, nothing)

@trait(
    Ridge,
    constructor = Ridge,
    kinds_of_proxy = (Point(),),
    tags = ("regression",),
    functions = (
        :(LearnAPI.fit),
        :(LearnAPI.learner),
        :(LearnAPI.clone),
        :(LearnAPI.strip),
        :(LearnAPI.obs),
        :(LearnAPI.features),
        :(LearnAPI.target),
        :(LearnAPI.predict),
        :(LearnAPI.coefficients),
    )
)

# convenience method:
LearnAPI.fit(learner::Ridge, X, y; kwargs...) = fit(learner, (X, y); kwargs...)
```
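The basic implementation above can be exercised along the following lines (a sketch only: the column names and data values are invented for illustration, and `fit`, `predict`, and `Point` are assumed to be in scope, as in the tutorial):

```julia
# synthetic training data (invented for illustration):
X = (x1 = rand(10), x2 = rand(10))  # a column table with two features
y = rand(10)                        # a continuous target

learner = Ridge(lambda=0.5)
model = fit(learner, (X, y); verbosity=0)

# named coefficients, via the accessor function:
LearnAPI.coefficients(model)  # e.g. [:x1 => ..., :x2 => ...]

# point predictions on new data of the same form:
Xnew = (x1 = rand(3), x2 = rand(3))
ŷ = predict(model, Point(), Xnew)  # 3-element vector
```

The convenience method at the end of the block also permits `fit(learner, X, y)` in place of `fit(learner, (X, y))`.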
## Implementation with data front end

```julia
using LearnAPI
using LinearAlgebra, Tables

struct Ridge{T<:Real}
    lambda::T
end

Ridge(; lambda=0.1) = Ridge(lambda)

# struct for output of `fit`:
struct RidgeFitted{T,F}
    learner::Ridge
    coefficients::Vector{T}
    named_coefficients::F
end

# struct for internal representation of training data:
struct RidgeFitObs{T,M<:AbstractMatrix{T}}
    A::M                  # `p` x `n` matrix
    names::Vector{Symbol} # features
    y::Vector{T}          # target
end

# implementation of `RandomAccess()` data interface for such representations:
Base.getindex(data::RidgeFitObs, I) =
    RidgeFitObs(data.A[:,I], data.names, data.y[I])
Base.length(data::RidgeFitObs) = length(data.y)

# data front end for `fit`:
function LearnAPI.obs(::Ridge, data)
    X, y = data
    table = Tables.columntable(X)
    names = Tables.columnnames(table) |> collect
    return RidgeFitObs(Tables.matrix(table)', names, y)
end
LearnAPI.obs(::Ridge, observations::RidgeFitObs) = observations

function LearnAPI.fit(learner::Ridge, observations::RidgeFitObs; verbosity=1)

    lambda = learner.lambda

    A = observations.A
    names = observations.names
    y = observations.y

    # apply core learner:
    coefficients = (A*A' + lambda*I)\(A*y) # vector

    # determine named coefficients:
    named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]

    # make some noise, if allowed:
    verbosity > 0 && @info "Coefficients: $named_coefficients"

    return RidgeFitted(learner, coefficients, named_coefficients)

end

LearnAPI.fit(learner::Ridge, data; kwargs...) =
    fit(learner, obs(learner, data); kwargs...)

# data front end for `predict`:
LearnAPI.obs(::RidgeFitted, Xnew) = Tables.matrix(Xnew)'
LearnAPI.obs(::RidgeFitted, observations::AbstractArray) = observations # involutivity

LearnAPI.predict(model::RidgeFitted, ::Point, observations::AbstractMatrix) =
    observations'*model.coefficients

LearnAPI.predict(model::RidgeFitted, ::Point, Xnew) =
    predict(model, Point(), obs(model, Xnew))

# methods to deconstruct training data:
LearnAPI.features(::Ridge, observations::RidgeFitObs) = observations.A
LearnAPI.target(::Ridge, observations::RidgeFitObs) = observations.y
LearnAPI.features(learner::Ridge, data) = LearnAPI.features(learner, obs(learner, data))
LearnAPI.target(learner::Ridge, data) = LearnAPI.target(learner, obs(learner, data))

# accessor functions:
LearnAPI.learner(model::RidgeFitted) = model.learner
LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients
LearnAPI.strip(model::RidgeFitted) =
    RidgeFitted(model.learner, model.coefficients, nothing)

@trait(
    Ridge,
    constructor = Ridge,
    kinds_of_proxy = (Point(),),
    tags = ("regression",),
    functions = (
        :(LearnAPI.fit),
        :(LearnAPI.learner),
        :(LearnAPI.clone),
        :(LearnAPI.strip),
        :(LearnAPI.obs),
        :(LearnAPI.features),
        :(LearnAPI.target),
        :(LearnAPI.predict),
        :(LearnAPI.coefficients),
    )
)
```
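With the front end above, training data can be pre-processed once and then resampled cheaply, as in this sketch (data invented for illustration; `fit`, `obs`, and the `Ridge` definitions above are assumed in scope):

```julia
X = (x1 = rand(10), x2 = rand(10))
y = rand(10)
learner = Ridge(lambda=0.5)

# pre-process once:
observations = obs(learner, (X, y))  # a `RidgeFitObs`

# train on a subsample, using the `RandomAccess` `getindex` method:
model = fit(learner, observations[1:8]; verbosity=0)

# `obs` is involutive, so this is also legal:
model2 = fit(learner, obs(learner, observations); verbosity=0)
```

This is the pattern meta-algorithms such as cross-validation exploit to avoid repeating the table-to-matrix conversion on every fold.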

docs/src/features_target_weights.md (new file: 45 additions)

# [`features`, `target`, and `weights`](@id input)

Methods for extracting certain parts of `data` for all supported calls of the form
[`fit(learner, data)`](@ref).

```julia
LearnAPI.features(learner, data) -> <training "features", suitable input for `predict` or `transform`>
LearnAPI.target(learner, data) -> <target variable>
LearnAPI.weights(learner, data) -> <per-observation weights>
```

Here `data` is something supported in a call of the form `fit(learner, data)`.

# Typical workflow

These methods do not typically appear in a general user's workflow, but they are useful
in meta-algorithms, such as cross-validation (see the example in [`obs` and Data
Interfaces](@ref data_interface)).

Supposing `learner` is a supervised classifier predicting a vector
target:

```julia
model = fit(learner, data)
X = LearnAPI.features(learner, data)
y = LearnAPI.target(learner, data)
ŷ = predict(model, Point(), X)
training_loss = sum(ŷ .!= y)
```

# Implementation guide

| method                                      | fallback return value                           | compulsory?              |
|:--------------------------------------------|:-----------------------------------------------:|--------------------------|
| [`LearnAPI.features(learner, data)`](@ref)  | `first(data)` if `data` is a tuple, else `data` | if fallback insufficient |
| [`LearnAPI.target(learner, data)`](@ref)    | `last(data)`                                    | if fallback insufficient |
| [`LearnAPI.weights(learner, data)`](@ref)   | `nothing`                                       | no                       |


# Reference

```@docs
LearnAPI.features
LearnAPI.target
LearnAPI.weights
```
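A concrete sketch of the fallbacks in the implementation guide's table, assuming `learner` overloads none of these methods and with invented data:

```julia
using LearnAPI

X = (x1 = rand(5), x2 = rand(5))  # invented feature table
y = rand(5)                       # invented target

# with only the generic fallbacks in play:
LearnAPI.features(learner, (X, y))  # returns `X` (`first` of the tuple)
LearnAPI.target(learner, (X, y))    # returns `y` (`last` of the tuple)
LearnAPI.weights(learner, (X, y))   # returns `nothing`
```

Learners whose `fit` data is not a `(features, target)` tuple must overload these methods themselves.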

docs/src/fit_update.md (10 additions, 11 deletions)

````diff
@@ -3,12 +3,12 @@
 ### Training

 ```julia
-fit(learner, data; verbosity=LearnAPI.default_verbosity()) -> model
-fit(learner; verbosity=LearnAPI.default_verbosity()) -> static_model
+fit(learner, data; verbosity=1) -> model
+fit(learner; verbosity=1) -> static_model
 ```

 A "static" algorithm is one that does not generalize to new observations (e.g., some
-clustering algorithms); there is no training data and the algorithm is executed by
+clustering algorithms); there is no training data and heavy lifting is carried out by
 `predict` or `transform` which receive the data. See example below.


@@ -101,18 +101,18 @@ See also [Density Estimation](@ref).

 Exactly one of the following must be implemented:

-| method                                                                  | fallback |
-|:------------------------------------------------------------------------|:---------|
-| [`fit`](@ref)`(learner, data; verbosity=LearnAPI.default_verbosity())`  | none     |
-| [`fit`](@ref)`(learner; verbosity=LearnAPI.default_verbosity())`        | none     |
+| method                                      | fallback |
+|:--------------------------------------------|:---------|
+| [`fit`](@ref)`(learner, data; verbosity=1)` | none     |
+| [`fit`](@ref)`(learner; verbosity=1)`       | none     |

 ### Updating

 | method                                                                                     | fallback | compulsory? |
 |:-------------------------------------------------------------------------------------------|:---------|-------------|
-| [`update`](@ref)`(model, data; verbosity=..., hyperparameter_updates...)`                  | none     | no          |
-| [`update_observations`](@ref)`(model, new_data; verbosity=..., hyperparameter_updates...)` | none     | no          |
-| [`update_features`](@ref)`(model, new_data; verbosity=..., hyperparameter_updates...)`     | none     | no          |
+| [`update`](@ref)`(model, data; verbosity=1, hyperparameter_updates...)`                    | none     | no          |
+| [`update_observations`](@ref)`(model, new_data; verbosity=1, hyperparameter_updates...)`   | none     | no          |
+| [`update_features`](@ref)`(model, new_data; verbosity=1, hyperparameter_updates...)`       | none     | no          |

 There are some contracts governing the behaviour of the update methods, as they relate to
 a previous `fit` call. Consult the document strings for details.
@@ -124,5 +124,4 @@ fit
 update
 update_observations
 update_features
-LearnAPI.default_verbosity
 ```
````
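To illustrate the data-less `fit(learner)` signature this file documents, a "static" algorithm might be implemented along these lines (a hypothetical `Quantiler` transformer, invented for illustration and not part of the commit):

```julia
using LearnAPI
using Statistics

# hypothetical static transformer computing the `q`-quantile of its input:
struct Quantiler
    q::Float64
end
Quantiler(; q=0.5) = Quantiler(q)
LearnAPI.constructor(::Quantiler) = Quantiler

struct QuantilerApplied
    learner::Quantiler
end

# no training data; `fit` just wraps the learner:
LearnAPI.fit(learner::Quantiler; verbosity=1) = QuantilerApplied(learner)

# the heavy lifting happens here, where the data first appears:
LearnAPI.transform(model::QuantilerApplied, data) =
    quantile(data, model.learner.q)

# usage: model = fit(Quantiler()); transform(model, rand(100))
```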

docs/src/index.md (1 addition, 1 deletion)

````diff
@@ -47,7 +47,7 @@ Suppose `forest` is some object encapsulating the hyperparameters of the [random
 algorithm](https://en.wikipedia.org/wiki/Random_forest) (the number of trees, etc.). Then,
 a LearnAPI.jl interface can be implemented, for objects with the type of `forest`, to
 enable the basic workflow below. In this case data is presented following the
-"scikit-learn" `X, y` pattern, although LearnAPI.jl supports other patterns as well.
+"scikit-learn" `X, y` pattern, although LearnAPI.jl supports other data patterns.

 ```julia
 # `X` is some training features
````

docs/src/obs.md (5 additions, 9 deletions)

````diff
@@ -12,6 +12,9 @@ obs(learner, data) # can be passed to `fit` instead of `data`
 obs(model, data) # can be passed to `predict` or `transform` instead of `data`
 ```

+- [Data interfaces](@ref data_interfaces)
+
+
 ## [Typical workflows](@id obs_workflows)

 LearnAPI.jl makes no universal assumptions about the form of `data` in a call
@@ -93,18 +96,11 @@ A sample implementation is given in [Providing a separate data front end](@ref).
 obs
 ```

-### [Data interfaces](@id data_interfaces)
-
-New implementations must overload [`LearnAPI.data_interface(learner)`](@ref) if the
-output of [`obs`](@ref) does not implement [`LearnAPI.RandomAccess()`](@ref). Arrays, most
-tables, and all tuples thereof, implement `RandomAccess()`.
-
-- [`LearnAPI.RandomAccess`](@ref) (default)
-- [`LearnAPI.FiniteIterable`](@ref)
-- [`LearnAPI.Iterable`](@ref)
+### [Available data interfaces](@id data_interfaces)


 ```@docs
+LearnAPI.DataInterface
 LearnAPI.RandomAccess
 LearnAPI.FiniteIterable
 LearnAPI.Iterable
````
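A typical consumption pattern for `obs` output under the `RandomAccess` interface might look like this (a sketch only: `learner` and `data` are assumed defined as in earlier examples, and `MLUtils.getobs` is assumed available for generic observation subsampling):

```julia
using LearnAPI
import MLUtils

# pre-process the data once:
fit_observations = obs(learner, data)

# cheap, repeated subsampling, e.g. for cross-validation folds:
for fold in [1:8, 9:16]
    model = fit(learner, MLUtils.getobs(fit_observations, fold); verbosity=0)
    # ... evaluate `model` on the complementary observations
end
```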

docs/src/patterns/transformers.md (2 additions, 4 deletions)

```diff
@@ -1,7 +1,5 @@
 # [Transformers](@id transformers)

-Check out the following examples:
+Check out the following examples from the TestLearnAPI.jl test suite:

-- [Truncated
-  SVD]((https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl
-  (from the TestLearnAPI.jl test suite)
+- [Truncated SVD](https://github.com/JuliaAI/LearnTestAPI.jl/blob/dev/test/patterns/dimension_reduction.jl)
```
