The source code in this folder is used for mining LODA programs that generate integer sequences from the OEIS database. The high-level architecture is shown below. The mining process is coordinated in the Miner class. It uses Generator instances from the gen module to produce random LODA programs, which are evaluated and matched against the integer sequence database using instances of the Matcher class. If a generated program matches a sequence and there is no known program for it or it is considered better as the existing one, it is written to the programs folder and submitted to the central API server.
There are multiple generator and matcher implementations available. They are configured and combined in the central mining configuration file miners.json. A miner profile has the following basic properties:
- "enabled": Boolean to easily enable or disable a profile.
- "backoff": Boolean to control the back-off strategy during sequence matching.
- "overwrite": Controls whether existing programs should be overwritten. Possible values are "none", "auto" or "all".
See the gen module documentation for details on program generation and the sections below for details on sequence matching.
Matchers are used to match generated programs to integer sequences. They include logic to handle "almost-matching" programs. Multiple matchers can be activated in the miner configuration, for example:
{
"matchers": [
"direct",
"linear1",
"linear2",
"binary",
"decimal",
"delta"
]
}Matchers perform two operations:
- Reduce Sequences: extract parameters from sequences and reduce sequences to more basic sequences.
- Extend Programs: extend generated programs using the extracted sequence parameters so that they match a target sequence.
See the examples in the following sections for better understanding of the process.
This is the most simple of all matchers. It just evaluates the generated program for a number of terms and matches it to sequences starting with exactly these terms. Thus, sequences are not reduced and generated program are not extended.
Supports linear output transformations to match sequences. Sequences are reduced as follows:
- Find the smallest non-negative term in a sequences and store it as
offset. Subtract theoffsetfrom all sequence terms. - Find the largest positive common divisor of all sequence terms and store it as
factor. Divide all sequence terms by thefactor.
For example, given a sequence 36,21,42,90,51. In the first step, it is reduced to 15,0,21,69,30 with offset=21. In the second step, it is further reduced to 5,0,7,23,10 with factor=3.
This reduction and parameter extraction is performed for all known target sequences in the OEIS database. We store an index of the reduced sequences together with their extracted offsets and factors.
A generated program is first evaluated to a sequence. This sequence is then reduced in the same way as described above. The reduced sequence is looked up in the index. If there was a match, the offsets and factors of the generated program and the target sequences are used to extend the generated program at its end with a linear transformation such that it matches the target sequence.
For example, say, we randomly generated this program:
; generated program
add $0,3
mul $0,2
pow $0,2It is first evaluated to 36,64,100,144,196. This sequence is reduced to 0,7,16,27,40,55 with offset=36 and factor=4.
Now assume we are looking for a program for this sequence in the database: 1,22,49,81,121,166. It is reduced to 0,7,16,27,40,55 with offset=1 and factor=3. Notice that the reduced sequences match! Therefore, the matcher can extend the generated program as follows to match the target sequence:
; extended program
add $0,3
mul $0,2
pow $0,2
sub $0,36 ; offset of generated sequence
div $0,4 ; factor of generated sequence
mul $0,3 ; factor of target sequence
add $0,1 ; offset of target sequenceThe Linear Matcher 2 works analogously to the Linear Matcher 1, with the only difference that the sequence reduction steps are reversed: first divide by common factor, then subtract the offset.
This matcher supports target sequences that consist of decimal digits only. Sequences are reduced in these steps:
- All terms are converted to
mod 10. - Find the decimal digit that occurs most often and store it as
offset. Subtract the offset from all termsmod 10.
Generated programs are extended using add and mod operations to adjust offsets.
Works analogously to the Decimal Matcher, but for sequences with binary digits (0,1).
This matcher is used for differenc sequences and partial sums. Strictly monotonically increasing sequences are reduced to difference sequences until at least one term is zero. The number of differences operations is stored as parameter. Generated programs can be extended in two ways: the difference between consecutive terms is computed, or partial sums are computed using a loops.
Caution: this matcher and its generated programs are typically computationally expensive.
