
Commit d79268c

Merge pull request #6296 from brs96/configure-split-doc
2.2 Add LP configureSplit example
2 parents e51e259 + 3b6b3e8 commit d79268c


5 files changed

+30
-1
lines changed


doc/modules/ROOT/pages/machine-learning/linkprediction-pipelines/config.adoc

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -355,6 +355,35 @@ YIELD splitConfig
355355
We now reconfigured the splitting of the pipeline, which will be applied during xref:machine-learning/linkprediction-pipelines/training.adoc[training].
356356
--
357357

358+
359+
As an example, consider a graph with node labels `Person` and `City` and relationship types `KNOWS`, `BORN` and `LIVES`.
360+
Please note that this is the same example as in xref:machine-learning/linkprediction-pipelines/training.adoc#linkprediction-pipelines-train-example[Training the pipeline].
361+
362+
.Full example graph
363+
image::example-graphs/link-prediction.svg[Visualization of the example graph,align="center"]
364+
365+
Suppose we set both `sourceNodeLabel` and `targetNodeLabel` to `Person`, and `targetRelationshipType` to `KNOWS`.
366+
The filtered graph looks like the following:
367+
368+
.Filtered graph
369+
image::example-graphs/lp-split.png[example graph for LP split,align="center"]
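The filtering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the library's implementation: the node names, the tiny relationship list and the `filter_relationships` helper are all hypothetical and do not reproduce the graph in the figure.

```python
# Hypothetical toy data for illustration; not the graph from the figure.
node_labels = {"alice": "Person", "bob": "Person", "carol": "Person", "paris": "City"}
relationships = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("alice", "BORN", "paris"),
    ("carol", "LIVES", "paris"),
]


def filter_relationships(rels, labels, source_label, target_label, rel_type):
    """Keep relationships of the given type whose end nodes carry the given labels."""
    return [
        (source, kind, target)
        for (source, kind, target) in rels
        if kind == rel_type
        and labels[source] == source_label
        and labels[target] == target_label
    ]


filtered = filter_relationships(relationships, node_labels, "Person", "Person", "KNOWS")
print(filtered)
```

Only the `KNOWS` relationships between `Person` nodes survive the filter; the `BORN` and `LIVES` relationships are dropped because neither their type nor their target label matches.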
370+
371+
The filtered graph has 12 relationships.
372+
If we configure the split with `testFraction` 0.25 and `negativeSamplingRatio` 1, it randomly picks `12 * 0.25 = 3` positive relationships plus `1 * 3 = 3` negative relationships as the `test` set.
373+
374+
Then, if `trainFraction` is 0.6 and `negativeSamplingRatio` is 1, it randomly picks `9 * 0.6 = 5.4 ≈ 5` of the 9 remaining positive relationships plus `1 * 5 = 5` negative relationships as the `train` set.
375+
376+
The remaining `12 * (1 - 0.25) * (1 - 0.6) = 3.6 ≈ 4` relationships, shown in yellow, form the `feature-input` set.
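The arithmetic above can be checked with a short Python sketch. The `split_sizes` helper is hypothetical, and rounding to the nearest integer is an assumption chosen to match the numbers in the text; the library's actual rounding rule is not stated here.

```python
def split_sizes(total, test_fraction, train_fraction, negative_sampling_ratio):
    """Compute set sizes for a test/train/feature-input split.

    Assumption: fractional counts are rounded to the nearest integer,
    which reproduces the numbers in the text (5.4 -> 5).
    """
    test_positive = round(total * test_fraction)
    remaining = total - test_positive
    train_positive = round(remaining * train_fraction)
    # Whatever is left after test and train is the feature-input set.
    feature_input = remaining - train_positive
    return {
        "test_positive": test_positive,
        "test_negative": round(negative_sampling_ratio * test_positive),
        "train_positive": train_positive,
        "train_negative": round(negative_sampling_ratio * train_positive),
        "feature_input": feature_input,
    }


sizes = split_sizes(total=12, test_fraction=0.25, train_fraction=0.6,
                    negative_sampling_ratio=1.0)
print(sizes)
```

With 12 filtered relationships this gives 3 positive and 3 negative relationships for `test`, 5 positive and 5 negative for `train`, and 4 relationships for `feature-input`. Computing the last set as a remainder keeps the three positive counts summing to 12 regardless of rounding.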
377+
378+
.Positive and negative relationships for each set according to the split. The `test` set is in blue, the `train` set in red and the `feature-input` set in yellow. Dashed lines represent negative relationships.
379+
image::example-graphs/lp-split-1.png[example graph for LP split,align="center"]
380+
381+
Suppose, for example, that a node property step is added with `contextNodeLabel` `City` and `contextRelationshipType` `BORN`.
382+
Then the `feature-input` graph for that step would be:
383+
384+
.Feature-input graph. The `feature-input` set is in yellow.
385+
image::example-graphs/lp-split-2.png[example graph for LP split,align="center"]
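One way to read the figure is that the graph seen by the node property step is the `feature-input` relationships plus the context relationships, over the `Person` nodes plus the context nodes. The Python sketch below illustrates that reading only; it is an assumption, not the library's implementation. The toy data, the chosen `feature_input_set` and the `step_graph` helper are all hypothetical.

```python
# Hypothetical toy data for illustration; not the graph from the figures.
node_labels = {"alice": "Person", "bob": "Person", "carol": "Person", "paris": "City"}
all_relationships = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("alice", "BORN", "paris"),
    ("carol", "LIVES", "paris"),
]
# Assume the split placed only this relationship in the feature-input set.
feature_input_set = [("alice", "KNOWS", "bob")]


def step_graph(feature_input, all_rels, labels, base_labels, context_labels, context_types):
    """Build the node set and relationship list a node property step would see.

    Assumption: the step graph is the feature-input relationships plus all
    relationships of a context type between nodes with an allowed label.
    """
    allowed_labels = set(base_labels) | set(context_labels)
    nodes = {node for node, label in labels.items() if label in allowed_labels}
    context_rels = [
        (source, kind, target)
        for (source, kind, target) in all_rels
        if kind in context_types and source in nodes and target in nodes
    ]
    return nodes, list(feature_input) + context_rels


nodes, rels = step_graph(
    feature_input_set, all_relationships, node_labels,
    base_labels=["Person"], context_labels=["City"], context_types=["BORN"],
)
print(sorted(nodes))
print(rels)
```

In this toy case the `City` node and the `BORN` relationship are added as context, while the `LIVES` relationship and the `KNOWS` relationship outside the `feature-input` set stay excluded.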
386+
358387
[[linkprediction-adding-model-candidates]]
359388
== Adding model candidates
360389

doc/modules/ROOT/pages/machine-learning/linkprediction-pipelines/training.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -140,7 +140,7 @@ The `OUT_OF_BAG_ERROR` is only reported in (7) as validation metric and only if
140140

141141
include::partial$/machine-learning/pipeline-training-logging-note.adoc[]
142142

143-
143+
[[linkprediction-pipelines-train-example]]
144144
== Example
145145

146146
In this example we will create a small graph and use the training pipeline we have built up thus far.
