
Commit 1445d48

Node similarity formula
1 parent 452b051 commit 1445d48


2 files changed: +15 -3 lines changed

Lines changed: 2 additions & 0 deletions

doc/modules/ROOT/pages/algorithms/node-similarity.adoc

Lines changed: 13 additions & 3 deletions
@@ -23,7 +23,9 @@ include::partial$/algorithms/shared/algorithm-traits.adoc[]
 
 The Node Similarity algorithm compares a set of nodes based on the nodes they are connected to.
 Two nodes are considered similar if they share many of the same neighbors.
-Node Similarity computes pair-wise similarities based on either the Jaccard metric, also known as the Jaccard Similarity Score, or the Overlap coefficient, also known as the Szymkiewicz–Simpson coefficient.
+Node Similarity computes pair-wise similarities based on the Jaccard metric, also known as the Jaccard Similarity Score, the Overlap coefficient, also known as the Szymkiewicz–Simpson coefficient, and the Cosine Similarity score.
+The first two are most frequently associated with unweighted sets, whereas Cosine with weighted input.
+
 
 Given two sets `A` and `B`, the Jaccard Similarity is computed using the following formula:
 
@@ -37,6 +39,13 @@ image::nodesim-formulas/overlap_nodesim.svg[role="middle"]
 // This is the raw information for this image:
 // // O(A,B) = ∣A ∩ B∣ / min(|A|, |B|∣
 
+Formulas for the weighted case can be found in the xref:algorithms-node-similarity-examples-weighted[weighted examples below].
+
+
+The cosine similarity score is computed using the following formula, where entries are implicitly given a weight of `1` when A,B are unweighted:
+
+image::nodesim-formulas/cos.svg[role="middle"]
+
 The input of this algorithm is a bipartite, connected graph containing two disjoint node sets.
 Each relationship starts from a node in the first node set and ends at a node in the second node set.
 
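For readers of this diff, the three unweighted scores named in the hunks above can be sketched in a few lines of Python. This is only an illustration, not the library's implementation. The function names and the sample neighbor sets are hypothetical, and the cosine form assumes, as the added text says, that every entry carries an implicit weight of `1`.

```python
from math import sqrt


def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def overlap(a: set, b: set) -> float:
    """|A ∩ B| / min(|A|, |B|); returns 0.0 when either set is empty."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def cosine_unweighted(a: set, b: set) -> float:
    """Cosine with every entry weighted 1: |A ∩ B| / sqrt(|A| * |B|)."""
    denom = sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


# Hypothetical neighbor sets for two nodes in the first node set.
neighbors_a = {"n1", "n2", "n3"}
neighbors_b = {"n1", "n2"}

scores = (
    jaccard(neighbors_a, neighbors_b),
    overlap(neighbors_a, neighbors_b),
    cosine_unweighted(neighbors_a, neighbors_b),
)
```

With these inputs the two sets share two neighbors, so Jaccard is 2/3, Overlap is 2/2, and Cosine is 2/sqrt(6).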

@@ -653,8 +662,8 @@ ORDER BY Person1
 [[algorithms-node-similarity-examples-weighted]]
 === Weighted Similarity
 
-Relationship properties can be used to modify the similarity induced by certain relationships.
-Weighted node similarity has as default the weighted Jaccard similarty, according to the formula:
+Relationship properties can be used to modify the similarity induced by certain relationships by taking their value as a way of measuring importance.
+By default, Weighted node similarity uses weighted Jaccard similarity, according to the formula:
 
 image::nodesim-formulas/weighted_jaccard.svg[role="middle"]
 
@@ -664,6 +673,7 @@ It also supports weighted Overlap similarity, according to the formula:
 
 image::nodesim-formulas/weighted_overlap.svg[role="middle"]
 
+In addition, Cosine similarity can be used in the weighted case as mentioned in xref:algorithms-node-similarity-intro[introduction].
 
 [NOTE]
 ====
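
The weighted formulas in this part of the file are referenced only as images, which this diff does not show. The sketch below therefore assumes the commonly used definitions: weighted Jaccard as the sum of element-wise minima over the sum of element-wise maxima, weighted Overlap as the sum of minima over the smaller total weight, and Cosine as the dot product over the product of Euclidean norms. The function names and sample weights are hypothetical, and weights are assumed to be non-negative.

```python
from math import sqrt


def weighted_jaccard(a: dict, b: dict) -> float:
    """Assumed form: sum(min(a_i, b_i)) / sum(max(a_i, b_i)); missing entries count as 0."""
    keys = a.keys() | b.keys()
    num = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    den = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return num / den if den else 0.0


def weighted_overlap(a: dict, b: dict) -> float:
    """Assumed form: sum(min(a_i, b_i)) / min(sum(a), sum(b))."""
    shared = a.keys() & b.keys()
    num = sum(min(a[k], b[k]) for k in shared)
    den = min(sum(a.values()), sum(b.values()))
    return num / den if den else 0.0


def cosine(a: dict, b: dict) -> float:
    """Dot product of the weight vectors divided by the product of their norms."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical relationship weights from two source nodes to their neighbors.
weights_p = {"x": 1.0, "y": 2.0}
weights_q = {"y": 1.0, "z": 3.0}

weighted_scores = (
    weighted_jaccard(weights_p, weights_q),
    weighted_overlap(weights_p, weights_q),
    cosine(weights_p, weights_q),
)
```

Here only `y` is shared, so weighted Jaccard is 1/6, weighted Overlap is 1/3, and Cosine is 2/sqrt(50).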
