Commit 5b5a977

Removed AggregateLocalStep, aggregate(Scope, String), and store() in favor of using local(aggregate(String)) for lazy aggregation. Updated relevant docs and added additional feature tests.
1 parent 65a438a commit 5b5a977

57 files changed: 1119 additions and 1603 deletions.


CHANGELOG.asciidoc

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,8 @@ This release also includes changes from <<release-3-7-XXX, 3.7.XXX>>.
2828
* Added a Gremlin MCP server.
2929
* Added the Air Routes dataset to the set of available samples packaged with distributions.
3030
* Added a minimal distribution for `tinkergraph-gremlin` using the `min` classifier that doesn't include the sample datasets.
31+
* Removed `AggregateLocalStep` and `aggregate(Scope, String)`, and renamed `AggregateGlobalStep` to `AggregateStep`.
32+
* Removed `store()` in favor of using `local(aggregate())`.
3133
* Removed Vertex/ReferenceVertex from grammar. Use vertex id in traversals now instead.
3234
* Removed `has(key, traversal)` option for `has()` step.
3335
* Fixed bug where `InlineFilterStrategy` could add an empty `has()`.

docs/src/dev/provider/gremlin-semantics.asciidoc

Lines changed: 2 additions & 5 deletions
@@ -587,7 +587,7 @@ link:https://tinkerpop.apache.org/docs/x.y.z/reference/#all-step[reference]
587587
588588
*Description:* Collects all objects in the traversal into a collection.
589589
590-
*Syntax:* `aggregate(String sideEffectKey)` | `aggregate(Scope scope, String sideEffectKey)`
590+
*Syntax:* `aggregate(String sideEffectKey)`
591591
592592
[width="100%",options="header"]
593593
|=========================================================
@@ -597,8 +597,6 @@ link:https://tinkerpop.apache.org/docs/x.y.z/reference/#all-step[reference]
597597
598598
*Arguments:*
599599
600-
* `scope` - Determines the scope in which `aggregate` is applied. The `global` scope will collect all objects across the
601-
traversal. The `local` scope will collect objects within the current object (if it's a collection).
602600
* `sideEffectKey` - The name of the side-effect key that will hold the aggregated objects.
603601
604602
*Modulation:*
@@ -614,8 +612,7 @@ The aggregated objects can be accessed later using the `cap()` step.
614612
615613
None
616614
617-
See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/sideEffect/AggregateGlobalStep.java[source],
618-
link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/sideEffect/AggregateLocalStep.java[source (local)],
615+
See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/sideEffect/AggregateStep.java[source],
619616
link:https://tinkerpop.apache.org/docs/x.y.z/reference/#aggregate-step[reference]
620617
621618
[[and-step]]

docs/src/recipes/appendix.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ The following example assumes that the edges point in the `OUT` direction. Assum
142142
----
143143
g.V().where(without("x")).as("a").
144144
outE().as("e").inV().as("b").
145-
filter(bothE().where(neq("e")).otherV().where(eq("a"))).aggregate(local, "x").
145+
filter(bothE().where(neq("e")).otherV().where(eq("a"))).local(aggregate("x")).
146146
select("a","b").dedup()
147147
----
148148

docs/src/recipes/centrality.asciidoc

Lines changed: 2 additions & 2 deletions
@@ -85,7 +85,7 @@ g.V().as("v").
8585
select("triples").unfold().as("t").
8686
select("x","y").where(eq("a")).
8787
select("t"),
88-
aggregate(local,"triples")). <5>
88+
local(aggregate("triples"))). <5>
8989
select("z").as("length").
9090
select("triple").select("z").where(eq("length"))). <6>
9191
select(all, "v").unfold(). <7>
@@ -124,7 +124,7 @@ g.withSack(1f).V().as("v").
124124
select("triples").unfold().as("t").
125125
select("x","y").where(eq("a")).
126126
select("t"),
127-
aggregate(local,"triples")). <5>
127+
local(aggregate("triples"))). <5>
128128
select("z").as("length").
129129
select("triple").select("z").where(eq("length"))). <6>
130130
group().by(select(first, "v")). <7>

docs/src/recipes/collections.asciidoc

Lines changed: 16 additions & 16 deletions
@@ -39,7 +39,7 @@ appear by way of some side-effect steps like `aggregate()`:
3939
[gremlin-groovy,modern]
4040
----
4141
g.V().fold()
42-
g.V().aggregate(local, 'a').cap('a')
42+
g.V().local(aggregate('a')).cap('a')
4343
----
4444
4545
It is worth noting that while a `Path` is not technically a `List` it does present like one and can be manipulated in
@@ -61,7 +61,7 @@ It may seem simple, but the most obvious choice to modifying what is in a list i
6161
[gremlin-groovy,modern]
6262
----
6363
g.V().fold().unfold().values('name')
64-
g.V().aggregate(local,'a').cap('a').unfold().values('name')
64+
g.V().local(aggregate('a')).cap('a').unfold().values('name')
6565
----
6666
6767
The above examples show that `unfold()` works quite well when you don't want to preserve the `List` structure of the
@@ -164,19 +164,19 @@ the use of `aggregate()` to aid in construction of this `List`:
164164
[gremlin-groovy,modern]
165165
----
166166
g.V().has('name','marko').as('v'). <1>
167-
aggregate(local,'a'). <2>
168-
by('age').
167+
local(aggregate('a'). <2>
168+
by('age')).
169169
repeat(outE().as('e').inV().as('v')). <3>
170170
until(has('lang','java')).
171171
aggregate('b'). <4>
172172
by(select(all,'v').unfold().values('name').fold()).
173173
aggregate('c'). <5>
174174
by(select(all,'e').unfold().values('weight').mean()).
175175
fold(). <6>
176-
aggregate(local,'a'). <7>
177-
by(cap('b')).
178-
aggregate(local,'a'). <8>
179-
by(cap('c')).
176+
local(aggregate('a'). <7>
177+
by(cap('b'))).
178+
local(aggregate('a'). <8>
179+
by(cap('c'))).
180180
cap('a')
181181
----
182182
@@ -190,9 +190,9 @@ of "java"). Note however that the `by()` modulator overrides that traverser comp
190190
the list of vertices in "v". Those vertices are unfolded to retrieve the name property from each and then are reduced
191191
with `fold()` back into a list to be stored in the side-effected named "b".
192192
<5> A similar use of `aggregate()` as the previous step, though this one turns "e" into a stream of edges to calculate
193-
the `mean()` to store in a `List` called "c". Note that `aggregate()` (short form for `aggregate(global)`) was used
194-
here instead of `aggregate(local)`, as the former is an eager collection of the elements in the stream
195-
(`aggregate(local)` is lazy) and will force the traversal to be iterated up to that point before moving forward.
193+
the `mean()` to store in a `List` called "c". Note that `aggregate()` was used here instead of
194+
`local(aggregate())`, as the former is an eager collection of the elements in the stream
195+
(`local(aggregate())` is lazy) and will force the traversal to be iterated up to that point before moving forward.
196196
Without that eager collection, "v" and "e" would not contain the complete information required for the production of
197197
"b" and "c".
198198
<6> Adding `fold()`-step here is a bit of a trick. To see the trick, copy and paste all lines of Gremlin up to but
@@ -203,11 +203,11 @@ when traversing away from "marko"). The `aggregate()`-steps are side-effects and
203203
through them unchanged. The `fold()` obviously converts those three traversers to a single `List` to make one
204204
traverser with a `List` inside. That means that the remaining steps following the `fold()` will only be executed one
205205
time each instead of three, which, as will be shown, is critical to the proper result.
206-
<7> The single traverser with the `List` of three vertices in it passes to `aggregate(local)`. The `by()` modulator
206+
<7> The single traverser with the `List` of three vertices in it passes to `local(aggregate())`. The `by()` modulator
207207
presents an override telling Gremlin to ignore the `List` of three vertices and simply grab the "b" side effect created
208208
earlier and stick that into "a" as part of the result. The `List` with three vertices passes out unchanged as
209-
`aggregate(local)` is a side-effect step.
210-
<8> Again, the single traverser with the `List` of three vertices passes to `aggregate(local)` and again, the `by()`
209+
`local(aggregate())` is a side-effect step.
210+
<8> Again, the single traverser with the `List` of three vertices passes to `local(aggregate())` and again, the `by()`
211211
modulator presents an override to include "c" into the result.
212212
213213
All of the above code and explanation show that `aggregate()` can be used to construct `List` objects as side-effects
@@ -236,11 +236,11 @@ g.V().
236236
bothE().count()).
237237
fold())
238238
g.V().
239-
aggregate(local, 'x').
239+
local(aggregate('x').
240240
by(union(select('x').count(local), <2>
241241
identity(),
242242
bothE().count()).
243-
fold()).
243+
fold())).
244244
cap('x')
245245
----
246246

docs/src/recipes/connected-components.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ A straightforward way to detect the various subgraphs with an OLTP traversal is
7777
[gremlin-groovy,existing]
7878
----
7979
g.V().emit(cyclicPath().or().not(both())). <1>
80-
repeat(__.where(without('a')).aggregate(local,'a').both()).until(cyclicPath()). <2>
80+
repeat(__.where(without('a')).local(aggregate('a')).both()).until(cyclicPath()). <2>
8181
group().by(path().unfold().limit(1)). <3>
8282
by(path().unfold().dedup().fold()). <4>
8383
select(values).unfold() <5>

docs/src/recipes/shortest-path.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -186,7 +186,7 @@ g.withSideEffect('v', []). <1>
186186
inject(result.toArray()).as('kv').select(values).
187187
unfold().
188188
map(unfold().as('v_or_e').
189-
coalesce(V().where(eq('v_or_e')).aggregate(local,'v'),
189+
coalesce(V().where(eq('v_or_e')).local(aggregate('v')),
190190
select('v').tail(local, 1).unfold().bothE().where(eq('v_or_e'))).
191191
values('name','weight').
192192
fold()).

docs/src/reference/the-traversal.asciidoc

Lines changed: 13 additions & 25 deletions
@@ -454,8 +454,8 @@ image:side-effect-lambda.png[width=175,float=right]
454454
[gremlin-groovy,modern]
455455
----
456456
g.V().hasLabel('person').sideEffect(System.out.&println) <1>
457-
g.V().sideEffect(outE().count().aggregate(local,"o")).
458-
sideEffect(inE().count().aggregate(local,"i")).cap("o","i") <2>
457+
g.V().sideEffect(outE().count().local(aggregate("o"))).
458+
sideEffect(inE().count().local(aggregate("i"))).cap("o","i") <2>
459459
----
460460
461461
<1> Whatever enters `sideEffect()` is passed to the next step, but some intervening process can occur.
@@ -596,26 +596,22 @@ link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gre
596596
image::aggregate-step.png[width=800]
597597
598598
The `aggregate()`-step (*sideEffect*) is used to aggregate all the objects at a particular point of traversal into a
599-
`Collection`. The step is uses `Scope` to help determine the aggregating behavior. For `global` scope this means that
600-
the step will use link:http://en.wikipedia.org/wiki/Eager_evaluation[eager evaluation] in that no objects continue on
601-
until all previous objects have been fully aggregated. The eager evaluation model is crucial in situations
602-
where everything at a particular point is required for future computation. By default, when the overload of
603-
`aggregate()` is called without a `Scope`, the default is `global`. An example is provided below.
599+
`Collection`. By default, the step will use link:http://en.wikipedia.org/wiki/Eager_evaluation[eager evaluation] in that
600+
no objects continue on until all previous objects have been fully aggregated. The eager evaluation model is crucial in situations
601+
where everything at a particular point is required for future computation.
604602
605603
[gremlin-groovy,modern]
606604
----
607605
g.V(1).out('created') <1>
608606
g.V(1).out('created').aggregate('x') <2>
609-
g.V(1).out('created').aggregate(global, 'x') <3>
610-
g.V(1).out('created').aggregate('x').in('created') <4>
611-
g.V(1).out('created').aggregate('x').in('created').out('created') <5>
607+
g.V(1).out('created').aggregate('x').in('created') <3>
608+
g.V(1).out('created').aggregate('x').in('created').out('created') <4>
612609
g.V(1).out('created').aggregate('x').in('created').out('created').
613-
where(without('x')).values('name') <6>
610+
where(without('x')).values('name') <5>
614611
----
615612
616613
<1> What has marko created?
617614
<2> Aggregate all his creations.
618-
<3> Identical to the previous line.
619615
<3> Who are marko's collaborators?
620616
<4> What have marko's collaborators created?
621617
<5> What have marko's collaborators created that he hasn't created?
@@ -635,31 +631,23 @@ g.V().out('knows').aggregate('x').by('age').cap('x') <1>
635631
636632
<1> The "age" property is not <<by-step,productive>> for all vertices and therefore those values are not included in the aggregation.
637633
638-
For `local` scope the aggregation will occur in a link:http://en.wikipedia.org/wiki/Lazy_evaluation[lazy] fashion.
639-
640-
NOTE: Prior to 3.4.3, `local` aggregation (i.e. lazy) evaluation was handled by `store()`-step.
634+
Aggregation can be controlled to occur in a link:http://en.wikipedia.org/wiki/Lazy_evaluation[lazy] fashion by using
635+
the step inside `local()`.
641636
642637
[gremlin-groovy,modern]
643638
----
644-
g.V().aggregate(global, 'x').limit(1).cap('x')
645-
g.V().aggregate(local, 'x').limit(1).cap('x')
646-
g.withoutStrategies(EarlyLimitStrategy).V().aggregate(local,'x').limit(1).cap('x')
639+
g.V().aggregate('x').limit(1).cap('x')
640+
g.V().local(aggregate('x')).limit(1).cap('x')
647641
----
648642
649-
It is important to note that `EarlyLimitStrategy` introduced in 3.3.5 alters the behavior of `aggregate(local)`.
650-
Without that strategy (which is installed by default), there are two results in the `aggregate()` side-effect even
651-
though the interval selection is for 1 object. Realize that when the second object is on its way to the `range()`
652-
filter (i.e. `[0..1]`), it passes through `aggregate()` and thus, stored before filtered.
653-
654643
[gremlin-groovy,modern]
655644
----
656-
g.E().aggregate(local,'x').by('weight').cap('x')
645+
g.E().local(aggregate('x').by('weight')).cap('x')
657646
----
658647
659648
*Additional References*
660649
661650
link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#aggregate(java.lang.String)++[`aggregate(String)`],
662-
link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#aggregate(org.apache.tinkerpop.gremlin.process.traversal.Scope,java.lang.String)++[`aggregate(Scope,String)`]
663651
664652
[[all-step]]
665653
=== All Step
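
Reviewer note on the `aggregate()` documentation change above: the eager versus lazy distinction can be modelled without TinkerPop at all. The following is a minimal plain-Java sketch, not the `AggregateStep` implementation. The class `AggregateSketch` and its methods `eagerAggregate`, `lazyAggregate`, and `limit` are hypothetical names introduced only for illustration, and the model ignores traversal strategies and bulking.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical model of eager vs. lazy side-effect aggregation.
// None of these names come from TinkerPop.
public class AggregateSketch {

    /** Eager: drain the whole upstream into the side effect before emitting anything. */
    public static <T> Iterator<T> eagerAggregate(Iterator<T> upstream, List<T> sideEffect) {
        List<T> buffered = new ArrayList<>();
        while (upstream.hasNext()) {
            T next = upstream.next();
            sideEffect.add(next);
            buffered.add(next);
        }
        return buffered.iterator();
    }

    /** Lazy: record each object only at the moment downstream pulls it. */
    public static <T> Iterator<T> lazyAggregate(Iterator<T> upstream, List<T> sideEffect) {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return upstream.hasNext();
            }

            @Override
            public T next() {
                T next = upstream.next();
                sideEffect.add(next);
                return next;
            }
        };
    }

    /** Stand-in for limit(n): pulls at most n objects and then stops. */
    public static <T> List<T> limit(Iterator<T> stream, int n) {
        List<T> out = new ArrayList<>();
        while (out.size() < n && stream.hasNext()) {
            out.add(stream.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> vertices = List.of("v1", "v2", "v3", "v4", "v5", "v6");

        List<String> eagerX = new ArrayList<>();
        limit(eagerAggregate(vertices.iterator(), eagerX), 1);
        System.out.println("eager x = " + eagerX); // all six entries

        List<String> lazyX = new ArrayList<>();
        limit(lazyAggregate(vertices.iterator(), lazyX), 1);
        System.out.println("lazy x = " + lazyX);   // only the first entry
    }
}
```

In this model the eager variant fills the side effect with every upstream object even though only one is consumed downstream, which mirrors the intent of `g.V().aggregate('x').limit(1).cap('x')` versus `g.V().local(aggregate('x')).limit(1).cap('x')` in the documentation diff.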

docs/src/upgrade/release-3.8.x.asciidoc

Lines changed: 83 additions & 0 deletions
@@ -293,6 +293,89 @@ The properties file in the above example can either point to a remote configurat
293293
294294
See: link:https://issues.apache.org/jira/browse/TINKERPOP-3017[TINKERPOP-3017]
295295
296+
==== `aggregate()` with `Scope` Removed
297+
298+
The `Scope` parameter is being removed from `aggregate()` to fix inconsistency between two different use cases: flow
299+
control vs. per-element application. This change aligns all side effect steps (none of the others have scope arguments)
300+
and reserves the `Scope` parameter exclusively for "traverser-local" application patterns, eliminating confusion about
301+
its contextual meanings.
302+
303+
This makes the `AggregateStep` globally scoped by default with eager aggregation. Lazy evaluation with `aggregate()` is
304+
achieved by wrapping the step in `local()`.
305+
306+
[source,text]
307+
----
308+
// 3.7.x - scope is still supported
309+
gremlin> g.V().aggregate(local, "x").by("age").select("x")
310+
==>[29]
311+
==>[29,27]
312+
==>[29,27]
313+
==>[29,27,32]
314+
==>[29,27,32]
315+
==>[29,27,32,35]
316+
317+
// 3.8.0 - must use aggregate() within local() to achieve lazy aggregation
318+
gremlin> g.V().local(aggregate("x").by("age")).select("x")
319+
==>[29]
320+
==>[29,27]
321+
==>[29,27]
322+
==>[29,27,32]
323+
==>[29,27,32]
324+
==>[29,27,32,35]
325+
----
326+
327+
An important behavioral difference exists between the removed `aggregate(local)` and its replacement `local(aggregate())`.
328+
In 3.8.0, `local()` changed from traverser-based to object-based processing, always debulking incoming traversers into
329+
individual objects. This causes `local(aggregate())` to behave differently from the original `aggregate(local)`, which
330+
preserved bulked traversers under the old `local()` semantics. The bulk-preserving behavior has no direct replacement, making this a
331+
_breaking change_.
332+
333+
[source,text]
334+
----
335+
// 3.7.x - both local() and local scope will preserve bulked traversers
336+
gremlin> g.V().out().barrier().aggregate(local, "x").select("x")
337+
==>[v[3],v[3],v[3]]
338+
==>[v[3],v[3],v[3]]
339+
==>[v[3],v[3],v[3]]
340+
==>[v[3],v[3],v[3],v[2]]
341+
==>[v[3],v[3],v[3],v[2],v[4]]
342+
==>[v[3],v[3],v[3],v[2],v[4],v[5]]
343+
gremlin> g.V().out().barrier().local(aggregate("x")).select("x")
344+
==>[v[3],v[3],v[3]]
345+
==>[v[3],v[3],v[3]]
346+
==>[v[3],v[3],v[3]]
347+
==>[v[3],v[3],v[3],v[2]]
348+
==>[v[3],v[3],v[3],v[2],v[4]]
349+
==>[v[3],v[3],v[3],v[2],v[4],v[5]]
350+
351+
// 3.8.0 - bulked traversers are now split to be processed per-object, this affects local aggregation
352+
gremlin> g.V().out().barrier().local(aggregate("x")).select("x")
353+
==>[v[3]]
354+
==>[v[3],v[3]]
355+
==>[v[3],v[3],v[3]]
356+
==>[v[3],v[3],v[3],v[2]]
357+
==>[v[3],v[3],v[3],v[2],v[4]]
358+
==>[v[3],v[3],v[3],v[2],v[4],v[5]]
359+
----
360+
361+
See: link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-scoping-5.asciidoc[Lazy vs. Eager Evaluation]
362+
363+
==== Removal of `store()` Step
364+
365+
The `store()` step was a legacy name for `aggregate(local)` that has been deprecated since 3.4.3, and is now removed along
366+
with `aggregate(local)`. To achieve lazy aggregation, use `aggregate()` within `local()`.
367+
368+
[source,text]
369+
----
370+
// 3.7.x - store() is still allowed
371+
gremlin> g.V().store("x").by("age").cap("x")
372+
==>[29,27,32,35]
373+
374+
// 3.8.0 - store() removed, use local(aggregate()) to achieve lazy aggregation
375+
gremlin> g.V().local(aggregate("x").by("age")).cap("x")
376+
==>[29,27,32,35]
377+
----
378+
296379
==== split() on Empty String
297380
298381
The `split()` step will now split a string into a list of its characters if the given separator is an empty string.
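
Reviewer note on the bulked-traverser difference described in the upgrade notes above: the two console outputs can be reproduced with a small plain-Java model. This is a sketch under stated assumptions, not TinkerPop code. `BulkSketch`, `Traverser`, `bulkPreserving`, and `debulking` are hypothetical names, and each returned snapshot stands in for one `select("x")` result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of how bulk handling changes lazy aggregation output.
// None of these names come from TinkerPop.
public class BulkSketch {

    /** Minimal stand-in for a bulked traverser: one object plus a count. */
    public static final class Traverser {
        final String object;
        final long bulk;

        public Traverser(String object, long bulk) {
            this.object = object;
            this.bulk = bulk;
        }
    }

    /** 3.7.x-style model: the whole bulk is added at once, then emitted bulk times. */
    public static List<List<String>> bulkPreserving(List<Traverser> incoming) {
        List<String> x = new ArrayList<>();
        List<List<String>> snapshots = new ArrayList<>();
        for (Traverser t : incoming) {
            for (long i = 0; i < t.bulk; i++) {
                x.add(t.object);
            }
            for (long i = 0; i < t.bulk; i++) {
                snapshots.add(List.copyOf(x));
            }
        }
        return snapshots;
    }

    /** 3.8.0-style model: each bulked traverser is split and handled per object. */
    public static List<List<String>> debulking(List<Traverser> incoming) {
        List<String> x = new ArrayList<>();
        List<List<String>> snapshots = new ArrayList<>();
        for (Traverser t : incoming) {
            for (long i = 0; i < t.bulk; i++) {
                x.add(t.object);
                snapshots.add(List.copyOf(x));
            }
        }
        return snapshots;
    }

    public static void main(String[] args) {
        // Same shape as the example above: v[3] arrives with bulk 3, the rest with bulk 1.
        List<Traverser> incoming = List.of(
                new Traverser("v[3]", 3),
                new Traverser("v[2]", 1),
                new Traverser("v[4]", 1),
                new Traverser("v[5]", 1));

        bulkPreserving(incoming).forEach(s -> System.out.println("bulk-preserving: " + s));
        debulking(incoming).forEach(s -> System.out.println("debulking:       " + s));
    }
}
```

Both models end with the same six-element side effect. Only the first three snapshots differ: the bulk-preserving model shows three copies of `v[3]` each time, while the debulking model grows by one copy per result, matching the 3.7.x and 3.8.0 outputs shown in the upgrade notes.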

gremlin-console/src/main/groovy/org/apache/tinkerpop/gremlin/console/jsr223/GephiTraversalVisualizationStrategy.groovy

Lines changed: 3 additions & 3 deletions
@@ -28,7 +28,7 @@ import org.apache.tinkerpop.gremlin.process.traversal.step.map.EdgeOtherVertexSt
2828
import org.apache.tinkerpop.gremlin.process.traversal.step.map.EdgeVertexStep
2929
import org.apache.tinkerpop.gremlin.process.traversal.step.map.GraphStep
3030
import org.apache.tinkerpop.gremlin.process.traversal.step.map.VertexStep
31-
import org.apache.tinkerpop.gremlin.process.traversal.step.sideEffect.AggregateGlobalStep
31+
import org.apache.tinkerpop.gremlin.process.traversal.step.sideEffect.AggregateStep
3232
import org.apache.tinkerpop.gremlin.process.traversal.step.sideEffect.LambdaSideEffectStep
3333
import org.apache.tinkerpop.gremlin.process.traversal.step.util.BulkSet
3434
import org.apache.tinkerpop.gremlin.process.traversal.strategy.AbstractTraversalStrategy
@@ -91,7 +91,7 @@ class GephiTraversalVisualizationStrategy extends AbstractTraversalStrategy<Trav
9191
Thread.sleep(acceptor.vizStepDelay)
9292
}
9393
}), s, traversal)
94-
TraversalHelper.insertAfterStep(new AggregateGlobalStep(traversal, sideEffectKey), s, traversal)
94+
TraversalHelper.insertAfterStep(new AggregateStep(traversal, sideEffectKey), s, traversal)
9595
}
9696

9797
// decay all vertices except those that made it through the filter - "this way you can watch
@@ -109,7 +109,7 @@ class GephiTraversalVisualizationStrategy extends AbstractTraversalStrategy<Trav
109109
Thread.sleep(acceptor.vizStepDelay)
110110
}
111111
}), s, traversal)
112-
TraversalHelper.insertAfterStep(new AggregateGlobalStep(traversal, sideEffectKey), s, traversal)
112+
TraversalHelper.insertAfterStep(new AggregateStep(traversal, sideEffectKey), s, traversal)
113113
}
114114
}
115115
}
