Skip to content

Conversation

xiazcy
Copy link
Contributor

@xiazcy xiazcy commented Sep 18, 2025

https://issues.apache.org/jira/browse/TINKERPOP-2234

Implementation based on the proposal in review.

Added P.typeOf() predicate, which recognized a new set of GType enum token, String as registered in the GlobalTypeCache, as well as Class in Java/Groovy usage.

Main implementation is ready to be reviewed. Listing here a couple of remaining tasks, and keeping this as draft until proposal is merged.

  • Proposal merged
  • Replaced N with GType
  • Additional tests added
  • Documentations added

@xiazcy xiazcy changed the base branch from master to 3.8-dev September 19, 2025 16:03
* {@code Type} is a {@code BiPredicate} that determines whether the first argument is a type of the second argument.
*
*/
public enum Type implements PBiPredicate<Object, Object> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my comment still stands from the proposal. I still think Type should be better constrained by that <Object, Object> and that <Object, Class<?>> is more agreeable. Doing so would also get rid of the catch-all else { return false } and that check for GType.NULL:

    typeOf {
        @Override
        public boolean test(final Object first, final Class<?> second) {
            if (first == null) return false;
            return secondClass.isAssignableFrom(first.getClass());
        }

    };

This way, it can never be called other than how it is intended. In P you already have the types constrained to multiple typeOf overloads. Why expand them back to the Type enum when you can just coerce them each to Class<?> by moving that logic I removed above back to those specific P overloads?

Also, you'd catch GlobalTypeCache failures at traversal construction time with that logic in P rather than at runtime which seems more advantageous to me.

Do we lose something by getting more strict with the types for the Type enum?

As a separate point, are we making a mistake using Type for the name of this enum? We have Type and GType and at some point in the future we will have a schema. "Type" is such a great word to save for when we really need it. We currently have Contains, Compare , and the unfortunately named Text. The first two are verbs which actively define what P is doing - containing or comparing. Maybe there's a verb form that would be better than Type? Maybe we call the enum InstanceOf rather than Type- that's pretty normal and understood?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually yes, that makes sense. I'll make the change.

Maybe we call the enum InstanceOf rather than Type- that's pretty normal and understood?

I agree. I was a bit unsure of the name as well but couldn't think of a better one. InstanceOf sounds like a good alternative. Or ClassOf? IsA?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i like InstanceOf better than ClassOf. IsA is interesting though given how generic it is but I can't think of how else we would use that offhand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or if we want something more action CompareType?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could sort of extend from Compare with specific other implementations. Either are better than taking Type so maybe see if anyone else favors CompareType over InstanceOf. I think it's worth keeping in mind that most users won't even see these predicates as they are just enums within P, so they are a layer lower than the primary API for Gremlin at least which take some pressure away from getting the naming perfect.

@xiazcy xiazcy marked this pull request as ready for review September 24, 2025 20:20
The provided predicates are outlined in the table below and are used in various steps such as <<has-step,`has()`>>-step,
<<where-step,`where()`>>-step, <<is-step,`is()`>>-step, etc. Two new additional `TextP` predicate members were added in the
TinkerPop 3.6.0 release that allow working with regular expressions. These are `TextP.regex` and `TextP.notRegex`
TinkerPop 3.6.0 release that allow working with regular expressions. These are `TextP.regex` and `TextP.notRegex`. Starting with TinkerPop 3.8.0, a new type predicate `P.typeOf` is added to support filtering traversers based on types.
Copy link
Contributor

@spmallette spmallette Sep 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please take this moment to clean up this Note on Predicates section? I don't think there's any value to speaking about 3.4.0 and other versions. Have this section just talk about predicates and what they are.

| `P.within(objects...)` | Is the incoming object in the array of provided objects?
| `P.without(objects...)` | Is the incoming object not in the array of the provided objects?
| `P.typeOf(GType)` | Is the incoming object of the type indicated by the provided `GType` token?
| `P.typeOf(string)` | Is the incoming object of the type indicated by the provided `String`?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i realized that P.typeOf(Class) is sugar syntax so i guess it makes sense not to include that here. Not sure how we want to start better documenting the difference between the canonical grammar and the variants. I think that as of right now the "Differences" section is the best place to do that. Java doesn't have such a section yet, but I think this point could be a start of one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, that's why I didn't have it added here. Sure, I'll start a section.

The `asNumber()`-step (*map*) converts the incoming traverser to the nearest parsable type if no argument is provided,
or to the desired numerical type, based on the number token (`N`) provided.
or to the desired numerical type, based on the type token (`GType`) provided. If a type token entered isn't a numerical type, an `IllegalArgumentException` will be thrown.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We dont do a great job describing our enums. We just sort of assume folks will find and understand them via examples. I suppose it could be like "A Note on Types" to go with those related sections. Then at least there would be something to link to for GType.

if (first == null) {
return second == null;
}
return second != null && second.isAssignableFrom(first.getClass());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just a point of thought, have we made a mistake in the syntax by assuming matching on inheritance and not class equality? should there be option for both? maybe not. just thought i'd pose the question. if we're not sure, that may be fine as we could have typeOf as it is and typeEquals for equality, so easily added at a later time if so decided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I had a thought on it as well. I leaned towards inheritance but the idea of having typeEquals (or tyepEq?) for exact class match might not be a bad idea for later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding onto this thread, a strict typeEquals isn't particularly useful in practice for many of our types as GTypes mostly match up with interfaces which have many concrete implementations (Vertex, TinkerVertex, ReferenceVertex... List, ArrayList, LinkedList... Map, HashMap, LinkedHashMap...)

I would agree that inheritance matching is the right choice for now, and while there may be potential for an exact match predicate in the future, I expect usage of that to be much more niche.

TREE(Tree.class),
UUID(UUID.class),
VERTEX(Vertex.class),
VP(VertexProperty.class),;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit on naming: VP is nice and short, but given we spelled out everything else. also, VPmight not be understood by new users.VPROPERTY` if we want it shorter?

public Class<?> getType() { return javaType; }

public boolean isNumeric() {
Class<?> type = getType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please review PR for final declarations.

import java.util.UUID;

/**
* Gremlin types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GType is an important new concept in Gremlin as a language. I think a bit more context would be good in the javadoc.

Also, Gremlin Semantics docs do not appear to be updated for asNumber or include information about type predicates, or these defined types.

| d[32].i |
| d[35].i |

Scenario: g_V_valuesXageX_isXtypeOfXGType_BYTEXX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testing a lot of negatives here which is good. where are the feature tests for positives for all these types?

By defining types for Gremlin in GType I feel like we've elevated those selected ones to have special meaning. I understand none of our example graphs have that kind of data at this point, but I think that's a call to expand what our test framework can do. We already have the kitchen sink graph for this sort of thing. I think you should expand that dataset to throw in some data for this. Maybe one vertex label per testable data type (that can be stored as a property), so that you could do g.V().hasLabel("double", "string").is(typeOf(NUMBER)) and assert the right stuff.

I didn't examine this carefully, but i don't see positive tests for GRAPH and TREE. What else is missing?

Perhaps it would be wise to break this feature test up into: TypeOfNumber.feature, TypeOfGraph.feature, etc. for some specific testing of grouped types then keep TypeOf.feature as a catch-all for everything else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I'll add those tests. And yea, splitting off the types make sense. I'll organize it that way.

Then the traversal will raise an error with message containing text of "Can't parse string 'test' as number."
Then the traversal will raise an error with message containing text of "Can't convert number of type Integer to Byte due to overflow."

@GraphComputerVerificationInjectionNotSupported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's good that we restrict asNumber testing on GraphComputer so much. I think GraphComputerVerificationInjectionNotSupported should be more of an exception than a rule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main reason is we used a lot of inject() cases cause they are the easiest, I'll add some non-inject ones.

}

@Test
@Test @Ignore("need to restruture to account for day light saving changes")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, good reminder. Cole had a fix for it, I'll need to check if it's in and rebased the change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like my fix is already picked up here

final Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));

I would expect you can drop this ignore and it should pass now.

package org.apache.tinkerpop.gremlin.process.traversal.step.map;

import org.apache.tinkerpop.gremlin.process.traversal.N;
import org.apache.tinkerpop.gremlin.process.traversal.GType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe i'm missing it, but where are the tests for cases where you supply a non-numeric GType to asNumber like asNumber(GType.Vertex)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I thought I did, but maybe missed it, I'll add one.

| K_STRINGU | K_GTYPE DOT K_STRINGU
| K_TREE | K_GTYPE DOT K_TREE
| K_TREEU | K_GTYPE DOT K_TREEU
| K_UUID | K_GTYPE DOT K_UUID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we accept all of UUID, GType.UUID, uuid, and GType.uuid as valid GTypes for consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, added a UUIDL keyword

"g.inject(\"123.4\").asNumber(GType.INT)",
"g.inject(\"123.4\").asNumber(GType.int)",
"g.inject('123.4').as_number(GType.INT)"},
{"g.inject(\"123.4\").asNumber(GType.BIGINT)",
Copy link
Contributor

@Cole-Greer Cole-Greer Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason the Java and Groovy cases are excluded from most of these scenarios? I'm curious to see these working as there is no replacement for the old N translations in these translators. Also I think it would be good to have a few which use some of the lowercase GType tokens in addition to all of these using the all caps forms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null actually means the translations are no difference from the original input, the tests are still ran against them.

}

/**
* Register a type with a custom name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't think it's our place to enforce any rules on name registration, would we want to define a convention on capitalization which we recommend providers follow? The preloaded types from GType appear to be taking the PascalCase names from their Java classes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards removing this custom function, or restrict it to a defined convention, like Stephen mentioned, to something like providerPrefix:simpleTypeName, where the allowed input is just a custom providerPrefix string and we attach it to the simple name.

{P.typeOf(GType.NUMBER), 1, true},
{P.typeOf(GType.STRING), "hello", true},
{P.typeOf(Boolean.class), false, true},
{P.typeOf(GType.NULL), null, true},
Copy link
Contributor

@Cole-Greer Cole-Greer Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you include a few false cases here? It would also be good to have some cases which use the global type registry such as P.typeOf("Integer")

package org.apache.tinkerpop.gremlin.util;

import org.apache.tinkerpop.gremlin.process.traversal.N;
import org.apache.tinkerpop.gremlin.process.traversal.GType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to Stephen's comment on AsNumberStepTest, it would be good to include one or two cases here which assert errors for non-numeric GTypes.

Given the modern graph
And the traversal of
"""
g.V().values("age").is(P.typeOf(GType.INT))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to include a case that's uses typeOf to filter a stream of mixed types as that will likely be the main use case here. We don't really have graphs with mixed type properties, but something like this would be good:

g.V().values("age", "name").is(P.typeOf(GType.INT))
==> 29
==> 27
==> 32
==> 35

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants