java: Use Integers in addition to Longs #310

mpkorstanje · 2025-07-17T22:53:22Z

⚡️ What's your motivation?

Currently, messages are a bit cumbersome to use from Java. The generated code uses long values where Java would use integers. This leads to some silly code:

lines.add((int) (long) location.getLine());

Unfortunately, long values are needed for the timestamp and duration, so simply replacing longs with integers wouldn't work.

By specifying some upper bounds, the code generation can make a more informed choice about the size of number to use. The choice of 2^31 - 1 is somewhat arbitrary but matches the maximum length of strings in Java and .NET. In practice, I would expect Gherkin documents to have a much more reasonable length, though.

Partially resolves: cucumber/cucumber-jvm#3013

🏷️ What kind of change is this?

💥 Breaking change (incompatible changes to the API)

♻️ Anything particular you want feedback on?

Does anyone have a better idea?

📋 Checklist:

Clean up json schema descriptions and generated documentation #311 to make this PR smaller.
I agree to respect and uphold the Cucumber Community Code of Conduct
I've changed the behaviour of the code
- I have added/updated tests to cover my changes.
My change requires a change to the documentation.
- I have updated the documentation accordingly.
Users should know about my change
- I have added an entry to the "Unreleased" section of the CHANGELOG, linking to this pull request.


luke-hill · 2025-07-18T12:22:08Z

Are the codegen changes here also needed? I'm not sure where to start and end the review in this PR, that's all. Some of those changes seem like they might need a bit more investigating.

# Conflicts:
#	jsonschema/src/Attachment.json
#	jsonschema/src/Duration.json
#	jsonschema/src/GherkinDocument.json
#	jsonschema/src/Meta.json
#	jsonschema/src/Pickle.json
#	jsonschema/src/Source.json
#	jsonschema/src/SourceReference.json
#	jsonschema/src/TestCase.json
#	jsonschema/src/TestCaseStarted.json
#	jsonschema/src/Timestamp.json
#	perl/lib/Cucumber/Messages.pm
#	python/src/cucumber_messages/_messages.py
#	ruby/lib/cucumber/messages/attachment.rb
#	ruby/lib/cucumber/messages/gherkin_document.rb
#	ruby/lib/cucumber/messages/meta.rb
#	ruby/lib/cucumber/messages/pickle.rb
#	ruby/lib/cucumber/messages/source.rb
#	ruby/lib/cucumber/messages/source_reference.rb
#	ruby/lib/cucumber/messages/step_match_argument.rb
#	ruby/lib/cucumber/messages/test_case_started.rb
#	ruby/lib/cucumber/messages/test_step.rb

mpkorstanje · 2025-07-18T12:50:14Z

> Are the codegen changes here also needed? I'm not sure where to start and end the review in this PR, that's all. Some of those changes seem like they might need a bit more investigating.

With #311 merged the diff should be much smaller.

A good place to start would be the changes to the JSON schema, where I've added minimum and maximum values.
Those changes are picked up by the codegen. For most languages nothing happens. For Java, they are used to determine whether an int or a long should be used.
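
As a rough illustration of that selection, here is a minimal Ruby sketch. The method name `java_integer_type`, the constant `JAVA_INT_MAX`, and the sample property hashes are hypothetical and are not the PR's actual code; the sketch only assumes that a schema property may carry a `maximum` key, as described above.

```ruby
# Hypothetical helper, not the codegen's real API: picks a Java type name
# for a JSON schema property of type 'integer'.
JAVA_INT_MAX = 2**31 - 1 # 2147483647, the value of Integer.MAX_VALUE in Java

def java_integer_type(property)
  maximum = property['maximum']
  # Without a declared maximum the value may not fit in an int, so stay with Long.
  return 'Long' if maximum.nil?

  maximum <= JAVA_INT_MAX ? 'Integer' : 'Long'
end

# Sample property hashes (assumed shapes, for illustration only).
bounded   = { 'type' => 'integer', 'minimum' => 1, 'maximum' => 2_147_483_647 }
unbounded = { 'type' => 'integer' }

puts java_integer_type(bounded)   # Integer
puts java_integer_type(unbounded) # Long
```

Properties without a declared maximum keep the wider type in this sketch, which is one way to leave timestamp and duration fields unaffected.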

luke-hill

I've reviewed about half of the files. I think some bits are OK, but others might cause issues where they change the response structures. I'm not 100% sure without digging in again.

The JSON schema stuff all seems great though, no issues there.

jsonschema/src/Location.json

luke-hill · 2025-07-18T14:46:49Z

codegen/generators/php.rb

@@ -53,11 +53,11 @@ def nullable?(property_name, schema)
    end

    def scalar?(property)
-      property.key?('type') && language_translations_for_data_types.key?(property['type'])
+      property.key?('type') && select_language_translations_for_data_types(property['type'], property)


This will return either false or a non-boolean value. I haven't checked whether that matters downstream, but it could.
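
To make the concern concrete, here is a small Ruby sketch of how `&&` propagates its last operand. The `lookup_translation` method and its table are hypothetical stand-ins for the real translation lookup, and the `!!` coercion is only one possible fix, not necessarily what the PR should do.

```ruby
# Hypothetical stand-in for a translation lookup: returns a type name or nil.
def lookup_translation(type)
  { 'integer' => 'int', 'string' => 'string' }[type]
end

def scalar_loose?(property)
  # `&&` returns its last evaluated operand, so this yields false, nil, or a String.
  property.key?('type') && lookup_translation(property['type'])
end

def scalar_strict?(property)
  # `!!` coerces any truthy value to true and nil/false to false.
  !!scalar_loose?(property)
end

p scalar_loose?({ 'type' => 'integer' })  # "int"
p scalar_loose?({ 'type' => 'object' })   # nil
p scalar_loose?({})                       # false
p scalar_strict?({ 'type' => 'integer' }) # true
```

Callers that only test truthiness behave the same with either version; callers that compare against `true` or serialize the result would not.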

luke-hill · 2025-07-18T14:49:01Z

codegen/generators/java.rb

-      }
+    def select_language_translations_for_data_types(type, property)
+      if type == 'integer'
+        if property['maximum'] and property['maximum'] <= 2147483647


Suggested change

-        if property['maximum'] and property['maximum'] <= 2147483647
+        if property['maximum']&.between?(1, 2147483647)

In Java, for the choice between integer and long only the upper bound matters. They're both signed.
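
The difference between the two conditions can be sketched in Ruby as follows. The method names and sample hashes are hypothetical; only the two expressions themselves come from the thread.

```ruby
INT_MAX = 2_147_483_647 # largest value of a signed 32-bit Java int

# Original condition: only the upper bound is checked.
def upper_bound_only(property)
  property['maximum'] && property['maximum'] <= INT_MAX
end

# Suggested condition: additionally requires the maximum to be at least 1.
def between_one_and_max(property)
  property['maximum']&.between?(1, INT_MAX)
end

p upper_bound_only({ 'maximum' => 0 })              # true
p between_one_and_max({ 'maximum' => 0 })           # false
p upper_bound_only({ 'maximum' => INT_MAX + 1 })    # false
p between_one_and_max({ 'maximum' => INT_MAX + 1 }) # false
p upper_bound_only({})                              # nil
p between_one_and_max({})                           # nil
```

The two only disagree when the maximum is zero or negative. Such a value still fits in a signed int, which is consistent with the point that only the upper bound matters here.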

luke-hill · 2025-07-18T14:50:08Z

codegen/generators/base.rb

    end

    def type_for(parent_type_name, property_name, property)
      if property['$ref']
        property_type_from_ref(property['$ref'])
      elsif property['type']
-        property_type_from_type(parent_type_name, property_name, property, type: property['type'])


I assume this is the only location for this? If so, this seems good. A lot of the initial work I did for codegen was converting it into a generic structure, and I didn't go through and tidy up afterwards. So good catch, if so.

luke-hill · 2025-07-18T14:52:55Z

codegen/generators/cpp.rb

@@ -21,12 +21,12 @@ def format_description(raw_description, indent_string: '')

    private

-    def language_translations_for_data_types


This comment is placed here but applies to all the generic implementors. Is the only usage of this in property_type_from_type? If so, then I figure the change here is OK, but it does more heavy lifting up front.

luke-hill · 2025-07-18T14:55:27Z

codegen/generators/php.rb

@@ -68,12 +68,12 @@ def default_value(class_name, property_name, property, schema)
      super(class_name, property_name, property)
    end

-    def language_translations_for_data_types
+    def select_language_translations_for_data_types(type, property)


If this is never consumed here we can use _property to be memory efficient

Suggested change

-    def select_language_translations_for_data_types(type, property)
+    def select_language_translations_for_data_types(type, _property)

mpkorstanje requested a review from luke-hill July 17, 2025 23:32

mpkorstanje added 5 commits July 18, 2025 14:51

Reduce diff

7bbbf00

Reduce diff

6e67701

Reduce diff

db27e26

Fix range on nanos

b149e9d

Consistency

c2bc505

luke-hill reviewed Jul 18, 2025


mpkorstanje added 2 commits July 18, 2025 19:32

Fix location.line bound

60ec843

Fix

01acc5c
