
Conversation

Collaborator

@wantsui wantsui commented Nov 7, 2025

What does this PR do?

The goal of AIDM-253 is to add process tags to the trace payloads.

After this gets merged, the next step is to add it for the other products.

To run the tests in Docker:

docker compose run --rm tracer-3.3 /bin/bash
bundle exec rake compile
bundle exec rake test:core_with_rails

Main tests:

BUNDLE_GEMFILE=/app/gemfiles/ruby_3.3_rails8.gemfile bundle exec rspec spec/datadog/core/environment/process_spec.rb
bundle exec rspec spec/datadog/tracing/transport/trace_formatter_spec.rb
bundle exec rspec spec/datadog/core/normalizer_spec.rb
bundle exec rspec spec/datadog/core/configuration/settings_spec.rb

Motivation:

We're trying to add process tags to various payloads so they can be used in different use cases.

Note I still want to try adding server type but I'll have to tackle that in a separate PR.

Change log entry

Yes. Add process tags to the trace payloads.

Additional Notes:

How to test the change?

… This is still missing memoization and additional tests.
@github-actions bot added the core and tracing labels Nov 7, 2025
github-actions bot commented Nov 7, 2025

Thank you for updating Change log entry section 👏

Visited at: 2025-11-14 09:35:11 UTC

datadog-official bot commented Nov 7, 2025

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 6042830

@wantsui added the AI Generated label Nov 7, 2025
def tag_process_tags!
return unless trace.experimental_propagate_process_tags_enabled
process_tags = Core::Environment::Process.formatted_process_tags_k1_v1
return if process_tags.empty?
Member

This is impossible, right? If so, we can remove it, since keeping it would give us a false sense of uncertainty here.

Collaborator Author

I think I fixed it in 8dae705 by just removing the check in process tags, but let me know if you spot issues with it!

pr-commenter bot commented Nov 10, 2025

Benchmarks

Benchmark execution time: 2025-11-18 23:19:43

Comparing candidate commit 6042830 in PR branch add-process-tags-to-tracing with baseline commit 49cee89 in branch master.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 43 metrics, 2 unstable metrics.

scenario:tracing - Tracing.log_correlation

  • 🟥 throughput [-10853.442op/s; -10495.857op/s] or [-9.774%; -9.452%]

format!
expect(first_span.meta).to include('_dd.tags.process')
expect(first_span.meta['_dd.tags.process']).to eq(Datadog::Core::Environment::Process.serialized)
# TODO figure out if we need an assertion for the value, ie
Collaborator Author

@marcotc - do you think there's value in asserting for the values of the tag? Or is the test in process_spec enough?

Member

What you are doing with expect(first_span.meta['_dd.tags.process']).to eq(Datadog::Core::Environment::Process.serialized) seems good to me.

Member

I wouldn't test realistic values.

Member

The main thing to test here is that it respects the configuration option, which you did.

Member

The main thing to test here is that it respects the configuration option.

Collaborator Author

Thanks! In that case it doesn't seem like I need to make any changes to the assertions then?

github-actions bot commented Nov 11, 2025

Typing analysis

Note: Ignored files are excluded from the next sections.

Untyped methods

This PR introduces 1 partially typed method. It increases the percentage of typed methods from 54.67% to 54.77% (+0.1%).

Partially typed methods (+1-0)
Introduced:
sig/datadog/core/normalizer.rbs:12
└── def self.normalize: (untyped original_value, ?remove_digit_start_char: bool) -> ::String

If you believe a method or an attribute is rightfully untyped or partially typed, you can add # untyped:accept to the end of the line to remove it from the stats.

normalized_value.sub!(LEADING_INVALID_CHARS, "")
normalized_value.sub!(TRAILING_UNDERSCORES, "")
normalized_value.squeeze!('_')
normalized_value = normalized_value[MAX_CHARACTER_LENGTH]
Member

Is there any value in using a range when it could be normalized_value[0, 200]?

Collaborator Author

Not really, so I removed the range approach.

I played around with a few things here: 4747259 and now it looks like this:

normalized_value.slice!(MAX_CHARACTER_LENGTH..-1) if normalized_value.length > MAX_CHARACTER_LENGTH

(the conditional portion should help skip this operation if the text is small enough)

Let me know if this is more in line with what you're thinking of!
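
For illustration, a minimal sketch comparing the two truncation styles discussed in this thread. Only MAX_CHARACTER_LENGTH = 200 comes from the PR; the helper names truncate_copy and truncate_in_place! are hypothetical and exist only for this example.

```ruby
MAX_CHARACTER_LENGTH = 200 # value from the PR; the helpers below are illustrative

# Non-mutating style: returns a new string of at most 200 characters.
def truncate_copy(value)
  value[0, MAX_CHARACTER_LENGTH]
end

# Mutating style: removes everything from index 200 onward, in place,
# and skips the operation entirely when the string is already short enough.
def truncate_in_place!(value)
  value.slice!(MAX_CHARACTER_LENGTH..-1) if value.length > MAX_CHARACTER_LENGTH
  value
end

long = 'a' * 250
short = 'abc'

puts truncate_copy(long).length            # 200
puts truncate_in_place!(long.dup).length   # 200
puts truncate_in_place!(short.dup)         # abc
```

Both styles give the same result. The in-place version avoids allocating a second string, which can matter when the method runs on a hot path.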

Comment on lines 5 to 15
@serialized: untyped

def self?.entrypoint_workdir: () -> untyped

def self?.entrypoint_type: () -> untyped

def self?.entrypoint_name: () -> untyped

def self?.entrypoint_basedir: () -> untyped
def self?.serialized_kv_helper: (untyped key, untyped value) -> ::String
def self?.serialized: () -> untyped
Member

Minor: Could you please type it? You can use Codex; it's good at this.

Collaborator Author

Yes! Thanks for pointing this out! Addressed in adfa416!

module Core
module Normalizer
INVALID_TAG_CHARACTERS: ::Regexp
def self.normalize: (untyped original_value) -> ("" | untyped)
Member

untyped will cover everything, but still, it's not untyped, it's a ::String?

Collaborator Author

Addressed in adfa416!

Comment on lines 54 to 57
tags << serialized_kv_helper(Core::Environment::Ext::TAG_ENTRYPOINT_WORKDIR, entrypoint_workdir) if entrypoint_workdir
tags << serialized_kv_helper(Core::Environment::Ext::TAG_ENTRYPOINT_NAME, entrypoint_name) if entrypoint_name
tags << serialized_kv_helper(Core::Environment::Ext::TAG_ENTRYPOINT_BASEDIR, entrypoint_basedir) if entrypoint_basedir
tags << serialized_kv_helper(Core::Environment::Ext::TAG_ENTRYPOINT_TYPE, entrypoint_type) if entrypoint_type
Member

We only need to specify namespacing in Ruby up until the common point between: the current class or module we are in; and the object we want to reference.
In this case, we are in Datadog::Core::Environment::Process and want to reference Datadog::Core::Environment::Ext::TAG_ENTRYPOINT_WORKDIR.

We can remove the prefix namespace that is identical. For example Ext::TAG_ENTRYPOINT_WORKDIR will work here.

BUT, Ruby namespace resolution is very lenient, and we will try to match Ext (from Ext::TAG_ENTRYPOINT_WORKDIR), in order, to: Datadog::Core::Environment::Process::Ext, Datadog::Core::Environment::Ext, Datadog::Core::Ext, Datadog::Ext, and ::Ext.
This is important because the namespace matching doesn't try to match the complete Ext::TAG_ENTRYPOINT_WORKDIR path; it only tries to match the first token you provided: the Ext in Ext::TAG_ENTRYPOINT_WORKDIR.
And because more than one of these locations in the possible search logic are realistic matches, we should be a bit more specific than Ext::TAG_ENTRYPOINT_WORKDIR.

A good practice is to stop at the closest common namespace location. In this case, that is Environment. So I suggest using Environment::Ext::TAG_ENTRYPOINT_WORKDIR (and the equivalent for the other constants) here.
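
For illustration, a small self-contained sketch of the lookup behavior described above. It uses a hypothetical Example namespace rather than the real Datadog modules, and the NAME constants exist only to show which Ext was resolved.

```ruby
module Example
  module Core
    module Ext
      NAME = 'Example::Core::Ext'
    end

    module Environment
      module Ext
        NAME = 'Example::Core::Environment::Ext'
      end

      module Process
        # Bare reference: Ruby walks the lexical scopes outward and
        # takes the first Ext it finds.
        def self.bare
          Ext::NAME
        end

        # Anchored at the closest common namespace.
        def self.anchored
          Environment::Ext::NAME
        end
      end
    end
  end
end

puts Example::Core::Environment::Process.bare     # Example::Core::Environment::Ext

# Later, someone adds an Ext nested directly under Process...
module Example::Core::Environment::Process
  module Ext
    NAME = 'Example::Core::Environment::Process::Ext'
  end
end

# ...and the bare reference silently changes meaning, while the anchored one does not.
puts Example::Core::Environment::Process.bare     # Example::Core::Environment::Process::Ext
puts Example::Core::Environment::Process.anchored # Example::Core::Environment::Ext
```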

Collaborator Author

Thanks for the explanation! I'll keep this in mind going forward!
Addressed in: 31d9796

# Returns the entrypoint type of the process
# @return [String] the type of the process, which is fixed in Ruby
def entrypoint_type
Core::Environment::Ext::PROCESS_TYPE
Member

We can remove Core:: from this constant access (see comment in def serialized).

Collaborator Author

Thanks for this note! See 31d9796!

# - Trailing underscores are removed
# - Consecutive underscores are merged into a single underscore
# - Maximum length is 200 characters
def self.normalize(original_value)
Member

@marcotc marcotc Nov 14, 2025

Given how many operations happen inside this method, I recommend adding a "fast-case", where we do some checks and return immediately if the provided original_value is already valid.
This suggestion is equivalent to the early return by the agent here.

I suggest trying to use a regular expression, instead of implementing the agent's isNormalizedASCIITag in Ruby, since Ruby code is slower than Go code, but Ruby regex is pretty fast.

Something like:

return original_value if original_value.size <= MAX_CHARACTER_LENGTH && original_value.match?(VALID_ASCII_TAG)

The hypothetical VALID_ASCII_TAG doesn't have to catch all valid cases: it's a trade-off between matching most valid tags vs making the regex complicated and slow. As long as it never matches invalid tags, it's all good.
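
A minimal sketch of this fast-path idea, under stated assumptions: the VALID_ASCII_TAG pattern below is hypothetical and deliberately conservative (it rejects anything containing an underscore, so such values simply take the slow path), and slow_normalize is a simplified stand-in, not the normalizer from this PR.

```ruby
MAX_CHARACTER_LENGTH = 200

# Hypothetical, conservative pattern: a lowercase letter followed by lowercase
# letters, digits, ':', '.', '/' or '-'. It may miss some valid tags, but it
# should never accept an invalid one.
VALID_ASCII_TAG = /\A[a-z][a-z0-9:.\/-]*\z/

# Simplified stand-in for the full normalization (assumption, not the PR's code).
def slow_normalize(value)
  normalized = value.downcase
  normalized.gsub!(/[^a-z0-9_:.\/-]/, '_') # replace invalid characters
  normalized.squeeze!('_')                 # merge consecutive underscores
  normalized.sub!(/\A[^a-z]+/, '')         # drop leading non-letters
  normalized.sub!(/_+\z/, '')              # drop trailing underscores
  normalized[0, MAX_CHARACTER_LENGTH]
end

def normalize(original_value)
  value = original_value.to_s
  # Fast path: already valid, so return it without allocating anything.
  return value if value.size <= MAX_CHARACTER_LENGTH && value.match?(VALID_ASCII_TAG)

  slow_normalize(value)
end

puts normalize('web-server')   # web-server (fast path)
puts normalize('Hello World!') # hello_world (slow path)
```

String#match? is used here because it does not set the regexp globals ($~ and friends), which keeps the check cheap.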

Collaborator Author

Addressed in be9587d ! Let me know if this is better now!

TRAILING_UNDERSCORES = %r{_++\z}
MAX_CHARACTER_LENGTH = 200

# Based on https://github.com/DataDog/datadog-agent/blob/45799c842bbd216bcda208737f9f11cade6fdd95/pkg/trace/traceutil/normalize.go#L131
Member

Collaborator Author

Good catch. In a2643a6, I kept "tag" in the names of the originally defined constants, but I adjusted the logic so that normalization only takes place on the string values.

Contributor

@mabdinur mabdinur left a comment

Left some comments based on my review of the Python PR. I'll defer to Marco for the final approval. Overall this looks good to me.

# @return [String] the last segment of the base directory of the script
def entrypoint_basedir
current_basedir = File.expand_path(File.dirname($0))
normalized_basedir = current_basedir.tr(File::SEPARATOR, '/')
Contributor

I don't think we do this normalization in the Python implementation. We should align on the same approach (either normalize in the SDK or do it in one central place like the Agent).

Collaborator Author

Good catch, I made an adjustment in a2643a6 so it's the same as the python tracer behavior now.

normalized_value.gsub!(INVALID_TAG_CHARACTERS, '_')
normalized_value.sub!(LEADING_INVALID_CHARS, "")
normalized_value.sub!(TRAILING_UNDERSCORES, "")
normalized_value.squeeze!('_') if normalized_value.include?('__')
Contributor

Should we use _ for invalid values? This is a character that is common in file names. It will be hard to tell whether an underscore in the output was a legitimate underscore in the input or a replacement for an invalid character.

Can we defer this normalization to the Agent? It would be nice if we could centralize this logic instead of duplicating it across SDKs

Member

Hmmm, for process tags, the agent is just one of the transport targets.

But, since the transport format is a tag value for traces, we have to either live with this ambiguity around _, or create an escape pattern for valid _s.
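
To make the ambiguity concrete, a minimal sketch using a hypothetical, simplified replacement rule (the character set below is an assumption for illustration, not the PR's exact pattern). Three different inputs collapse to the same normalized value, so the original cannot be recovered from the tag.

```ruby
# Hypothetical, simplified stand-in for the invalid-character rule.
INVALID_TAG_CHARACTERS = /[^a-z0-9_:.\/-]/

def normalize_sketch(value)
  value.downcase.gsub(INVALID_TAG_CHARACTERS, '_')
end

inputs = ['my_app.rb', 'my app.rb', 'my@app.rb']
inputs.each do |input|
  puts "#{input.inspect} -> #{normalize_sketch(input).inspect}"
end
# Every line prints "my_app.rb" on the right-hand side.
```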

normalized_value.sub!(LEADING_INVALID_CHARS, "")
normalized_value.sub!(TRAILING_UNDERSCORES, "")
normalized_value.squeeze!('_') if normalized_value.include?('__')
normalized_value.slice!(MAX_CHARACTER_LENGTH..-1) if normalized_value.length > MAX_CHARACTER_LENGTH
Contributor

The python implementation is a bit simpler. I don't think it enforces a max length

Collaborator Author

I was basing this off the Trace Agent: https://github.com/DataDog/datadog-agent/blob/45799c842bbd216bcda208737f9f11cade6fdd95/pkg/trace/traceutil/normalize_test.go#L17.

The python implementation currently fails some of those tests and I left a comment in the dd-trace-py PR about it.

That said, I will update this logic to back off sooner if the tag is already valid!

Member

Size check is sane and reasonable! We should add it to python if we have the chance.

Collaborator Author

I re-reviewed the Trace Agent implementation with the Python one and the main thing is that for span tag values, it's ok for it to start with a digit.

I ended up following the trace agent more closely so it passes the case where emoji get added: be9587d

Note - the trace agent has an interesting bytes test that I cannot get to pass in Ruby:

# This test case doesn't work with the current logic because it yields 202 characters
# {in: 'A' + ('0' * 200) + ' ' + ('0' * 11), out: 'a' + ('0' * 200) + '_0'},

(Sometimes it can go over 200 characters in the Trace Agent so that's the only test I am skipping for now)

tag_sampling_priority!
tag_profiling_enabled!
tag_apm_tracing_disabled!
tag_process_tags!
Contributor

Do we need to move this check into:

if first_span

?

Member

yes! (and add a test to assert that we do check for first_span).

Collaborator Author

Fixed in 6042830!
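
For illustration only, a rough sketch of the guard being discussed. FormatterSketch and Span are hypothetical stand-ins, not the tracer's real classes; the tag name '_dd.tags.process' is taken from the spec quoted earlier in this thread.

```ruby
# Hypothetical stand-in for a span: only the meta hash matters here.
Span = Struct.new(:meta)

class FormatterSketch
  PROCESS_TAGS_KEY = '_dd.tags.process' # tag name from the spec quoted in this PR

  def initialize(spans, process_tags_enabled:, process_tags:)
    @spans = spans
    @process_tags_enabled = process_tags_enabled
    @process_tags = process_tags
  end

  def format!
    first_span = @spans.first
    # Only tag when the trace actually has a span to carry the tag.
    tag_process_tags!(first_span) if first_span
    @spans
  end

  private

  def tag_process_tags!(span)
    return unless @process_tags_enabled

    span.meta[PROCESS_TAGS_KEY] = @process_tags
  end
end

# An empty trace is a no-op instead of an error.
FormatterSketch.new([], process_tags_enabled: true, process_tags: 'k:v').format!

span = Span.new({})
FormatterSketch.new([span], process_tags_enabled: true, process_tags: 'k:v').format!
puts span.meta # {"_dd.tags.process"=>"k:v"} (exact hash formatting varies by Ruby version)
```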

Comment on lines 5 to 14
@serialized: ::String

def self?.entrypoint_workdir: () -> ::String

def self?.entrypoint_type: () -> ::String

def self?.entrypoint_name: () -> ::String

def self?.entrypoint_basedir: () -> ::String
def self?.serialized_kv_helper: (::String key, ::String value) -> ::String
Contributor

Should most of these methods/fields be private? I think we only need to expose self?.serialized

Member

It looks like only serialized needs to be public.
We should make everything else private.

Collaborator Author

If I do that then the process_spec needs to be adjusted 👀
I'll make the test adjustments and see if this helps.

Collaborator Author

Adjusted in 47efb90
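
As a rough sketch of what "only serialized is public" could look like, assuming a module with singleton methods. ProcessSketch, the kv helper, the tag keys, and the "key:value" comma-joined format are all hypothetical here; the real constants and format live in the PR.

```ruby
module ProcessSketch
  # The only public entry point. Memoized, since the values are fixed
  # for the lifetime of the process.
  def self.serialized
    @serialized ||= [
      kv('entrypoint.workdir', File.basename(Dir.pwd)), # hypothetical tag keys
      kv('entrypoint.name', File.basename($0))
    ].join(',')
  end

  def self.kv(key, value)
    "#{key}:#{value}"
  end
  # Hide the helper from callers outside the module.
  private_class_method :kv
end

puts ProcessSketch.serialized

begin
  ProcessSketch.kv('a', 'b')
rescue NoMethodError => e
  puts "private as expected: #{e.class}"
end
```

Specs that still need to exercise a private helper directly can go through send, for example ProcessSketch.send(:kv, 'a', 'b'), although testing through serialized alone is usually enough.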

Comment on lines 13 to 33
# Returns the last segment of the working directory of the process
# @return [String] the last segment of the working directory
def entrypoint_workdir
File.basename(Dir.pwd)
end

# Returns the entrypoint type of the process
# @return [String] the type of the process, which is fixed in Ruby
def entrypoint_type
Environment::Ext::PROCESS_TYPE
end

# Returns the name of the entrypoint script
# @return [String] the file name of the script, without its directory
def entrypoint_name
File.basename($0)
end

# Returns the last segment of the base directory of the process
# @return [String] the last segment of the base directory of the script
def entrypoint_basedir
Member

Can you add a simple example string to these methods (except entrypoint_type)?
For example /home/server/app/script.rb -> ... (insert real output).

Collaborator Author

Addressed in 47efb90 !
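
For reference, a small sketch of the kind of example values requested above, using fixed POSIX-style paths instead of $0 and Dir.pwd so the output is deterministic. It assumes the "last segment" is taken with File.basename, as in the quoted methods; the paths themselves are made up.

```ruby
script  = '/home/server/app/script.rb' # stands in for $0
workdir = '/home/server/app'           # stands in for Dir.pwd

# entrypoint_name: the script's file name
puts File.basename(script)                # script.rb

# entrypoint_basedir: last segment of the script's directory
puts File.basename(File.dirname(script))  # app

# entrypoint_workdir: last segment of the working directory
puts File.basename(workdir)               # app
```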


Labels

AI Generated (Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos), core (Involves Datadog core libraries), tracing

6 participants