cdoern
Contributor

@cdoern cdoern commented Aug 13, 2025

What does this PR do?

This PR does two things:

  1. Fixes `on_start` so that both the span [START] and [END] lines are printed, not just [END].
  2. Changes `log.py` so the `telemetry` category defaults to WARN instead of INFO.

This lets us keep the metric logging and the verbose span [START] and [END] output, while hiding it from normal users by default.

This conforms to our logging system, since a user just needs to switch the category to INFO to see the logs.
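A minimal sketch of the per-category default described above, using only the standard `logging` module. The names `DEFAULT_CATEGORY_LEVELS` and `get_category_logger`, and the category list, are hypothetical and are not taken from `log.py`:

```python
import logging

# Hypothetical per-category defaults: everything is INFO except telemetry.
DEFAULT_CATEGORY_LEVELS = {
    "core": logging.INFO,
    "server": logging.INFO,
    "telemetry": logging.WARNING,
}


def get_category_logger(category, overrides=None):
    """Return a logger set to the user's override if given, else the category default."""
    levels = {**DEFAULT_CATEGORY_LEVELS, **(overrides or {})}
    logger = logging.getLogger(f"sketch.{category}")
    logger.setLevel(levels.get(category, logging.INFO))
    return logger


quiet = get_category_logger("telemetry")
print(quiet.isEnabledFor(logging.INFO))  # False: INFO span lines are hidden by default

verbose = get_category_logger("telemetry", {"telemetry": logging.INFO})
print(verbose.isEnabledFor(logging.INFO))  # True: opting in makes them visible
```

The `log.info()` calls in the telemetry code stay at their natural level; only the category threshold changes.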

Test Plan

without setting any env variables:

Screenshot 2025-08-13 at 3 12 19 PM

With `export LLAMA_STACK_LOGGING=telemetry=info` set, you see all metrics, chat completion info, etc.:

Screenshot 2025-08-13 at 3 13 28 PM
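For readers without the screenshots, here is a rough sketch of how a `category=level` value such as `telemetry=info` could be turned into per-category levels. The function `parse_logging_spec` is hypothetical, and the `;` separator between pairs is an assumption rather than something confirmed in this PR:

```python
import logging
import os


def parse_logging_spec(spec):
    """Parse a string such as "telemetry=info;core=debug" into {category: level}.

    Assumption: pairs are separated by ';'. Malformed or unknown entries are skipped.
    """
    levels = {}
    for pair in spec.split(";"):
        name, sep, level_name = pair.strip().partition("=")
        # getLevelName returns an int for a known level name, otherwise a string.
        level = logging.getLevelName(level_name.strip().upper())
        if sep and name.strip() and isinstance(level, int):
            levels[name.strip().lower()] = level
    return levels


print(parse_logging_spec("telemetry=info"))  # {'telemetry': 20}

# Reading the real environment variable; an unset variable yields no overrides.
overrides = parse_logging_spec(os.environ.get("LLAMA_STACK_LOGGING", ""))
```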

@meta-cla meta-cla bot added the CLA Signed label Aug 13, 2025
@cdoern cdoern force-pushed the fix-telemetry-log branch 4 times, most recently from 43a2012 to ed10d87 Compare August 13, 2025 18:02
dictConfig(logging_config)

# Ensure third-party libraries follow the root log level
for _, logger in logging.root.manager.loggerDict.items():
    if isinstance(logger, logging.Logger):
        logger.setLevel(root_level)

# Explicitly silence telemetry loggers when telemetry is not enabled

Contributor

  1. Why is this needed? Isn't the filter above good enough?
  2. All of this feels a bit too complex; I'm trying to see if there is a simpler way to accomplish it all. The telemetry category already exists, so why doesn't telemetry=info work right now?

Contributor Author

Sorry, this might be leftover from earlier versions I was trying. I think you are right and I can remove it.

As for:

why doesn't telemetry=info work right now?

The filter is necessary because all logging categories default to info. If I add telemetry logs that deserve to be "info", then I, as someone who cares about telemetry in the logs, can just set telemetry=info and see them all. The default state for the telemetry category alone is disabled, so regular folks don't get bombarded with logs unless they explicitly set it to info.

The issue is that info is the default state for all these categories, so some special handling needs to be done
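To illustrate the kind of special handling meant here, a minimal sketch of a `logging.Filter` that drops telemetry records below WARNING unless telemetry logging was explicitly enabled. `TelemetryFilter`, `telemetry_enabled`, and `make_record` are hypothetical names, not the code in this PR:

```python
import logging


class TelemetryFilter(logging.Filter):
    """Drop sub-WARNING records from telemetry loggers unless explicitly enabled."""

    def __init__(self, telemetry_enabled=False):
        super().__init__()
        self.telemetry_enabled = telemetry_enabled

    def filter(self, record):
        # Assumption: telemetry loggers have "telemetry" as a dotted name component.
        is_telemetry = "telemetry" in record.name.split(".")
        if is_telemetry and not self.telemetry_enabled:
            return record.levelno >= logging.WARNING
        return True


def make_record(name, level):
    # Build a bare LogRecord so the filter can be exercised without any handlers.
    return logging.LogRecord(name, level, "sketch.py", 0, "msg", None, None)


default_filter = TelemetryFilter()
print(default_filter.filter(make_record("providers.telemetry", logging.INFO)))  # False
print(default_filter.filter(make_record("server", logging.INFO)))               # True
```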

Contributor Author

OK, I made the logic a fair bit simpler. Basically, all that's needed is the filter and working out whether _telemetry_enabled is set.

Contributor

The issue is that info is the default state for all these categories, so some special handling needs to be done

what if we simply made warn be the default for the telemetry category? then a user doing telemetry=info would start seeing everything?

Contributor Author

hmmm yeah, I think that sounds pretty reasonable!

Contributor

One final thought:

Now that the default level is magically warn, anyone who reads the logging code in telemetry will see log.info() calls but won't see any logs.

Alternatively, we could have done what I did, which is to make the logs in telemetry log.debug(), so that you need to specify telemetry=debug to see them.

I kind of feel the latter is slightly more transparent to the user and requires no other surreptitious change to the system. Thoughts @ehhuang @raghotham others?
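A small sketch of the `log.debug()` alternative, assuming the category keeps its INFO default. The logger name and the `ListHandler` and `emitted_at` helpers are made up for illustration:

```python
import logging


class ListHandler(logging.Handler):
    """Collect messages in memory so the effect of each level is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def emitted_at(category_level):
    """Return the messages that get through when the category is set to category_level."""
    logger = logging.getLogger("sketch.telemetry.debug_alternative")
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)
    logger.setLevel(category_level)
    try:
        logger.debug("[START] span")     # verbose detail, needs telemetry=debug
        logger.warning("export failed")  # visible at either setting
    finally:
        logger.removeHandler(handler)
    return handler.messages


print(emitted_at(logging.INFO))   # ['export failed']
print(emitted_at(logging.DEBUG))  # ['[START] span', 'export failed']
```

With this approach the level written in the code matches what the reader sees, at the cost of demoting every span line to debug.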

Contributor Author

That is a fair point. I just don't think everything in files like https://github.com/llamastack/llama-stack/blob/46ff302d87562cf266d2a304f7409593ac7bb0ca/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py "deserves" to be hidden behind a debug level. I think it's of greater value to have logger... calls keep their proper levels and to change the categories' default levels instead.

Contributor

@cdoern on that one, I definitely agree with you and acknowledged as much in my PR as well. We need to make the output higher signal, not completely kill it. Neither my PR nor yours helps with that, I think.

Contributor Author

@ashwinb what if I added some concept of "priority" to the logger, beyond the levels, such that the default level for telemetry (and all other categories) stays INFO, but a numerical priority (which we can add to the logging config) means a user won't see ALL info logs by default unless they bump down their priority threshold? Just an idea.
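To make the idea concrete, a rough sketch of what such a priority could look like on top of standard `logging`. Everything here is hypothetical: the `priority` attribute passed through `extra`, the scale where a higher number means more important, and the `PriorityFilter` and `visible_messages` names:

```python
import logging

DEFAULT_PRIORITY = 5  # hypothetical scale: higher number = more important


class PriorityFilter(logging.Filter):
    """Pass a record only if its priority is at or above the configured threshold."""

    def __init__(self, min_priority=DEFAULT_PRIORITY):
        super().__init__()
        self.min_priority = min_priority

    def filter(self, record):
        # Records logged without an explicit priority get the default one.
        return getattr(record, "priority", DEFAULT_PRIORITY) >= self.min_priority


class ListHandler(logging.Handler):
    """Collect messages in memory for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def visible_messages(min_priority):
    logger = logging.getLogger("sketch.telemetry.priority")
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = ListHandler()
    handler.addFilter(PriorityFilter(min_priority))
    logger.addHandler(handler)
    try:
        logger.info("request summary")                      # default priority
        logger.info("[START] span", extra={"priority": 1})  # low-priority chatter
    finally:
        logger.removeHandler(handler)
    return handler.messages


print(visible_messages(5))  # ['request summary']
print(visible_messages(1))  # ['request summary', '[START] span']
```

Both calls stay at INFO; lowering the threshold is what reveals the extra span lines.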

Contributor

what I'd want to push on is to make the defaults better and higher signal. so for example, what do I hate?

  • there is just so much repetition. uvicorn is going to emit a line. then our telemetry logger emits very repetitive junk, so that's not useful. let's make uvicorn take care of all of that?
  • the metrics outputs -- they are multi-line on the console which expands it too much, can they be json instead?
  • in an ideal world, if we are going to emit something automatically on an aggregate level for a request, it should be a single compact, nice line in the log.
  • now beyond that, you can add a couple bells and whistles if you want more stuff but then you could just hide it behind a simple log.debug()
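As a sketch of the "single compact line" suggestion, metrics could be serialized as one-line JSON rather than a multi-line block. The helper `format_metrics_line` and the metric names are illustrative assumptions, not the telemetry provider's actual output:

```python
import json


def format_metrics_line(metrics):
    """Render a metrics mapping as one compact JSON line with stable key order."""
    return json.dumps(metrics, separators=(",", ":"), sort_keys=True)


line = format_metrics_line(
    {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46}
)
print(line)  # {"completion_tokens":34,"prompt_tokens":12,"total_tokens":46}
```

A single line like this is also easy to grep or to feed into a log pipeline.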

@cdoern cdoern force-pushed the fix-telemetry-log branch 2 times, most recently from 227544c to e14cdb8 Compare August 13, 2025 19:09
@cdoern cdoern requested a review from ashwinb August 13, 2025 19:10
@cdoern cdoern force-pushed the fix-telemetry-log branch 4 times, most recently from 7e49dfd to 39136a5 Compare August 14, 2025 00:01
@cdoern cdoern changed the title feat: omit telemetry logs feat: telemetry logging fixes Aug 14, 2025
@cdoern cdoern force-pushed the fix-telemetry-log branch 3 times, most recently from da2866d to 8e78478 Compare August 14, 2025 17:53
Signed-off-by: Charlie Doern <[email protected]>
@cdoern cdoern force-pushed the fix-telemetry-log branch from 8e78478 to 6bcc1ad Compare August 15, 2025 21:47