-
Notifications
You must be signed in to change notification settings - Fork 1.1k
feat: telemetry logging fixes #3133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
43a2012
to
ed10d87
Compare
llama_stack/log.py
Outdated
dictConfig(logging_config) | ||
|
||
# Ensure third-party libraries follow the root log level | ||
for _, logger in logging.root.manager.loggerDict.items(): | ||
if isinstance(logger, logging.Logger): | ||
logger.setLevel(root_level) | ||
|
||
# Explicitly silence telemetry loggers when telemetry is not enabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- why is this needed? isn't the filter above good enough?
- all of this feels a bit too complex, trying to see if there is a simpler way to accomplish it all. the
telemetry
category already exists why doesn'ttelemetry=info
work right now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, this might be some leftover from different versions I was trying, I think you are right I can remove it.
As for:
why doesn't telemetry=info work right now?
The filter is necessary because all logging categories have a default of info
, so If I am adding telemetry logs that deserve to be "info" and I, as someone who cares about telemetry in the logs, want to see them, I can just do telemetry=info
and I will see them all. the default state just for the telemetry category is disabled so regular folks don't get bombarded with logs unless they explicitly set to info
.
The issue is that info
is the default state for all these categories, so some special handling needs to be done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. I made the logic a fair bit simpler. Basically just the filter and figuring out if _telemetry_enabled
is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that info is the default state for all these categories, so some special handling needs to be done
what if we simply made warn
be the default for the telemetry category? then a user doing telemetry=info
would start seeing everything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm yeah, I think that sounds pretty reasonable!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one final thought:
now we made it so default level is magically warn, so anyone who sees logging code in telemetry will see log.info() but won't see logs.
alternately, we could have made what I did -- which is made logs in telemetry be log.debug() and then you would need to specify telemetry=debug to see the logs.
I kind of feel the latter is slightly more transparent to the user and requires no other surreptitious change to the system. thoughts @ehhuang @raghotham others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is a fair point. I just don't think everything in files like https://github.com/llamastack/llama-stack/blob/46ff302d87562cf266d2a304f7409593ac7bb0ca/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py "deserves" to be hidden behind a debug level. I think its a greater value to have logger...
calls have their proper leveling and instead change categories' default leveling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cdoern on that one, I definitely agree with you and had acknowledged as such in my PR as well. we need to make the output higher signal not completely kill it. neither my PR nor yours helps with that I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashwinb what if I added some concept of "priority" beyond the levels to the logger such that the default level for telemetry (and all other levels) was INFO
, but using a numerical priority (which we can add to the logging config) a user won't see ALL info logs by default unless they bump down their priority? Just an idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what I'd want to push on is to make the defaults better and higher signal. so for example, what do I hate?
- there is just so much repetition. uvicorn is going to emit a line. then our telemetry logger emits very repetitive junk, so that's not useful. let's make uvicorn take care of all of that?
- the metrics outputs -- they are multi-line on the console which expands it too much, can they be json instead?
- in an ideal world, if we are going to emit something automatically on an aggregate level for a request, it should be a single compact, nice line in the log.
- now beyond that, you can add a couple bells and whistles if you want more stuff but then you could just hide it behind a simple
log.debug()
227544c
to
e14cdb8
Compare
7e49dfd
to
39136a5
Compare
da2866d
to
8e78478
Compare
this does a few things: 1. fixes `on_start` so that all span [START] and [END] is printed. not just [END] 2. change `log.py` to set the default `telemetry` category to WARN not INFO This allows us to keep the metric logging and the verbosity of seeing the span [START] and [END] but by default hides it from normal users. This conforms to our logging system since a user just need to switch the category to INFO to see the logs Signed-off-by: Charlie Doern <[email protected]>
8e78478
to
6bcc1ad
Compare
What does this PR do?
this does a few things:
on_start
so that all span [START] and [END] is printed. not just [END]log.py
to set the defaulttelemetry
category to WARN not INFOThis allows us to keep the metric logging and the verbosity of seeing the span [START] and [END] but by default hides it from normal users.
This conforms to our logging system since a user just need to switch the category to INFO to see the logs
Test Plan
without setting any env variables:
set
export LLAMA_STACK_LOGGING=telemetry=info
see all metrics, chat completion info, etc.