-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Add labels to sys.server #18547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add labels to sys.server #18547
Conversation
Since the label is defined as a Map internally, will these configuration work?
|
You are right, I have updated the docs to reflect that both formats are acceptable |
@GabrielCWT , how do you intend to use these labels? |
As of now, the use case is to display the IDC location of the service. Putting it in the web console would make it easier for us to see the locations of all services. This feature is similar to the kubernetes label feature which helps to add identifying attributes to each service. I felt that since it is attached to each service, we should store it in the |
sql/src/main/java/org/apache/druid/sql/calcite/schema/SystemSchema.java
Outdated
Show resolved
Hide resolved
Thanks for sharing the details, @GabrielCWT ! That sounds like a reasonable use case. My only concern was with adding a new type of free-form info to The alternative would be to just bind it to a completely new config object and then it would be served over the |
This was indeed one of my concerns as well but right now the use case will be a maximum of 2-3 values. As you mentioned, an absurdly large number of label values would be a configuration issue.
The main idea is for it to be visible on the web-console and therefore I think sticking to the current implementation is better. |
I think it's better to show the label in the web console. Another use case is that we can inject some container meta data like container name to this property when druid is deployed in K8s, so that we know which service/host maps to which container from web-console |
Yeah, I agree. Labels can have several uses. I was only thinking out loud. |
); | ||
|
||
@JsonProperty | ||
private Map<String, String> labels; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private Map<String, String> labels; | |
private final Map<String, String> labels; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently all variables are being assigned in the init
function, this prevents the object's variables from being final. I will be following this convention and as such not be assigning final
to the variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I missed the init()
call. final
is always better for such bean fields.
But you are right, we cannot do it in this PR.
.add("max_size", ColumnType.LONG) | ||
.add("is_leader", ColumnType.LONG) | ||
.add("start_time", ColumnType.STRING) | ||
.add("labels", ColumnType.NESTED_DATA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think any of the other columns use nested data as the type.
Even columns in sys.segments
like last_compaction_state
, dimensions
, metrics
, etc
use STRING
as the type. I think we should just stick to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I misunderstood our previous discussion but the issue I was facing was that when it was a STRING
type, the data being returned for the Map
was in the form "labels": "{brokerTest=myValue, brokerTest2=myValue2}"
, therefore I used jsonMapper
to serialise the labels.
We then agreed to return it in JSON format thus changing it to NESTED_DATA
type. Just want to confirm that you want to revert it back to the original implementation
node.getLabels() == null ? null : jsonMapper.writeValueAsString(node.getLabels())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see what's happening here. Sorry, I had missed this in the initial pass.
You will need to use a jsonMapper
and keep the column type as STRING
since that is what SegmentsTable
is doing as well but lazilly.
The segments table uses a list of SEGMENTS_JSON_FIELDS
to identify which fields to serialize as json.
Ideally, you would want to do something similar to avoid invoking jsonMapper.writeValueAsString()
every time
we receive a query for sys.servers
so that we serialize that field only when needed.
But for now, you may just stick to what you had originally that is:
node.getLabels() == null ? null : jsonMapper.writeValueAsString(node.getLabels())
The perf improvement can be done later.
Sorry for the confusion 😅 !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the optimization would also require making ServersTable
implement ProjectableFilterableTable
instead of ScannableTable
. Not sure if that change would have any other impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this optimization helps. because on the web-console, the label
is always selected, this is the most scenario that sys.server
table is queried.
Considering the number of servers is not very large(and even the number is huge, at the early phase of this feature, it may not be widely used), the serialization here might NOT be a problem.
If we're going to optimize the serialization in future, my suggestion is to cache the serialized value in the DruidNode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will open up an issue for a possible refactor to implement ProjectableFilterableTable
. It would be useful if the number of fields requiring serialising increases. For now I will stick with what was written originally.
If we're going to optimize the serialization in future, my suggestion is to cache the serialized value in the DruidNode.
This was also an optimisation I was thinking of. What are your thoughts? @kfaraz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the serialization problem, I think the fundament problem here is that JSON is not supported by calcite. Not sure if there's a way to make it at the calcite side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to optimize the serialization in future, my suggestion is to cache the serialized value in the DruidNode.
Yeah, I suppose that is possible by passing a @JacksonInject ObjectMapper
into the DruidNode
constructor.
But it really seems unnecessary at this point.
@FrankChen021 , as you mention, serializing a small map is not a costly operation, we need not worry about that optimization right now.
I should think that there are many more costlier serialization steps involved in a single SELECT * FROM sys.servers
query.
If the labels map becomes large, then we have other problems as mentioned before anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this optimization helps. because on the web-console, the label is always selected, this is the most scenario that sys.server table is queried.
Considering the number of servers is not very large(and even the number is huge, at the early phase of this feature, it may not be widely used), the serialization here might NOT be a problem.
+1
sql/src/test/java/org/apache/druid/sql/calcite/schema/SystemSchemaTest.java
Outdated
Show resolved
Hide resolved
@vogievetsky, requested your review since this PR also touches multiple web-console files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some final non-blocking comments.
+1 after CI passes.
Best to get an approval from @vogievetsky too before merging this off.
isLeader ? 1L : 0L, | ||
toStringOrNull(discoveryDruidNode.getStartTime()) | ||
}; | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than adding the try-catch in the 2 places, better to add a new method in JacksonUtils
named writeValueAsString()
which catches the JsonProcessingException
and throws a DruidException
instead. It will be useful for other places in the code too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is my initial implementation, let me know if you have any comments about it.
public static String writeValueAsString(ObjectMapper jsonMapper, Object value) throws DruidException
{
try {
return jsonMapper.writeValueAsString(value);
}
catch (JsonProcessingException e) {
throw InvalidInput.exception(e, "Failed to serialize object as JSON");
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throw InternalServerError.exception()
since we don't know exactly why the serialization failed.
sql/src/test/java/org/apache/druid/sql/calcite/schema/SystemSchemaTest.java
Show resolved
Hide resolved
sql/src/test/java/org/apache/druid/sql/calcite/schema/SystemSchemaTest.java
Outdated
Show resolved
Hide resolved
Hi @vogievetsky |
This PR adds support for labels being configured in the config file. It uses the key druid.labels and expects the value to be a JSON structure of key-value pairs.
The labels will be displayed in the web-console under the "Services" tab.
Example:
druid.labels={"broker-label":"broker-value","location": "Airtrunk"}
Below is a screenshot of the implementation with different labels for different node types. Note that some columns have been hidden to capture the important details.