-
Notifications
You must be signed in to change notification settings - Fork 584
Lsof improvements (show deleted as in lsof output) + files_only argument #1827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lsof improvements (show deleted as in lsof output) + files_only argument #1827
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, a couple of things fell out of it. Firstly, messing with the core linux extensions can have knock on effects, which is why we version everything. There is a way of updating it so that not everything breaks, but that one's a show stopper.
The other is just about appending to filenames to represent information about the file. If we're going to do that, it has to be unambiguously separable from the filename (ie, you could always tell what was the tag and what was the filename).
return f"<potentially smeared> {path}" | ||
|
||
if inode and inode.is_readable() and inode.is_valid() and inode.i_nlink == 0: | ||
path = f"{LinuxUtilities.deleted} {path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tend to put constants into the constants.linux
module, as all caps? Again, I'm not sure it's worthwhile for a tag like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if the tag value will change in the future? I can see other modules rely on this tag to check specifics...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if it fits there, I have only seen constants from linux kernel header files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really not keen on tacking flags at the start of the path. It feels more and more like there should be a structure or a method that can determine this, that's separate to the filename. I don't really like it for smear, but at least the filename can be smeared. It's not the filename that's deleted, it's the file, this should be a property of the file object, or something you can pass the file to which tells you whether it's been deleted or not. That then also solves the need for string based flags to say meta things about the file that shouldn't actually be part of the filename.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about the logic in 8f60299?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a follow up to my own comment, we already show (deleted)
correctly, which was fixed in the parity push, in our d_path
replication, so whatever new PRs replace/enhance the existing code need to keep it. Link here:
return "/" + pre_name + " (deleted)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so my bad, apparently this is how the kernel does it (and it does it at the end). Sorry for messing you about @SolitudePy . I do still need @atcuno to do a review of it (because clearly I don't know linux that deeply), but he's very busy in the run up to blackhat/defcon so this may have to just wait a while I'm afraid. In the interim, feel free to move the string back to the end of the filename, sorry for the confusion I caused! 5:S
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well I absolutely agree that it shouldnt be in the end for the problems you have mentioned, but thats convention. were there other issues I should fix except that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SolitudePy @ikelos what is the latest on this now that the "(deleted)" string is understood?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably waiting on me. I'm still not happy about it, but apparently this is how the kernel does it. I absolutely wash my hands of resolving any difficulties that arise later from us doing things this way though. It seems ill thought out by the kernel to conflate the filename and its deleted state. I'll merge it though, thanks for your work on it (and trying your best to appease our differing opinions @SolitudePy )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much improved, last little bits and then it should be good to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not happy with tagging filesnames with a string to identify them. It feels like a quick solution that could be setting us up for a whole heap of trouble in the future. Hopefully @atcuno can give this a look and suggest a better means of handling this?
return f"<potentially smeared> {path}" | ||
|
||
if inode and inode.is_readable() and inode.is_valid() and inode.i_nlink == 0: | ||
path = f"{LinuxUtilities.deleted} {path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really not keen on tacking flags at the start of the path. It feels more and more like there should be a structure or a method that can determine this, that's separate to the filename. I don't really like it for smear, but at least the filename can be smeared. It's not the filename that's deleted, it's the file, this should be a property of the file object, or something you can pass the file to which tells you whether it's been deleted or not. That then also solves the need for string based flags to say meta things about the file that shouldn't actually be part of the filename.
|
||
fd_generator = linux.LinuxUtilities.files_descriptors_for_process( | ||
context, linuxutils_symbol_table, task | ||
context, linuxutils_symbol_table, task, files_only=include_files_only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure this can be determined from the file desriptor, and therefore doesn't require tampering with the files_descriptors_for_process
method? I really can't believe that we get this structure back, but we can only separate out files from not files within that method? I'd also prefer that we just add it to the file descriptor to say "type is file" rather than this overly specific additional parameter added to this method...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean the method files_descriptors_for_process
should return file type as well that will then be used in linux.lsof
to filter out non-files? right now linux.lsof
uses files_descriptors_for_process
directly, which uses LinuxUtilities.path_for_file
- and that's where the filtering happens via dname_is_valid
, so both methods need to have this argument. or I can also reuse the dname_is_valid
logic in linux.lsof
which seems less elegant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it feels like it should be one of the fd_fields
returned in the generator and then handled by FDInternal
. It might be a lot of work to thread it in that way, and I haven't even considered the version bumps that might be required to achieve it, plus I've no idea whether @atcuno would be happy doing that, but there's no indication that there aren't more fields (requiring more boolean flags to be added to the method) still to come. As such I'd like to find an extensible way of adding those flags, and given the entire method is about returning a set of flags, it feels as though this should be added to that existing set of flags...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, my view on these hasn't really changed. Rather than designing the data flow properly, we're trying to carry the information by cramming it into other fields which is then extremely difficult to separate at a later date. It's the wrong way of doing it, and though I don't know how much work the right way would be, this is not it.
|
||
fd_generator = linux.LinuxUtilities.files_descriptors_for_process( | ||
context, linuxutils_symbol_table, task | ||
context, linuxutils_symbol_table, task, files_only=include_files_only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it feels like it should be one of the fd_fields
returned in the generator and then handled by FDInternal
. It might be a lot of work to thread it in that way, and I haven't even considered the version bumps that might be required to achieve it, plus I've no idea whether @atcuno would be happy doing that, but there's no indication that there aren't more fields (requiring more boolean flags to be added to the method) still to come. As such I'd like to find an extensible way of adding those flags, and given the entire method is about returning a set of flags, it feels as though this should be added to that existing set of flags...
return f"<potentially smeared> {path}" | ||
|
||
if inode and inode.is_readable() and inode.is_valid() and inode.i_nlink == 0: | ||
path = f"{LinuxUtilities.deleted} {path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's still fundamentally trying to cram to pieces of information into a single field. Whether the file has been deleted or not is distinct from the filename it has. We should be using a separate piece of information to track that, not overloading the filename field with two pieces of information that we'll later have difficult separating.
Improvements for Lsof to show deleted files as in lsof output on live system.
its probably shouldnt be in that function directly as its no longer mimic
prepend_path
kernel functionthis PR breaks commit 68b51e8 but it can easily be fixed if desired.