Skip to content

Testing#66

Open
svenssonaxel wants to merge 6 commits intomartinpitt:mainfrom
svenssonaxel:testing
Open

Testing#66
svenssonaxel wants to merge 6 commits intomartinpitt:mainfrom
svenssonaxel:testing

Conversation

@svenssonaxel
Copy link
Contributor

@svenssonaxel svenssonaxel commented Aug 5, 2025

@svenssonaxel svenssonaxel marked this pull request as draft August 5, 2025 23:29
tests/test.py Outdated
def convert(self, line: str):
return json.loads(line)

def parse_fatrace_text_line(line):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR, I'm not a fan of this. This logic is way too complex for a test -- you would now need a whole unit test for this function.

@svenssonaxel
Copy link
Contributor Author

svenssonaxel commented Aug 6, 2025 via email

@martinpitt
Copy link
Owner

Yes, maybe, but

  • it forces all tests to be in JSON mode, and thus makes them harder to write. Text streaming output is still a valid (and at least for me, the primary) mode of operation, and I want to test it as such.
  • Debugging and extending this parsing is unnecesarily hard.
  • As you can see on the diff stat, the effort of writing such a generalization is way more expensive than direct tests, which are also easier to understand and adjust.

Sorry, I don't want to take this.

@svenssonaxel
Copy link
Contributor Author

it forces all tests to be in JSON mode

FTR, you can use FatraceRunnerText independently, similarly to how I use FatraceRunnerJson independently in test_json. So, FatraceRunner allows, but does not force all tests to be in JSON (or rather "structured") mode.

As you can see on the diff stat, the effort of writing such a generalization is way more expensive than direct tests, which are also easier to understand and adjust.

FTR, this PR is 161 insertions(+), 179 deletions(-). The stats shown on GitHub includes #60 since it's not merged yet.

Sorry, I don't want to take this.

That's fair, feel free to close it.

@martinpitt
Copy link
Owner

So, FatraceRunner allows, but does not force all tests to be in JSON (or rather "structured") mode.

Ah right -- sorry, I didn't read it carefully enough.

The stats shown on GitHub includes #60 since it's not merged yet.

Oops, sorry! I just took a quick look so far, I'm stuck in $daytime job work mostly.

@svenssonaxel
Copy link
Contributor Author

So, FatraceRunner allows, but does not force all tests to be in JSON (or rather "structured") mode.

Ah right -- sorry, I didn't read it carefully enough.

Turns out I'm the hasty one here. What I said should be true after the recent push, and as an example I restored test_command to use two separate runners.

@svenssonaxel svenssonaxel force-pushed the testing branch 2 times, most recently from 4cfc374 to 72efee6 Compare September 18, 2025 06:37
@svenssonaxel
Copy link
Contributor Author

@martinpitt Rebased and polished a bit. Ultimately this is a matter of taste, so let me know if this is a go or not. If it's a go, I might fix type annotations a bit.

@svenssonaxel svenssonaxel marked this pull request as ready for review September 18, 2025 06:40
@svenssonaxel svenssonaxel mentioned this pull request Sep 21, 2025
2 tasks
Copy link
Owner

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-read this, and sorry, I still don't like it. It's completely opposite of how I want a test to look like.

The bit that I do like is the the consistent parallel running of text and json, though. I.e. I would actually appreciate writing

self.assert_log(
    lambda e: e["comm"] == "rm" and e["path"] == cwd and e["types"] == "D"),
    rf"^rm.*:\s+D\s+{cwd_re}$"

as that forces doing consistent checks between the two modes, yet keeps the direct readable REs how the output is expected to look like. WDYT?

tests/test.py Outdated
class FatraceRunner:
def __init__(self, args: list[str]):
class FatraceRunnerBase:
def __init__(self, args: list[str], convert_line = lambda x: x, convert_condition = lambda x: x):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the conversion functions here and condition arguments below are missing types. Probably best to add a ConvertLineType constant etc. at the top, with some comment/example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will go away when the "lambda gymnastics" is removed.

tests/test.py Outdated
def has_log(self, condition) -> bool:
"""Check if any line matches the condition."""

if not self.finished:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meh magic side effect, I like the explicit .finish() call better. But it's not a strong "meh".

tests/test.py Outdated

def FatraceRunnerText(*args: str):
assert "--json" not in args
return FatraceRunnerBase([*args], convert_condition = lambda regex: lambda line: bool(re.search(regex, line)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just making self.convert_condition and _line actual methods, instead of this lambda gymnastics? The base function can throw a NotImplementedError for them. Then the parsing code can also move there.

test_file_str = str(test_file)

# file creation
f.assert_log(rf"^touch.*\sC?W?O\s+{re.escape(test_file_str)}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still hate this, really. It's now not clear at all any more how the expected text output is supposed to look like. I often write the test first, as that's a good way to design how it should look like. And then implement the functionality with a good model in my mind. This is completely upside down, and adjusting the big generic parsing function is the opposite of what a test should be -- simple, obvious, orthogonal, robust.

@svenssonaxel
Copy link
Contributor Author

I re-read this, and sorry, I still don't like it. It's completely opposite of how I want a test to look like.

That's fine. I think this is ultimately a matter of taste, and your gοod taste is what made fatrace what it is.

The bit that I do like is the the consistent parallel running of text and json, though. I.e. I would actually appreciate writing

self.assert_log(
    lambda e: e["comm"] == "rm" and e["path"] == cwd and e["types"] == "D"),
    rf"^rm.*:\s+D\s+{cwd_re}$"

as that forces doing consistent checks between the two modes, yet keeps the direct readable REs how the output is expected to look like. WDYT?

I think it's a fine idea. In your taste, REs are more readable than the lambdas, while in my taste they represent unnecessary repetition of ad-hoc parsing logic. Let's go with your taste. How about:

  • The test case provides both a RE and a lambda, as you suggested.
  • The lambda is used to test for presence in the json log
  • The RE is used to test for presence in the text log
  • In addition, every line in the text log is parsed and used to check that the RE and lambda are consistent. I think this is necessary in order to force consistent checks. Otherwise it'd just force parallell checks.

@martinpitt
Copy link
Owner

Hello @svenssonaxel !

I think it's a fine idea. In your taste, REs are more readable than the lambdas, while in my taste they represent unnecessary repetition of ad-hoc parsing logic.

Note that I'm not ranking REs/lambdas for readability (FWIW, JSON is easier to validate, I fully agree). My point is, the text output is a thing, and tests should directly reflect how it looks like, also to design new features.

* The test case provides both a RE and a lambda, as you suggested.
* The lambda is used to test for presence in the json log
* The RE is used to test for presence in the text log

* In addition, every line in the text log is parsed and used to check that the RE and lambda are consistent. I think this is necessary in order to force consistent checks. Otherwise it'd just force _parallell_ checks.

That feels a bit too much effort for me, and it will still require that complicated/hard to maintain translator. But if you really like this, I'll take it 😬

Thanks!

@svenssonaxel
Copy link
Contributor Author

@martinpitt ok, finally got around to it.

I made a significant effort to avoid the parsing logic, but to no avail. What I tried was to run two instances of fatrace in parallel and match up the printed events. Turns out that the kernel can report events merged in different ways to different listeners, so this line of effort turned out to be at least as complex as the parsing.

If it makes you feel better, the parsing did pay off. Using that check, I've found and fixed several critical regex bugs. If you use the github runner on fc43959 Testing: Cross check text and json mode you'll see some failures, which are then fixed in the next commit.

@svenssonaxel svenssonaxel marked this pull request as ready for review November 25, 2025 05:25
Copy link
Owner

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Next round.

Comment on lines 26 to 27
- name: Run tests
run: sudo python3 -m unittest -v
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running linters before test may be a temporary convenience for this, but we shouldn't land this. Clearly functionality is more important than code style? I.e. if a MR fails, that's the more important piece of information than "your line is too long".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will remove.

class Device(TypedDict):
major: int
minor: int
class Parent(TypedDict, total=False):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Why do we have ruff when it doesn't complain about missing newlines between class definitions? I want my money back! 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

Comment on lines +81 to +84
self.log_dir: tempfile.TemporaryDirectory[str] = tempfile.TemporaryDirectory()
self.text_output_file: str = os.path.join(self.log_dir.name, "fatrace.log.txt")
self.json_output_file: str = os.path.join(self.log_dir.name, "fatrace.log.json")
self.finished: bool = False
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOI: These types are rather obvious, and mypy should be perfectly able to figure these out? Did you run into type errors, or is that more of a personal preference? (No big objection from me, just a bit unnecessary clutter)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I got any type errors. I'm happy removing type hints given a rule for which ones.

tests/test.py Outdated
# start processes
self.text_process: subprocess.Popen[bytes] = subprocess.Popen(
[fatrace_bin, "-o", str(self.text_output_file)] +
[x for x in args if x!="--json"])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of asserting it and filtering it our here, wouldn't it be easier to not repeat this in the caller, and prepending/appending it in json_process below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines +293 to +294
f.assert_log(lambda e: e["comm"] == "touch" and e["path"] == test_file_str and "O" in e["types"],
rf"^touch.*\sC?W?O\s+{re.escape(test_file_str)}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR, I like these assertions. They are explicit in how both outputs are expected to look like. 👍

Comment on lines +155 to +156
# For each text mode output line, assert that the regex matches it iff
# the predicate matches it after parsing
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please drop this, together with parse_fatrace_text_line()? The assertion structure already enforces checking text and JSON output. Let's assume that us developers are not malicious against ourselves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you haven't already, take a look at 60bf4ac Fix critical bugs in testing regexes. No malice was required to produce these bugs. They are critical bugs, in the sense that the buggy regexes do not test what they intend to test.

For example, rf"^rm.*:\s+D\s+{cwd_re}$") would result in a passing test even if fatrace didn't output anything at all from any rm command. These bugs were only caught using the assertions you now want to remove; neither of us detected them through manual inspection.

I agree that the parsing code is on the brittle end of the scale. It's also not strictly necessary. My judgment is that it's still a net positive and should remain, until it's no longer a net positive. If a time comes when adjusting or growing the parsing code is more than you can bear, it can then be discarded, safely except for the increased risk of such regex bugs. Removing it now would increase this risk earlier than necessary, and I don't see the point of doing so. WDYT?

tests/test.py Outdated
Comment on lines 47 to 48
def _exe(argv_any: tuple[Any, ...], **kwargs) -> None:
argv: list[str] = [str(x) for x in argv_any]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm allergic to Any. In cases where a function is really just passing on some opaque value to the stdlib, Unknown is better and more explicit. However, that's not what happens here: A process argv must always be a str list. What blows up when declaring it that way?

I see below you are trying to get rid of str()ing pathlib.Path objecs. For quality-of-live it could support that explicitly (Sequence[str|Path]).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

tests/test.py Outdated
def _exe(argv_any: tuple[Any, ...], **kwargs) -> None:
argv: list[str] = [str(x) for x in argv_any]
print(f"Running command: {' '.join(argv)}", file=_exe_stream)
proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=False, **kwargs)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unsafe as a generic API. If a process outputs too much stuff on stdout and nobody reads the buffers, it'll block. The previous code used stdout=NULL for that reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Merge FatraceRunner* classes into one.

Let class FatraceRunner run two instances of fatrace in parallel, one
with and one without --json. Each assert provides a regex for testing
the text mode log and a lambda for testing the JSON mode log.

The regex and lambda should match the same sets of events. However, we
can't rely on the assumption that the two fatrace instances catch the
exact same sequence of events. Instead, for each assert and text mode
output line, we assert that the regex matches the line iff the lambda
matches it after parsing.
These bugs are actual false positives, i.e. they fix regexes that
unintentionally matched an event actually present in that test case,
that the lambda did not match.

- Three instances of searching for `rm` but also matching
  `rmdir`. Fixed by adding `\(`.
- One instance of matching an unintended sub path, fixed by adding
  `$`.
- One instance of matching any path, fixed by adding the intended path.
@svenssonaxel svenssonaxel force-pushed the testing branch 2 times, most recently from 737af3d to b02afc5 Compare December 7, 2025 19:26
@svenssonaxel
Copy link
Contributor Author

@martinpitt ping

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants