[#5152] Write preprocessed P4 to <program_name>.p4pp file when --save-temps option is provided #5153

Open
kfcripps wants to merge 6 commits into p4lang:main from kfcripps:5152

Conversation

@kfcripps
Contributor

Closes #5152.

Marking as draft until we decide:

  • What to name the option - I have named it -P (for "Preprocessed") for now, but @asl also suggested --save-temps.
  • What to name the generated file - I named it repro.p4 (short for "reproducer"), but maybe there is a better name.

@asl
Contributor

asl commented Feb 27, 2025

Not sure why the output should be named repro.p4. For gcc / clang, the convention is:

  • Option is named --save-temps
  • For an input file named "foo.c" the preprocessed output is named foo.i.

@kfcripps
Contributor Author

> Not sure why the output should be named repro.p4. For gcc / clang, the convention is:
>
>   • Option is named --save-temps
>   • For an input file named "foo.c" the preprocessed output is named foo.i.

Sure, that sounds reasonable to me.

@ChrisDodd
Contributor

Isn't there already a -E option that just runs the preprocessor and writes that to stdout?

@kfcripps
Contributor Author

@ChrisDodd Yes, but the compiler exits immediately after doing so - the purpose of this option is to be able to save the preprocessed P4 and continue with compilation instead of exiting.

@vgurevich

Not sure if it matters, but the existing Tofino implementation names this file as foo.p4pp

@vgurevich

vgurevich commented Feb 28, 2025

The other thing I'd strongly recommend if you want to do it in the frontend: sanitize the preprocessed file by not including the standard system includes (i.e. core.p4 and/or your own architectural files), or at least have an option to do that. That means the file will have the user's code preprocessed, but will still use #include <core.p4> and #include <arch.p4>.

This will make it much easier to reproduce old issues or to notice compatibility issues whenever you change your arch files. Believe me, you'll be glad you did it.
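
As an illustration of the sanitizing idea, here is a minimal sketch that folds expanded system includes back into #include lines. It assumes the preprocessor emits GCC-style linemarkers (# <line> "<file>" <flags>, where flag 1 means "entering a file" and flag 2 means "returning to a file"). The function names and the system include directory used in the example are hypothetical and not part of p4c.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Parse a GCC-style linemarker: # <line> "<file>" <flags...>
// On success, fills `file`, `enters` (flag 1) and `returns` (flag 2).
bool parseLineMarker(const std::string &line, std::string &file, bool &enters, bool &returns) {
    if (line.size() < 3 || line[0] != '#' || line[1] != ' ') return false;
    size_t open = line.find('"');
    size_t close = line.rfind('"');
    if (open == std::string::npos || close <= open) return false;
    file = line.substr(open + 1, close - open - 1);
    enters = returns = false;
    std::istringstream flags(line.substr(close + 1));
    int flag;
    while (flags >> flag) {
        if (flag == 1) enters = true;
        if (flag == 2) returns = true;
    }
    return true;
}

// Replace the expanded body of every file under `systemDir` (an assumed
// install location) with a single #include <name> line.
std::string foldSystemIncludes(const std::string &preprocessed, const std::string &systemDir) {
    std::istringstream in(preprocessed);
    std::string out, line, file;
    int depth = 0;  // > 0 while inside a folded system include
    bool enters = false, returns = false;
    while (std::getline(in, line)) {
        if (parseLineMarker(line, file, enters, returns)) {
            if (depth > 0) {
                if (enters) ++depth;   // nested include inside the system file
                if (returns) --depth;  // back to the including file
                if (depth > 0) continue;
            } else if (enters && file.compare(0, systemDir.size(), systemDir) == 0) {
                depth = 1;
                out += "#include <" + file.substr(file.find_last_of('/') + 1) + ">\n";
                continue;
            }
        } else if (depth > 0) {
            continue;  // drop text that came from the system file
        }
        out += line + "\n";
    }
    return out;
}
```

User code and the linemarkers that refer to it pass through unchanged, so error locations in the user's file would still be reported correctly.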

@fruffy added the core label (Topics concerning the core segments of the compiler: frontend, midend, parser) on Feb 28, 2025
@kfcripps
Contributor Author

kfcripps commented Mar 3, 2025

> Not sure if it matters, but the existing Tofino implementation names this file as foo.p4pp

Giving the generated preprocessed file a .p4pp suffix sounds good to me.

> The other thing I'd strongly recommend if you want to do it in the frontend: sanitize the preprocessed file by not including the standard system includes (i.e. core.p4 and/or your own architectural files), or at least have an option to do that. That means the file will have the user's code preprocessed, but will still use #include <core.p4> and #include <arch.p4>.
>
> This will make it much easier to reproduce old issues or to notice compatibility issues whenever you change your arch files. Believe me, you'll be glad you did it.

I currently have no strong motivation to do that, but I am not opposed if someone else wants to add an option to do that in the future.

@vlstill
Member

vlstill commented Mar 3, 2025

My opinion is that --save-temps sounds like a reasonable option and I would say .p4pp is better than just .p4.

I have no strong opinion about leaving system includes, although having it configurable (maybe with default being "leaving" #include) could be best. The disadvantage of desugaring back to #include is that it can hide modifications (which is probably irrelevant for just preprocessing) and obviously it excludes the text of the includes so some issues might be accidentally resolved by later changes.

newName += baseSuffix;
newName += name.extension();
if (savePreprocessed) {
std::filesystem::path fileName = makeFileName(dumpFolder, "repro.p4", "");

I think that using <program_name>.p4pp would be better, compared to having a fairly arbitrary fixed name.

/// if true preprocess only
bool doNotCompile = false;
/// if true save preprocessed P4 to repro.p4
bool savePreprocessed = false;

If you decide to address my first comment, then, obviously, this comment needs to be changed too.

@kfcripps
Contributor Author

Updates:

  • Renamed the option to --save-temps
  • Renamed the generated file to filename.p4pp
  • Original code was broken because the stream `in` was consumed for repro.p4 generation and was empty by the time actual program parsing occurred. The new code runs cpp twice and uses a second FILE stream for filename.p4pp generation instead.
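
The failure mode in the last bullet can be reproduced in isolation. The sketch below is not the PR's code: drain and makeStream are hypothetical helpers, and a temporary file stands in for the preprocessor pipe. It shows that once a FILE stream has been read to the end, a second reader gets nothing. A real popen() pipe behaves the same way and, unlike a temporary file, cannot be rewound.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Read everything that is left in `in`. Afterwards the stream is at EOF.
std::string drain(FILE *in) {
    assert(in != nullptr);
    std::string out;
    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) out.append(buf, n);
    return out;
}

// Hypothetical stand-in for the preprocessor's output stream.
FILE *makeStream(const char *text) {
    FILE *f = tmpfile();
    if (f == nullptr) return nullptr;
    fputs(text, f);
    rewind(f);
    return f;
}
```

Calling drain twice on the same stream returns the text the first time and an empty string the second time, which is what the parser saw in the original version.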

@kfcripps kfcripps marked this pull request as ready for review March 18, 2025 17:54
@kfcripps kfcripps requested a review from vgurevich March 18, 2025 17:54
@kfcripps kfcripps changed the title from "[#5152] Write preprocessed P4 to repro.p4 file when -P option is provided" to "[#5152] Write preprocessed P4 to <program_name>.p4pp file when --save-temps option is provided" on Mar 19, 2025
Member

@vlstill vlstill left a comment

I think the double preprocessing is a bit unfortunate (it is wasteful and in an extreme case, they could produce different results because of changes on the filesystem). Maybe instead if we have --save-temps we should direct the preprocessor output to the p4pp file and then open that file for the parser.
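
A minimal sketch of that single-pass flow, assuming a POSIX environment with popen(): run the command once, copy its output to the saved file, then reopen the saved file for the consumer. The name runAndSave and the FilePtr alias are hypothetical and do not come from p4c.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <string>

using FilePtr = std::unique_ptr<FILE, int (*)(FILE *)>;

// Run `cmd`, copy its stdout to `savePath`, then reopen `savePath` for reading.
// Returns an empty pointer on any failure.
FilePtr runAndSave(const std::string &cmd, const std::string &savePath) {
    FilePtr pipe(popen(cmd.c_str(), "r"), pclose);
    if (!pipe) return FilePtr(nullptr, fclose);
    {
        FilePtr out(fopen(savePath.c_str(), "w"), fclose);
        if (!out) return FilePtr(nullptr, fclose);
        char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, pipe.get())) > 0) {
            if (fwrite(buf, 1, n, out.get()) != n) return FilePtr(nullptr, fclose);
        }
    }  // `out` is flushed and closed here, before the file is reopened
    // A non-zero status means the command itself failed.
    if (pclose(pipe.release()) != 0) return FilePtr(nullptr, fclose);
    return FilePtr(fopen(savePath.c_str(), "r"), fclose);
}
```

Because the parser reads exactly the bytes that were saved, the saved file and the compiled program cannot diverge, and the returned stream is closed with fclose rather than pclose.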

Comment on lines 453 to 439
  #ifdef __clang__
- std::string cmd("cc -E -x c -Wno-comment");
+ cmd = "cc -E -x c -Wno-comment";
  #else
- std::string cmd("cpp");
+ cmd = "cpp";
  #endif
Member

Why was the cmd var moved outside the if? When you are already touching this, I suggest converting it to a plain C++ if. Both branches are valid C++ after all. It could even be a ternary op: std::string cmd = (__clang__) ? ... : ...; (I would add the extra brackets because it is a macro, but maybe formatter will not like them).

Contributor Author

@kfcripps kfcripps May 12, 2025

I moved the cmd declaration back inside the #ifdef, but I'm not sure that macros can be used inside of regular if statements / ternary operations. When I tried doing this, I got the error:

../../../frontends/common/parser_options.cpp: In member function 'std::optional<std::unique_ptr<_IO_FILE, void (*)(_IO_FILE*)> > P4::ParserOptions::preprocess() const':
../../../frontends/common/parser_options.cpp:435:27: error: '__clang__' was not declared in this scope
  435 |         std::string cmd = __clang__ ? "cc -E -x c -Wno-comment" : "cpp";
      |                           ^~~~~~~~~
ninja: build stopped: subcommand failed.
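
The error is expected: __clang__ is predefined only by Clang, so under GCC it is not a declared name and can be tested with #ifdef but cannot appear in a C++ expression. One way to still get a plain C++ conditional is to translate the macro into a constant first. A minimal sketch, where kBuiltWithClang and preprocessorCommand are hypothetical names:

```cpp
#include <cassert>
#include <string>

// Turn the presence of the macro into an ordinary compile-time constant.
#ifdef __clang__
constexpr bool kBuiltWithClang = true;
#else
constexpr bool kBuiltWithClang = false;
#endif

// Plain C++ selection; both branches are compiled on every compiler.
std::string preprocessorCommand() {
    return kBuiltWithClang ? "cc -E -x c -Wno-comment" : "cpp";
}
```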

@kfcripps
Contributor Author

> I think the double preprocessing is a bit unfortunate (it is wasteful and in an extreme case, they could produce different results because of changes on the filesystem). Maybe instead if we have --save-temps we should direct the preprocessor output to the p4pp file and then open that file for the parser.

This does seem better to me too. Done.

@kfcripps kfcripps requested a review from vlstill May 12, 2025 22:39
kfcripps added 5 commits May 23, 2025 07:32
Signed-off-by: Kyle Cripps <kyle@pensando.io>
…program.p4pp.

Signed-off-by: Kyle Cripps <kyle@pensando.io>
Signed-off-by: kfcripps <kyle@pensando.io>
Signed-off-by: kfcripps <kyle@pensando.io>
Signed-off-by: Kyle Cripps <kyle@pensando.io>
@kfcripps kfcripps requested review from asl and vlstill and removed request for asl and vlstill May 27, 2025 17:23
Signed-off-by: Kyle Cripps <60898032+kfcripps@users.noreply.github.com>

Copilot AI left a comment

Pull Request Overview

This PR implements the --save-temps option to save preprocessed P4 code to a <program_name>.p4pp file without exiting compilation, addressing issue #5152.

Key Changes:

  • Adds --save-temps command-line option to save preprocessed output
  • Implements logic to write preprocessed content to a .p4pp file and continue compilation
  • Updates help text across test output files to document the new option

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 11 comments.

File | Description
frontends/common/parser_options.h | Adds savePreprocessed boolean flag to track when the option is enabled
frontends/common/parser_options.cpp | Implements --save-temps option registration and preprocessing logic to save and reopen the preprocessed file
testdata/p4_16_errors_outputs/issue4365_no_disable_error.p4-stderr | Updates help text to include the new --save-temps option
testdata/p4_16_errors_outputs/issue4365_no_demote_error_to_warning.p4-stderr | Updates help text to include the new --save-temps option
testdata/p4_16_errors_outputs/issue4365_no_demote_error_to_info.p4-stderr | Updates help text to include the new --save-temps option

in = fopen(fileName.c_str(), "r");
if (in == nullptr) {
::P4::error(ErrorType::ERR_IO, "Error invoking preprocessor");

Copilot AI Nov 13, 2025

The error message "Error invoking preprocessor" is misleading here since the preprocessor already succeeded. The actual error is that the saved preprocessed file cannot be reopened for reading. Consider a more accurate error message like "Error opening preprocessed file for reading" or "Failed to reopen saved preprocessed file".

Suggested change
- ::P4::error(ErrorType::ERR_IO, "Error invoking preprocessor");
+ ::P4::error(ErrorType::ERR_IO, "Error opening preprocessed file for reading");

savePreprocessed = true;
return true;
},
"Saves preprocessed P4 to filename.p4pp and do not exit compilation.");

Copilot AI Nov 13, 2025

Grammatical error: "do not exit compilation" should be "does not exit compilation" to match the singular subject of the sentence.

Suggested change
- "Saves preprocessed P4 to filename.p4pp and do not exit compilation.");
+ "Saves preprocessed P4 to filename.p4pp and does not exit compilation.");

-U arg Undefine macro (passed to preprocessor)
-E Preprocess only, do not compile (prints program on stdout)
-M Output `make` dependency rule only (passed to preprocessor)
--save-temps Saves preprocessed P4 to filename.p4pp and do not exit compilation.

Copilot AI Nov 13, 2025

Grammatical error: "do not exit compilation" should be "does not exit compilation" to match the singular subject of the sentence.

Suggested change
- --save-temps Saves preprocessed P4 to filename.p4pp and do not exit compilation.
+ --save-temps Saves preprocessed P4 to filename.p4pp and does not exit compilation.

Comment on lines +475 to +477
std::filesystem::path fileName(file.stem());
fileName += ".p4pp";
fileName = makeFileName(dumpFolder, fileName, "");

Copilot AI Nov 13, 2025

[nitpick] The code unnecessarily constructs fileName in lines 475-476 only to pass it to makeFileName which deconstructs and reconstructs it. This can be simplified by calling makeFileName(dumpFolder, file, ".p4pp") directly, which would be clearer and avoid the intermediate construction.

Suggested change
- std::filesystem::path fileName(file.stem());
- fileName += ".p4pp";
- fileName = makeFileName(dumpFolder, fileName, "");
+ std::filesystem::path fileName = makeFileName(dumpFolder, file, ".p4pp");
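
For reference, the intended name can be built with std::filesystem alone. The sketch below uses a hypothetical helper, preprocessedPath, and does not call makeFileName. Judging only from the makeFileName fragment quoted earlier in this thread (newName += baseSuffix; newName += name.extension();), that helper may re-append the input's original extension after the suffix, in which case the one-line suggestion would not produce foo.p4pp. That is an inference from the fragment, not something verified against the source.

```cpp
#include <cassert>
#include <filesystem>

// Hypothetical helper: <folder>/<stem of input>.p4pp
std::filesystem::path preprocessedPath(const std::filesystem::path &folder,
                                       const std::filesystem::path &input) {
    std::filesystem::path name = input.stem();  // "dir/foo.p4" -> "foo"
    name += ".p4pp";                            // "foo" -> "foo.p4pp"
    return folder / name;
}
```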

std::stringstream stream;
char *line = nullptr;
size_t len = 0;
ssize_t read = 0;

Copilot AI Nov 13, 2025

[nitpick] Unused variable: The variable read is assigned but never used. The return value of getline is only needed to check for end-of-file, so you can remove the variable declaration and check the condition directly: while (getline(&line, &len, in) != -1).

std::ofstream filestream{fileName};
if (filestream) {
if (Log::verbose()) std::cerr << "Writing preprocessed P4 to " << fileName << std::endl;
filestream << stream.str();

Copilot AI Nov 13, 2025

Missing error handling for file write failure. If the file write fails (e.g., due to disk space issues or permission problems), the code silently continues. Consider adding error reporting if !filestream after opening or if the write operation fails.

Suggested change
- filestream << stream.str();
+ filestream << stream.str();
+ if (!filestream) {
+ ::P4::error(ErrorType::ERR_IO, "Failed to write preprocessed P4 to %s", fileName.c_str());
+ perror("");
+ filestream.close();
+ return std::nullopt;
+ }
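
The same check can be shown in isolation. This is a minimal sketch with a hypothetical helper, writeFile, that reports both open failures and write failures through the stream state. It flushes before testing so that buffered data has actually reached the file.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Returns true only if the file was opened and all data was written.
bool writeFile(const std::string &path, const std::string &contents) {
    std::ofstream out(path);
    if (!out) return false;  // could not open, e.g. missing directory or permissions
    out << contents;
    out.flush();             // force buffered data out so errors show up now
    return static_cast<bool>(out);
}
```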

Comment on lines +466 to +472
char *line = nullptr;
size_t len = 0;
ssize_t read = 0;

return folder / newName;
while ((read = getline(&line, &len, in)) != -1) {
stream << line;
}

Copilot AI Nov 13, 2025

Memory leak: The line buffer allocated by getline is never freed. According to POSIX standards, getline allocates memory that must be freed by the caller. Add free(line) after the while loop to prevent memory leaks.
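
A minimal sketch of the loop with the buffer released, assuming POSIX getline(). The helper name readAllLines is hypothetical. Using the returned length in append also gives the return value a purpose, which addresses the unused-variable nitpick raised elsewhere in this review.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <sys/types.h>  // ssize_t

// Read the rest of `in` line by line and return it as one string.
std::string readAllLines(FILE *in) {
    assert(in != nullptr);
    std::string text;
    char *line = nullptr;  // getline() allocates and grows this buffer
    size_t cap = 0;
    ssize_t n;
    while ((n = getline(&line, &cap, in)) != -1) {
        text.append(line, static_cast<size_t>(n));
    }
    free(line);  // the caller owns the buffer, even if getline() failed
    return text;
}
```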

while ((read = getline(&line, &len, in)) != -1) {
stream << line;
}
closeFile(in);

Copilot AI Nov 13, 2025

Bug: Calling closeFile(in) on stdin will cause issues. When file == "-", in is set to stdin (line 433), but closeFile calls pclose() which should only be used on streams opened with popen(). This will result in undefined behavior. The code should check if in == stdin before calling closeFile(in), or handle stdin differently.
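
Assuming, as the comment states, that closeFile ends in pclose(), one way to handle this is to record how each stream was obtained and close it accordingly. A minimal sketch, where StreamKind and closeStream are hypothetical names rather than p4c APIs:

```cpp
#include <cassert>
#include <cstdio>

enum class StreamKind { Standard, File, Pipe };

// Close `f` with the function that matches how it was opened.
// Returns 0 on success. Standard streams such as stdin are left open.
int closeStream(FILE *f, StreamKind kind) {
    if (f == nullptr) return 0;
    switch (kind) {
        case StreamKind::Standard: return 0;          // never close stdin
        case StreamKind::File:     return fclose(f);  // opened with fopen()/tmpfile()
        case StreamKind::Pipe:     return pclose(f);  // opened with popen()
    }
    return -1;
}
```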

Labels

core Topics concerning the core segments of the compiler (frontend, midend, parser)

Development

Successfully merging this pull request may close these issues.

Add option to print preprocessed P4 and continue with compilation

7 participants