[build] Reformatter not reaching a fixed point, removed information #3895

@kaby76

@mike-lischke

This is a problem with the recently added antlr-format. The tool fails to reach a fixed point: if I take a file that was output by the tool and rerun the tool on it, the tool should return exactly the same file, but it does not. In addition, it removed a comment that should not have been removed. It's essential that the antlr-format tool outputs a format that reliably follows the coding standard.

Here are four successive versions of PlSqlLexer.g4, the first being the initial version. In the second through fourth versions, the START_CMD rule was changed by antlr-format, each time to a different format.

  1. Version 7fbb97b, committed 4 months ago.

  2. Version 7535367, committed 3 weeks ago. This file is the 1st reformat using antlr-format, which was part of the PR to reformat all the grammars.

  3. Version f083ee2. This file is the 2nd reformat using antlr-format, associated with my PR to perform "auto reformat".

  4. Version be1d809. This file is the 3rd reformat using antlr-format, committed 12 hours ago, and associated with a PR to modify PlSql.

Each version of the file was altered only by the tool.

Starting with the 1st version of START_CMD:

// TODO: should starts with newline
START_CMD
//: 'STA' 'RT'? SPACE ~('\r' | '\n')* NEWLINE_EOF
// https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12002.htm
// https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12003.htm
: '@''@'?
;

After 1st application of antlr-format:

// TODO: should starts with newline
START_CMD
//: 'STA' 'RT'? SPACE ~('\r' | '\n')* NEWLINE_EOF
: // https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12002.htm
'@' '@'?
; // https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12003.htm

The 2nd application of antlr-format (from my PR) reformats it again.

// TODO: should starts with newline
START_CMD
: // https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12002.htm
'@' '@'?
; // https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12003.htm

Finally, the last PR applies antlr-format a third time, changing the rule again.

// TODO: should starts with newline
START_CMD: // https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12002.htm
'@' '@'?
; // https://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12003.htm

Analysis

The comment //: 'STA' 'RT'? SPACE ~('\r' | '\n')* NEWLINE_EOF was removed on the 2nd application of antlr-format. The formatter should not remove information.
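
A loss like this could be caught automatically by comparing the line comments before and after formatting. The following is only a minimal sketch and is not part of antlr-format. The function name lost_comments is hypothetical, it looks only at // line comments (block comments are ignored), and it would also pick up a // that happens to sit inside a string literal.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of antlr-format): print every `//` line
# comment that appears in the first file but not in the second.
# Trailing whitespace is trimmed before comparing, and only the comment text
# is compared, so a comment that merely moved to another line is not reported.
lost_comments() {
    local before=$1 after=$2
    comm -23 \
        <(grep -o '//.*' "$before" | sed 's/[[:space:]]*$//' | sort -u) \
        <(grep -o '//.*' "$after"  | sed 's/[[:space:]]*$//' | sort -u)
}
```

For example, lost_comments PlSqlLexer.g4.before PlSqlLexer.g4.after prints nothing when every line comment survives the reformat, and prints the missing comment text otherwise.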

Of the 16 grammar files that were reformatted with my PR (f083ee2, https://github.com/antlr/grammars-v4/actions/runs/7242750577/job/19728583870#step:6:11 ), most seem to have only minor changes. But the formatter should output a fixed-point version of the grammar on the first try. Otherwise I will have to repeat the application until a fixed point is achieved.

I wrote a script to repeatedly apply antlr-format until a fixed point is achieved.

#

rm -rf foo
mkdir foo
pushd foo
cp ../PlSqlLexer.g4.7fbb97b PlSqlLexer.g4
i=1
while :
do
	echo Iteration $i
	cp PlSqlLexer.g4 PlSqlLexer.g4.before
	cp PlSqlLexer.g4 PlSqlLexer.g4.after
	dos2unix *.g4
	antlr-format PlSqlLexer.g4.after 2>&1 1> /dev/null
	dos2unix *.g4
	diff PlSqlLexer.g4.before PlSqlLexer.g4.after
	if [ $? -ne 0 ]
	then
		echo No fixed point yet.
	else
		echo Fixed point achieved.
		break
	fi
	cp PlSqlLexer.g4.after PlSqlLexer.g4
	i=`expr $i + 1`
done

We don't see the fixed point achieved for PlSqlLexer.g4 until 5 applications of antlr-format.
out.txt

What is more troubling is whether there are grammars that have no fixed point at all. In that case the tool would produce a new version on every run, ad infinitum.
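
To probe for that case without looping forever, the search can be bounded and can also watch for cycles (an output that repeats an earlier version). The following is a rough sketch under stated assumptions, not a definitive test harness: the function name find_fixed_point and its exit codes are hypothetical, and the formatter is assumed to rewrite the file given as its last argument in place, as antlr-format does in the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: repeatedly apply a formatter to a private copy of FILE.
# Usage: find_fixed_point FILE MAX_PASSES FORMATTER [ARGS...]
# The formatter is run as: FORMATTER [ARGS...] WORKCOPY  (assumed in-place).
# Exit status: 0 = fixed point, 1 = cycle detected, 2 = cap reached, 3 = setup error.
find_fixed_point() {
    local file=$1 max=$2
    shift 2
    local dir work i j
    dir=$(mktemp -d) || return 3
    mkdir "$dir/work" || return 3
    work="$dir/work/$(basename "$file")"   # keep the name so the extension survives
    cp "$file" "$work" || return 3
    cp "$work" "$dir/v0"                   # v0 is the unformatted input
    for ((i = 1; i <= max; i++)); do
        "$@" "$work" > /dev/null 2>&1
        # Unchanged relative to the previous version: a fixed point.
        if cmp -s "$dir/v$((i - 1))" "$work"; then
            echo "fixed point reached after $((i - 1)) changing pass(es)"
            rm -rf "$dir"
            return 0
        fi
        # Identical to any older version: the formatter is cycling.
        for ((j = 0; j < i - 1; j++)); do
            if cmp -s "$dir/v$j" "$work"; then
                echo "cycle: output of pass $i equals version $j"
                rm -rf "$dir"
                return 1
            fi
        done
        cp "$work" "$dir/v$i"
    done
    echo "no fixed point within $max passes"
    rm -rf "$dir"
    return 2
}
```

A call such as find_fixed_point PlSqlLexer.g4 20 antlr-format leaves the original file untouched. Keeping every intermediate version and comparing with cmp is slower than hashing, but it avoids any question of checksum collisions for the small number of passes involved.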
