-
Notifications
You must be signed in to change notification settings - Fork 81
Add string format tools to library. #2373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2373 +/- ##
=======================================
Coverage 47% 47%
- Complexity 6560 6571 +11
=======================================
Files 780 780
Lines 64398 64398
Branches 9628 9628
=======================================
+ Hits 30509 30517 +8
+ Misses 31569 31555 -14
- Partials 2320 2326 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! Just a few remarks and questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pinging @jurgenvinju since he has in the past done some more stuff around formatting & indentation.
I have the feeling we're missing something in this PR, a feature we already have in rascal. (aside from the fact I don't really like all these regexps and char loops)
} | ||
@synopsis{Split a string to an indentation prefix and the remainder of the string.} | ||
tuple[str indentation, str rest] splitIndentation(/^<indentation:\s*><rest:.*>/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if the string contains multiple lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that should be an explicit exception (invalid argument or similar)
int count = size(findAll(input, nl)); | ||
linesepCounts[nl] = count; | ||
// subtract all occurrences of substrings of newline characters that we counted before | ||
for (str snl <- substrings(nl), linesepCounts[snl]?) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this almost looks like pattern matching on strings? (which we only have reasonable support over)
for example:
rascal>visit("abcd") { case str m : println(m); }
abcd
bcd
cd
d
come to think of it, this whole function smells like an parsing automata. Where we build a big state table of all the possible matches and then iterate through all the chars and count the matches based on their state.
In java this would be 20/30 lines, but in rascal we might be missing some primitives (as we don't have a character loop).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we hard-code the set of newline characters (e.g. to all Unicode newline chars), we could write it as a grammar and use the parser generator. Downside (as we discussed) is that all (transitive) imports of this module will trigger generation of a parser. We could also move some of those to a specific Format
module.
fa8fe8b
to
836f117
Compare
836f117
to
8d8de3e
Compare
This PR adds many generic tools that can be used to process/format strings.
TODO