Detect requires expressions in more unusable contexts. This is far from
perfect, but currently we have no good metric to decide between a
requires expression and a trailing requires clause.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/Format/UnwrappedLineParser.cpp | ||
---|---|---|
2824–2828 | s/doesn't/don't/ | |
2861–2867 | I think it's weird that your heuristic parses backward rather than forward. I would think that the next token after the requires keyword tells you what it is with pretty high probability: Or are those heuristics already present in trunk, and this PR is just dealing with the "unclear" case? | |
2908 |
clang/lib/Format/UnwrappedLineParser.cpp | ||
---|---|---|
2861–2867 | That would be so much better, but I can't easily look forward. Next is still nullptr, until I call nextToken(), but then I'm already moved along. But this got me thinking, at least for the easy stuff I can just go forward and don't start on the keyword in parseRequiresClause() and parseRequiresExpression(). The paren case is more tricky, but I will try something. |
clang/lib/Format/UnwrappedLineParser.cpp | ||
---|---|---|
2861–2867 | Present in main everything is a clause, except for requires expressions in a constraint expression. So the stuff where you use the requires expression in a "normal" boolean expression are misparsed and thus most likely misformatted. There is actually a peekToken(), let's see if this is better. |
Hey ho, sorry for the late comment here, but adding peekNextToken(n) is problematic, as this gets in the way of future changes we want to do to handle macros better.
Usually we want to use X = Tokens->getPosition() and FormatTok = Tokens->setPosition(X) pairs when doing look-ahead.
I did a quick attempt at fixing this, but ran into infinite loops later in the annotator :(
Generally, why do we need to have that much information? I.e. why do we need to know the exact type of the "requires" keyword?
I do understand we need to know the brace type, but that seems like it would be easier to figure out in the TokenAnnotator (where we already parsed UnwrappedLines).
Do we ever parse UnwrappedLines differently depending on requires clauses/expressions?
If not, we should really do the annotation in TokenAnnotator, where we already have nice parsing bounds from the parsed UnwrappedLines.
Who is we, I'm not part of that we and haven't heard of some macro improvements. And I don't see how that feature is harming you, but be my guest in changing that. If you look into the history of this change I had a heuristic approach which would only look behind to differentiate.
I don't know if that can be solved in the TokenAnnotator, but you and I have different opinions about that. I'd put more annotating in the UnwrappedLineParser, annotate it as soon as we can.
I'll happily review any changes proposed, but I will not rework this piece of code, unless I can see a big flaw in it (which I can't right now).
I changed it in 49aca00d63e14df8bc68fc4329e6cbc9c9805eb8.
"We" is the people working on clang-format :) I hope that we have a common goal of making clang-format as easy to maintain as we can.
FWIW, I once had the same opinion as you about best doing all parsing as early as possible, but djasper convinced me that the split was a good idea, and in the end, I think it turns out to be significantly less brittle to do more complex annotation in TokenAnnotator. E.g. we now have a lookahead limit of 50, which seems rather arbitrary, while in TokenAnnotator we could simply limit lookahead towards the current UnwrappedLine. Similarly, in TokenAnnotator, we already have all the parens connected, so we could simply look from requires l_paren to the corresponding r_paren and whether the next token is an l_brace. If I can find a bit of time I'll take an attempt at implementing it.
Your commit is in my view a an example of making that maintaining a bit harder, it didn't went through review, had you not posted it here I'd never seen it. LLVM receives to many commits to scan them for changes in clang-format. And as someone who isn't that long involved in clang-format I think there is an overview really missing.
For non-functional clean-ups generally llvm doesn't require pre-commit review - I did communicate here so people involved in the original change wouldn't miss the clean-up. I do agree that what commits to pre-review is a fine line, and usually try to err on the side of pre-review; I'll take your feedback into consideration for future changes.
Regarding a better overview, you're 100% right. This is something we've definitely not been good enough and we need to get better at.