This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Handle Verilog preprocessor directives
ClosedPublic

Authored by sstwcw on May 1 2022, 8:04 PM.

Details

Summary

Verilog uses the backtick instead of the hash. In this revision
backticks are lexed manually and then get labeled as hashes so the logic
for handling C preprocessor stuff don't have to change. Hashes get
labeled as identifiers for Verilog-specific stuff like delays.

Diff Detail

Event Timeline

sstwcw created this revision.May 1 2022, 8:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 1 2022, 8:04 PM
sstwcw requested review of this revision.May 1 2022, 8:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 1 2022, 8:04 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
sstwcw edited the summary of this revision. (Show Details)May 1 2022, 8:23 PM

This revision depends on D124748, but somehow it doesn't show up when I open the parent revision dialog.

clang/lib/Format/FormatTokenLexer.cpp
1105

Consider a raw string, for a better reading.

1129

Otherwise I don't see why you change that.

You add significant number of keywords here but I don't see any of them being tested? can you add a unit tests to cover what you are adding

sstwcw updated this revision to Diff 429532.May 15 2022, 6:05 AM
  • add tests and remove LINE and FILE special cases
sstwcw added inline comments.May 15 2022, 6:19 AM
clang/lib/Format/FormatTokenLexer.cpp
1105

You mean like this?

static const llvm::Regex VerilogToken(R"re(^('|``?|\\(\\)re"
                                      "(\r?\n|\r)|[^[:space:]])*)");
1129

The function is only supposed to be used for Verilog, so && is correct. If you mean why this part is different from D121758, I changed the name from readRawTokenLanguageSpecific to readRawTokenVerilogSpecific as suggested by a reviewer. Then I moved the check for language out of that function like what tryParseJSRegexLiteral does.

The two parents of this revision change the same file, so the build bot says patch does not apply. Does that mean I have to submit the parent patches with less context?

The two parents of this revision change the same file, so the build bot says patch does not apply. Does that mean I have to submit the parent patches with less context?

Maybe just wait until one of the parents has landed. We can make a review without the bot. :)
Or put the parents in relation.

You add significant number of keywords here but I don't see any of them being tested? can you add a unit tests to cover what you are adding

I think this is still open?

clang/lib/Format/FormatTokenLexer.cpp
1105

I'd put it in one line, but basically yeah.

1129

I think I missed the parens.

1131

As demonstrated by me, the parens can go under in a quick read.

sstwcw marked 5 inline comments as done.Jun 1 2022, 6:35 PM
sstwcw added inline comments.
clang/unittests/Format/FormatTestVerilog.cpp
179

You add significant number of keywords here but I don't see any of them being tested? can you add a unit tests to cover what you are adding

I think this is still open?

The keywords added in this revision are tested here. Most of those added in D123450 don't have tests, but I haven't added stuff to handle them yet.

sstwcw updated this revision to Diff 433614.Jun 1 2022, 6:36 PM
  • use raw string for regex
  • use default style in test
  • remove parentheses
  • use seek now that skipOver no longer exists
clang/lib/Format/UnwrappedLineParser.cpp
1783

Why did you remove that if?

1800

And now the indentation here is off.

sstwcw marked 2 inline comments as done.Jun 2 2022, 2:40 AM
sstwcw added inline comments.
clang/lib/Format/UnwrappedLineParser.cpp
1783

Previously the if checked whether the line is comment and an identifier. For Verilog it's comment, a backtick, and an identifier. I found it easier to break if the line doesn't match the pattern.

sstwcw marked an inline comment as done.Jun 2 2022, 2:42 AM
sstwcw updated this revision to Diff 433702.Jun 2 2022, 2:43 AM
  • add comment and format
clang/lib/Format/FormatTokenLexer.cpp
1109

Add braces. Maybe check all other added ifs too. You can check with clang-format with InsertBraces and RemoveBracesLLVM.

clang/lib/Format/UnwrappedLineParser.cpp
1785
1805

I find just scopes a bit confusing. Maybe put the check if we want to break into a lambda and call that?

sstwcw updated this revision to Diff 435788.Jun 9 2022, 8:50 PM
  • add kw_ prefix to symbols treated as keywords
  • use lambda instead of breaking
sstwcw marked 3 inline comments as done.Jun 9 2022, 8:55 PM
sstwcw added inline comments.
clang/lib/Format/FormatToken.h
1172

Now that I added kw_ to verilogHashHash, four columns can't fit.

HazardyKnusperkeks added inline comments.
clang/lib/Format/FormatToken.h
981

It seems a comment before is missing. Should be fixed in a separate patch.

1172

Happens. :)

This revision is now accepted and ready to land.Jun 13 2022, 3:48 AM
This revision was automatically updated to reflect the committed changes.