Page MenuHomePhabricator

[analyzer] Drastically simplify the tblgen files used for checkers
ClosedPublic

Authored by Szelethus on Nov 1 2018, 1:54 PM.

Details

Summary

TL;DR: Interestingly, only about the quarter of the emitter file is used, the DescFile entry hasn't ever been touched [1], and the entire concept of groups is a mystery, so I removed them.

Details:

Now that I've spent quite a lot of time digging through how the analyzer is invoked, specifically analyzer configurations, I wanted to make sense out of the (in my opinion) somewhat outdated checker option issue, and more generally, checker registration.

@NoQ mentioned on the mailing list [2] that maybe it'd be worth converting the tblgen solution to a .def file, which makes a lot of sense:

  • As far as I know, tblgen is mainly used by actual LLVM backend developers, and folks on the Static Analyzer may not be that familiar with the language.
  • We wouldn't need to run tblgen to see what the file actually will turn into, that is a feature I'd really appreciate.
  • Somewhat related to the earlier point, but unless you spend 10-20 minutes figuring out where checkers come from, it would seem they are created out of thin air. Changing to a .def format removes an unnecessary complication.

There are some cons though:

  • tblgen files look awesome. I get that this isn't a very strong point, but it's a pretty sight compared to a .def file.
  • It would probably imply very invasive changes in a lot of files doing the checker registration, because there's a lot of work the emitter does, that would ideally need to be handled inside the analyzer.

I did this change, so that I had less complications to deal with, should we agree that a change to .def is a good idea.

[1] http://lists.llvm.org/pipermail/cfe-dev/2018-October/059664.html
[2] http://lists.llvm.org/pipermail/cfe-dev/2018-October/059976.html

Diff Detail

Repository
rL LLVM

Event Timeline

Szelethus created this revision.Nov 1 2018, 1:54 PM
xazax.hun accepted this revision.Nov 2 2018, 5:09 AM

LGTM, but let's wait for @NoQ before committing.

This revision is now accepted and ready to land.Nov 2 2018, 5:09 AM
Szelethus updated this revision to Diff 172533.Nov 4 2018, 3:04 PM
  • Came to the shocking realization that Hidden is also an unused property, so I removed that too.
  • Added comments to CheckerBase.td.
  • Added missing // end "packagename" comments

Now that CheckerBase.td isn't that cryptic, maybe it isn't worth pursuing the .def file conversion -- it probably would take a lot of effort for little value. My main issue was the maintenance of these tblgen files, but not that they have been simplified, I think it's fine to leave as is.

Szelethus updated this revision to Diff 172534.Nov 4 2018, 3:05 PM
Szelethus set the repository for this revision to rC Clang.
NoQ accepted this revision.Nov 9 2018, 2:59 PM

Thx!! Burn it.

tblgen files look awesome. I get that this isn't a very strong point, but it's a pretty sight compared to a .def file.

I agree that it's aesthetically satisfying, but it is (1) inconvenient to write because that's the whole new syntax you need to memorize, i.e. all those <> and semicolons and (2) not cooperating when i want to look up checker name by source file (have to scan a few pages up to see the alias for the package and then jump multiple times to see what it expands to). So i'm still for ditching tblgen eventually :) But i definitely don't insist.

I agree that it's aesthetically satisfying, but it is (1) inconvenient to write because that's the whole new syntax you need to memorize, i.e. all those <> and semicolons and (2) not cooperating when i want to look up checker name by source file (have to scan a few pages up to see the alias for the package and then jump multiple times to see what it expands to). So i'm still for ditching tblgen eventually :) But i definitely don't insist.

It is also a very powerful tool, and once you figure out how it works (which obviously can't be expected every time it has to be touched), it does its job wonderfully. The issue with recreating this by abusing the preprocessor is that it leads to

  • Very cryptic errors messages while developing
  • A ton of warnings if by accident it's commited with a flaw that a platform will pick up (I managed to generate 976! And that's not counting non-windows platforms)
  • Arguably harder to comprehend code, that either "just works", or will be the nightmare of the ages to fix

Like many other things around here, tblgen usage is severely underdocumented, and if the greatest drawback of not understanding how it works could be fixed (via even more documentation), I believe we'd appreciate this tool a lot more. I'm more than willing to work on this a bit more, as I already chew through most of the related files in the last couple weeks.

This revision was automatically updated to reflect the committed changes.