This is an archive of the discontinued LLVM Phabricator instance.

[globalisel][tablegen] Generate rule coverage and use it to identify untested rules
ClosedPublic

Authored by dsanders on Nov 7 2017, 10:21 AM.

Details

Summary

This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.

This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.

Usage:

  • Enable LLVM_ENABLE_GISEL_COV in CMake
  • Build the compiler and run some tests
  • cat gisel-coverage-[0-9]* > gisel-coverage-all
  • Delete lib/Target/*/*GenGlobalISel.inc*
  • Build the compiler

Known issues:

  • ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual step due to a lack of a portable 'cat' command. It should be the concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
  • There's no mechanism to discard coverage information when the ruleset changes

Depends on D39742

Event Timeline

dsanders created this revision.Nov 7 2017, 10:21 AM
vsk added a subscriber: vsk.Nov 7 2017, 10:39 AM
vsk added inline comments.
include/llvm/Support/CodeGenCoverage.h
28

Name this 'setCovered' to be more explicit?

lib/Support/CodeGenCoverage.cpp
30

Could you use llvm::SmartMutex<true> and make this a member of CodeGenCoverage?

84

llvm::SmartScopedLock<true>

97

This is starting to look like sys::fs::createUniqueFile(). It might be better to just use the existing utility.

utils/llvm-gisel-cov.py
20

Missing doc string?

dsanders added inline comments.Nov 7 2017, 11:17 AM
include/llvm/Support/CodeGenCoverage.h
28

Ok

lib/Support/CodeGenCoverage.cpp
30

Could you use llvm::SmartMutex<true>

Sure.

and make this a member of CodeGenCoverage?

I don't think so. It needs to be a process-wide mutex and if I put it in CodeGenCoverage then there will be separate mutexes for each subtarget. I could put it in as a static member but then anyone that includes the header will have to include all the mutex support too.

84

Ok

97

I'm not sure on this one, createUniqueFile() would solve the multiple-writer problem too but it goes a lot further than necessary. It's ok for the files to be re-used as the OS recycles process id's.

utils/llvm-gisel-cov.py
20

Well spotted. I've forgotten to add the usage information for this script.

dsanders updated this revision to Diff 121944.Nov 7 2017, 11:27 AM
dsanders marked 4 inline comments as done.

CodeGenCoverage::covered() -> setCovered()
Use SmartMutex and SmartScopedLock
Add missing docstring

vsk added a comment.Nov 7 2017, 11:40 AM

This looks good to me, but it'd be worth getting another +1.

rovka accepted this revision.Nov 7 2017, 10:44 PM

Awesome, I could use something like this. LGTM with a few nits.

Is the long-term intention to try and drive this coverage to 100% via in-tree tests, or rather via the test-suite? I'm asking because I was actually considering removing some of the arm-instruction-selector.mir tests which cover the same kind of pattern - e.g. ADD, SUB, AND, OR etc are all derived from an AsI1_bin_irs, so it should be enough to test one of them. We already have tests for the TableGen emitter, so each backend should only have acceptance tests, to make sure TableGen does the right thing for each kind of pattern that it's interested in. Having one test for each rule would just explode the number of tests to the point where they can only be managed automatically, which would really reduce my confidence in them (mostly because TableGen is quirky and I would expect whatever edge cases are handled incorrectly in the emitter to also be handled incorrectly in whatever automatic test generator we'd derive with TableGen).

cmake/modules/TableGen.cmake
59

"use-gisel-coverage" sounds more like a true/false option. How about "gisel-coverage-file" or something along those lines?

include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h
682

I'm not a compiler, but this looks like it should be setCovered :)

include/llvm/Support/CodeGenCoverage.h
2

The path seems out of date here.

lib/Support/CodeGenCoverage.cpp
2

Wrong path.

utils/TableGen/GlobalISelEmitter.cpp
74

I'm not convinced this is a good default, since it ends with a '-'

75

This should be in the GlobalISelEmitterCat.

This revision is now accepted and ready to land.Nov 7 2017, 10:44 PM

Is the long-term intention to try and drive this coverage to 100% via in-tree tests, or rather via the test-suite?

Awesome, I could use something like this. LGTM with a few nits.

Is the long-term intention to try and drive this coverage to 100% via in-tree tests, or rather via the test-suite? I'm asking because I was actually considering removing some of the arm-instruction-selector.mir tests which cover the same kind of pattern - e.g. ADD, SUB, AND, OR etc are all derived from an AsI1_bin_irs, so it should be enough to test one of them. We already have tests for the TableGen emitter, so each backend should only have acceptance tests, to make sure TableGen does the right thing for each kind of pattern that it's interested in. Having one test for each rule would just explode the number of tests to the point where they can only be managed automatically, which would really reduce my confidence in them (mostly because TableGen is quirky and I would expect whatever edge cases are handled incorrectly in the emitter to also be handled incorrectly in whatever automatic test generator we'd derive with TableGen).

I'd definitely be interested in knowing what the statistics show today on coverage of regression tests vs test-suite. My expectation is that we actually have higher coverage via the regression tests rather than the test-suite, since the regression tests try harder to cover the paths of logic in the compiler.
My guess is that trying to get 100% code coverage via the test-suite is not going to work. If we'd need to get 100% coverage, I guess it needs to be done via the regression tests. But as you say - if they are going to be mostly automatically generated and not manually inspected, there is a lot of scope for the tests to be wrong.

dsanders marked 10 inline comments as done.Nov 8 2017, 1:21 PM

Awesome, I could use something like this. LGTM with a few nits.

Is the long-term intention to try and drive this coverage to 100% via in-tree tests, or rather via the test-suite? I'm asking because I was actually considering removing some of the arm-instruction-selector.mir tests which cover the same kind of pattern - e.g. ADD, SUB, AND, OR etc are all derived from an AsI1_bin_irs, so it should be enough to test one of them. We already have tests for the TableGen emitter, so each backend should only have acceptance tests, to make sure TableGen does the right thing for each kind of pattern that it's interested in. Having one test for each rule would just explode the number of tests to the point where they can only be managed automatically, which would really reduce my confidence in them (mostly because TableGen is quirky and I would expect whatever edge cases are handled incorrectly in the emitter to also be handled incorrectly in whatever automatic test generator we'd derive with TableGen).

Via the test-suite and any other testing people generally run to qualify a release. I don't think we should try to cover everything with the in-tree tests for the same reasons you give. There is an optimization to the table I'd like to have that would make it a bit more feasible to have high coverage with in-tree tests (I want very-similar imported rules to share an implementation in the generated table) but even then I don't think we should aim for 100% coverage with in-tree tests..

include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h
682

I'm not sure how I missed that.

utils/TableGen/GlobalISelEmitter.cpp
74

That's a good point. It made more sense before I fixed the multiple-writers problem.

dsanders updated this revision to Diff 122147.Nov 8 2017, 1:23 PM
dsanders marked 2 inline comments as done.

Fix covered() -> setCovered() change that was somehow missed, and the other nits.

dsanders closed this revision.Nov 15 2017, 4:46 PM
EricWF added a subscriber: EricWF.Nov 18 2017, 1:04 PM
EricWF added inline comments.
lib/CodeGen/GlobalISel/InstructionSelect.cpp
23

config.h isn't an installed header, so this breaks the use of installed LLVM versions.

Could you please revert or fix?