This is an archive of the discontinued LLVM Phabricator instance.

Injection of kernel features into the LLVM IR during the OpenMP transform stage
AcceptedPublic

Authored by ksidorov on Jun 6 2021, 10:21 AM.

Details

Summary

This patch contains:

  • the code for retrieving & injecting features of the kernel into the LLVM IR as the global [* x i64] variable,
  • the regression test that checks the injected feature vectors for (a) empty kernel and (b) matrix multiplication kernel.

Diff Detail

Event Timeline

ksidorov created this revision.Jun 6 2021, 10:21 AM
ksidorov requested review of this revision.Jun 6 2021, 10:21 AM
Herald added a reviewer: baziotis. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript

Some minor thing but overall this looks sensible.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
578

We need a command line option to guard this. Doing it always is (for now) not helpful.

1515–1516

Or use LLVM_DEBUG({ ... }); to avoid two consecutive ones.

1521

Early exits are preferred. if (!FunctionInfo) { debug; continue; } will make it easier to read.

ksidorov updated this revision to Diff 350470.Jun 7 2021, 7:32 PM
  • Command line option -openmp-inject-kernel-features for turning on the feature injection.
  • Refactoring + formatting issues.

squash the commits so that we see the diff not from the last change but the entire change set

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
1499

No else.

ksidorov updated this revision to Diff 350763.Jun 8 2021, 7:34 PM

Squash of diff versions

jdoerfert accepted this revision.Jun 8 2021, 7:58 PM

LG, one nit. Once the nit is resolved I can commit this for you.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
1522

Address the warning, maybe just (void) FeatureVector;

This revision is now accepted and ready to land.Jun 8 2021, 7:58 PM
ksidorov updated this revision to Diff 351765.Jun 13 2021, 8:03 PM
ksidorov marked 4 inline comments as done.

Code review comments: (void) + clang-format issues

Good feature, but using an array for this looks likely to be error prone. Do we need to iterate over it somewhere? Order of fields seems likely to end up repeated in a few places.

I think I'd prefer N distinct scalars with names like multiply_kernel.KernelFeature.MaxLoopDepth. Might make the test cases more legible too.

Good feature, but using an array for this looks likely to be error prone. Do we need to iterate over it somewhere? Order of fields seems likely to end up repeated in a few places.

I think I'd prefer N distinct scalars with names like multiply_kernel.KernelFeature.MaxLoopDepth. Might make the test cases more legible too.

When we read out the values it is much more reasonable to do so in one go. More features are on the way, reading them out one by one will cost too much time to make it practical.
With the usage and printing in the runtime D109882 this should make more sense.

That said, we should even merge other globals into a single one, incl. exec_mode.

ksidorov updated this revision to Diff 387065.Nov 14 2021, 4:37 AM
ksidorov marked an inline comment as done.

Aligned the order of kernel features to the order in the struct definition