This adds ~27 more runtime calls to the OpenMPKinds.def file, all with
attributes. We deduplicate 16 of those automatically in function =
thread scope. And we annotate all of them automatically during the
OpenMPOpt discovery step. A test with all omp_XXXX runtime calls to
track annotation coverage is included.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 40667 Build 40792: arc lint + arc unit
Event Timeline
Some comments on the verbosity of the patch. Definitely like the effect - deduplicating calls where legal is a great result.
On second thoughts - should these attributes be in the openmp header, instead of inserted by the compiler?
llvm/include/llvm/IR/OpenMPKinds.def | ||
---|---|---|
221 | The repeated expression AttributeSet(EnumAttr(ReadOnly), EnumAttr(NoUnwind), EnumAttr(NoFree), EnumAttr(NoSync)) is probably worth factoring out into a named variable. Starting to approach the point where a tablegen style generator is worthwhile, but I don't think it's there yet | |
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
106 | for (auto op : {OMPRTL_omp_get_thread_num, OMPRTL_omp_get_num_threads,..}) { Changed |= deduplicateRuntimeCalls(*F, RFIs[op]); }``` ? | |
llvm/test/Transforms/OpenMP/add_attributes.ll | ||
1 | Is there a standard in-tree way of deduplicating stuff like this? It could be a hash of function name to expected attribute, with a generator that writes the IR. My last backend generated many thousands of lines of IR from much shorter python scripts. Maybe the in tree equivalent is tablegen? |
Agreed, on both.
On second thoughts - should these attributes be in the openmp header, instead of inserted by the compiler?
Yes, that would be great. We are actually working on that, see our HTO optimization work @wsmoses presented at LLVM-Dev'19.
llvm/include/llvm/IR/OpenMPKinds.def | ||
---|---|---|
221 | I thought about it and I might just use another macro for it. I'll update once I figured it out. | |
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
106 | Looks better, will do. | |
llvm/test/Transforms/OpenMP/add_attributes.ll | ||
1 | Can you rephrase this comment, I'm not sure I get it. (btw. this is for attributes not deduplication) |
llvm/test/Transforms/OpenMP/add_attributes.ll | ||
---|---|---|
1 | Sure! Some collision of terminology here. This patch is to add attributes to some, but not all functions. So the test case checks that some functions have gained specific function attrs, and that other functions still have no attrs. The (de)duplication I refer to is the extra ascii characters in the IR test, not removing duplication function calls. The format of textual IR means the signal to noise ratio is rather worse than, say, { "void @omp_set_num_threads(i32)" : (nofree, nosync, nounwind, writeonly,), "i32 @omp_get_team_size(i32)" : (nofree, nosync, nounwind, readonly,), "i32 @omp_get_max_task_priority()" : (), } or the more concise, # name nofree nosync nounwind readonly writeonly { "omp_set_num_threads" : "x x x _ x", "omp_get_team_size" : "x x x x _", "omp_get_max_tax_priority : "_ _ _ _ _", } The denser table is likely to be easier to scan for errors but can't be directly fed into LLVM, so a dense table + a trivial conversion to IR allows both. Iff there is an established way to do that sort of thing within LLVM. I did lots of this style of generated test for an out of tree target, in very ad hoc python, and wondered if there's an approved equivalent in the open source. |
llvm/test/Transforms/OpenMP/add_attributes.ll | ||
---|---|---|
1 | I am unsure why the check lines in the end are bad. They are like your first solution but in 2 instead of one line. That being said, I'm open to suggestions on how to improve this ;) |
llvm/test/Transforms/OpenMP/add_attributes.ll | ||
---|---|---|
520 | The checks require some cleanup, I think |
The repeated expression
is probably worth factoring out into a named variable.
Starting to approach the point where a tablegen style generator is worthwhile, but I don't think it's there yet