This patch adds OMPIRBuilder support for the simd directive (without any clause). This will be a first step towards lowering simd directive in LLVM_Flang. The patch uses existing CanonicalLoop infrastructure of IRBuilder to add the support. Also adds necessary code to add llvm.access.group and llvm.loop metadata wherever needed.
Details
Diff Detail
Event Timeline
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h | ||
---|---|---|
520 | I believe LLVM uses American English. |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
---|---|---|
2151 | Properties is unused in the function? |
Thanks for helping to complete the OpenMPIRBuilder implementation!
clang/lib/CodeGen/CGStmtOpenMP.cpp | ||
---|---|---|
2618–2619 | [nit] Unnecessary whitespace | |
2619 | Did you consider adding a check whether OpenMPIRBuilder actually implements all features requested before using it? See isSupportedByOpenMPIRBuilder. | |
2626 | [style] According to the LLVM coding standard, auto is only used in specific cases. | |
clang/test/OpenMP/simd_codegen_irbuilder.cpp | ||
16 ↗ | (On Diff #388954) | Could you use regexes to match the virtual register names? The script update_cc_checks.py may help with this, see the other tests. The naming scheme for files testing the OpenMPIRBiulder is usually irbuilder_<constructs> |
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h | ||
524 | Capitalizing SIMD leads to long runs of capital letters. While not an LLVM coding style rule, it is still common to title-case acronyms in identifiers if the acronym is ~4 letters or more, such as OMPSimdDirective instead of OMPSIMDDirective. For methods transformation CanonicalLoopInfo, we don't use createXYZ naming scheme which is reserved for inserting something new. Suggestions: applySimd, vectorizeLoop, or simdizeLoop. | |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
2150 | The LLVM coding style still capitalizes parameters and variables. | |
2153 | The property also applies to any memory access, e.g. memset. You could use mayWriteToMemory() and mayReadFromMemory(). | |
2156 | [serious] The access group must be unique only to the loop being accessed. | |
2187–2198 | [serious] header and cond don't contain any memory accesses. body may consist of multiple basic blocks, of which this only applies to the first one. |
clang/lib/CodeGen/CGStmtOpenMP.cpp | ||
---|---|---|
2618–2619 | I took a look at isSupportedByOpenMPIRBuilder() and seems like it is more suitable for a check against work-sharing loop construct, not for simd. And for simd, I am not sure what features I should be checking for. One check I can think of is checking whether any of the not-yet-supported clauses is present in the simd directive or not. Am I on the right track? | |
2626 | I am inspired by existing code though |
clang/lib/CodeGen/CGStmtOpenMP.cpp | ||
---|---|---|
2618–2619 | Indeed, isSupportedByOpenMPIRBuilder is for the workshoring-loop, you could introduce another one for simd. Or modify it to accept OMPExecutableDirective as an argument. This might be useful for combined constructs such as for simd. Best approach IMHO is to whitelist clauses. Reject any clause you did not test/add support yet. This way we can fallback to the non-OpenMPIRBuilder codegen and not break things that worked without -fopenmp-enable-irbuilder (a lot of code already does not work -fopenmp-enable-irbuilder, so don't put too much effort into getting the interaction between OpenMPIRBuilder and current codegen to work; but we need an overview what is already supported and which clauses are still require some work) | |
2626 | There already is a lot of code violating the coding standard, either because the coding standard changed or it slipped through a review. A prominent example is that the methods of the IRBuilder are still PascalCase, although the current standard says they should use camelCase. Same applies to SourceLocToDebugLoc itself (which every time I see it makes me think SourceLocToDebugLoc is a type) Not a reason to not use the newest coding standard for new code. |
Addressing reviewers comments Major changes are as follows:
- Skipping unsupported clauses and skip the case when there is an ordered directive inside the simd construct
- Traversing all the blocks in the Canonical loop body (not only the first block of the loop body) to look for locations to insert metadata
- Making metadata unique for each Canonical loop
- Update test case to reflect changes.
clang/lib/CodeGen/CGStmtOpenMP.cpp | ||
---|---|---|
2590–2596 | I think it would be better to list the clauses that are supported. In addition to being shorter, it would also reject any clause we forgot about and new ones. | |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
2152 | [style] According to the LLVM coding standard, auto is only used in specific cases. | |
2154 | [style] Please use the current LLVM coding style. | |
2156 | [serious] This does not create a LoopID. A LoopID is a bag of properties where the first element points to itself (which automatically makes then distinct). E.g. !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}} where !1 is an access group (eg !1 = distinct !{}). [serious] The instruction's metadata should not be a LoopID, but a access group (eg !1 for the above example). I think you should not pass an array of properties, but just the access group MDNode. | |
2181 | [serious] The llvm.loop.parallel_accesses property takes an argument: the access group. | |
2186–2187 | [Style] Please use the current LLVM coding standard. | |
2211 | [serious] Using DominatorTree to check whether we passes Exit is redundant: For any block that is dominated by Exit (by definition of dominator) we will have to visited Exit first anyway. Therefore not adding Exit itself would be sufficient. You actually don't want paths that go to exit, but the basic blocks that come from Body (the successor of Cond) to the Continue/Latch block. By the definition of CanonicalLoopInfo, any block reachable from Body can only go to Continue[1], so it would be sufficient to iterate everything reachable from Body, including Body itself, until we hit the Latch. Header and Cond themselves are not allowed to contain any memory accesses. The safest way to determine the set of blocks inside the loop would be using LoopInfo, which internally uses DominatorTree, but for checking dominance with the header, not the exit. [1] There are exceptions: paths can end in an unreachable instruction and I am currently working on defining what happens at cancellation[2]. There will also be a BasicBlock like Latch/Continue that signals the end of the loop. In both cases these still contain blocks that are not inside the loop in LoopInfo's sense. However, marking instructions in them with access groups should is not a problem since the access groups marker only has an effect if inside a loop that has the llvm.loop.parallel_accesses property set to that access group. [2] I found interesting cases in which Clang re-uses the same basic blocks (cleanup block for destructors and exception handling) for multiple paths by using a PHINode and a switch to branch to the scope it came from. That is, if part of a loop it would always branch back to another block in the loop, but statically any other switch target would be reachable as well. Not sure yet whether this will be a problem that requires us to use LoopInfo. |
Thanks @Meinersbur for the comments. Addressing reviewers comments:
- Update code to make the matadata addition correct, as per reviewers comments.
- Using LoopInfo to identify BasicBlocks to iterate through.
- Updating the test case.
- Following coding standard.
[testing] Could you add a test for applySimd into OpenMPIRBuilderTests.cpp, so we have a test that is independent of Clang?
clang/test/OpenMP/irbuilder_simd.cpp | ||
---|---|---|
4 | The invocation is identical to the RUN on line 1. CHECKTWOLOOPS is not needed and can also be made as CHECK. | |
5 | A single expected-no-diagnostics is sufficient | |
68–70 | If you added these CHECK lines by hand, remove those that are not relevant | |
72 | Do not add a regex specification for any but the first occurance. A regex specification will "overwrite" the previous match, i.e. the different occurrences of META4 would not need to be the same. | |
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | ||
2149 | ||
2153 | ||
2154 | The instruction may already have an access group assigned e.g. from a nested #pragma clang loop vectorize(assume_safety). This would overwrite the access group. See LoopInfoStack::InsertHelper (CGLoopInfo.cpp) on how to assign multiple access groups, or add a TODO. | |
2188 | Compiler warning: warning C4189: 'DT': local variable is initialized but not referenced | |
2194 | OpenMPIRBuilder is part of the llvm namespace, llvm:: is not necessary. | |
2198 | Could you add a todo note such as: // TODO: Generalize getting all blocks inside a CanonicalizeLoopInfo, preferably without running any passes. | |
2205–2207 | [style] | |
2213 | The loop might already have a llvm.loop.parallel_accesses, in which case those two lists could be combined. I don't think its very relevant, but maybe add a TODO? |
LGTM. Thanks for the patch.
llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp | ||
---|---|---|
1696 | It would also be possible to use any_of instead of a loop. |
llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp | ||
---|---|---|
1691–1696 | Can make it a one-liner. |