Basic codegen for the declarations marked as nontemporal. Also, if the
base declaration in the member expression is marked as nontemporal,
lvalue for member decl access inherits nonteporal flag from the base
lvalue.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Unit tests: pass. 61027 tests passed, 0 failed and 728 were skipped.
clang-tidy: fail. Please fix clang-tidy findings.
clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.
Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml
clang/lib/AST/StmtProfile.cpp | ||
---|---|---|
777 | Can E actually be null here? | |
clang/lib/CodeGen/CGExpr.cpp | ||
3952 | Is the !E->isArrow() condition necessary here? Why not just propagate the non-temporality of the base? Similarly, what's the case in which the declaration is marked non-temporal and you *don't* want to trust that? | |
clang/lib/CodeGen/CGOpenMPRuntime.cpp | ||
11350 | This is effectively clearing the active non-temporal decls set. Is that okay (even for the non-SIMD directives?), or should the current set be copied? |
clang/lib/AST/StmtProfile.cpp | ||
---|---|---|
777 | No, remnants of the initial version. | |
clang/lib/CodeGen/CGExpr.cpp | ||
3952 | That's the main question. I try to mimic the behavior we currenty have in the codegen. If the lvalue for the pointer is marked as nontemporal, only loads/stores for the pointer itself are marked as nontemporal. Operations with the memory it points to are not marked as nontemporal. I'm trying to do the same. E.g., if have something like a.b->c and a is nontemporal, then only a.b = x; must be marked as nontemporal, but not a.b->c = x; | |
clang/lib/CodeGen/CGOpenMPRuntime.cpp | ||
11350 | Yes, it must be copied, thanks. |
clang/lib/CodeGen/CGExpr.cpp | ||
---|---|---|
3952 |
There is no such case. We can mark this->member as nontemporal or declref. The first check here checks if we have this->member marked as nontemporal, the second check just propagates the flag if the base is marked as nontemporal. |
Unit tests: pass. 61052 tests passed, 0 failed and 728 were skipped.
clang-tidy: fail. Please fix clang-tidy findings.
clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.
Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml
clang/lib/CodeGen/CGExpr.cpp | ||
---|---|---|
3952 | Okay. Then this should just be (BaseLV.isNontemporal() || CGM.getOpenMPRuntime().isNontemporalDecl(Field)). | |
clang/lib/CodeGen/CGOpenMPRuntime.cpp | ||
11350 | Okay. I see that you're now scanning the whole list, which is probably better than copying, but could you leave a comment here saying that that's how it works? I'd also suggest only pushing something onto the stack if you actually have any non-temporal clauses; it should be easy enough to remember in the RAII object whether you pushed. | |
clang/lib/CodeGen/CGOpenMPRuntime.h | ||
664 | Please add to this comment that the real set is the union of all the current stack elements, not just the innermost set. |
clang/lib/CodeGen/CGExpr.cpp | ||
---|---|---|
3952 | Still, a->c will be marked as nontemporal if a is nontemporal. Also, if we remove the check for the CXXThis in the condition, we can erroneously mark the instruction as nontemporal if we reference member of another base, which is nontemporal: struct S; extern S s; struct S { int a, b; void foo() { #pragma omp simd nontemporal(a) for (int i = 0; i < 10; ++i) s.a = 0; // Will be marked as nontemporal though it should not? } } s; |
Thanks. Could you add some tests that nontemporal directives aren't disturbed by either (1) nested nontemporal directives or (2) nested directives of some other kind?
clang/lib/CodeGen/CGExpr.cpp | ||
---|---|---|
3952 |
No, the base LValue we build in this case will not be related to the LValue for a; the pointer is loaded from that l-value, but its properties do not propagate to the pointer, just like how S * volatile s doesn't mean that s->x is volatile.
I see, this is specifically part of the semantics that the field only becomes non-temporal for this. Okay, I accept that part of the original condition. |
clang/lib/CodeGen/CGExpr.cpp | ||
---|---|---|
3952 | But please add some tests that you don't honor nontemporal on base expressions not derived from this. Does "wrapped" include looking through derived-to-base conversions? You should test that as well. |
Unit tests: pass. 61072 tests passed, 0 failed and 728 were skipped.
clang-tidy: fail. Please fix clang-tidy findings.
clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.
Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml
Can E actually be null here?