LDARX and LWARX sometimes gets optimized out by the compiler
when it is critical to the correctness of the code. This inline asm generation
ensures that it preserved.
Details
- Reviewers
nemanjai - Group Reviewers
Restricted Project - Commits
- rGf1aca5ac96eb: [PowerPC] Fix L[D|W]ARX Implementation
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/include/llvm/IR/IntrinsicsPowerPC.td | ||
---|---|---|
1571 | Just curious, compiler optimizes the instruction out even setting IntrArgMemOnly and IntrReadMem here? |
llvm/include/llvm/IR/IntrinsicsPowerPC.td | ||
---|---|---|
1571 | Yes. Well, those two more or less tell it to go ahead and optimize it out if there are no uses. What is more surprising is that even with IntrHasSideEffects. Heck, it even optimizes with all of: IntrReadMem IntrArgMemOnly IntrHasSideEffects Throws IntrNoDuplicate IntrNoMerge I certainly found that surprising but I assume there is some justification as to why that is. There is some precedent for producing inline asm for builtins (see EmitX86BitTestIntrinsic() and similar). I don't know why that intrinsic was implemented using inline asm (commit messages don't say) but I presume it might have been for a similar reason. |
looking at the failing test cases, for example ./clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vadd.c, which are not test cases compiled for PPC, I'm seeing this following error:
clang: /home/conanap/llvm/ccom/llvm-project/llvm/lib/IR/Instructions.cpp:494: void llvm::CallInst::init(llvm::FunctionType *, llvm::Value *, ArrayRef<llvm::Value *>, ArrayRef<llvm::OperandBundleDef>, const llvm::Twine &): Assertion `(i >= FTy->getNumParams() || FTy->getParamType(i) == Args[i]->getType()) && "Calling a function with a bad signature!"' failed.
Debug prints show that it's entering the code block emitLoadReserveIntrinsic and BuiltinID evaluates to clang::PPC::BI__builtin_ppc_lwarx. The FTy dump shows i32 (i32 *), which is correct... I'm not exactly sure what to make of this yet, just putting this update here.
clang/lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
997 | Maybe rename to emitPPCLoadReserveIntrinsic should be more appropriate, since other targets also have similar load-reserve/load-linked/load-acquire notions and what this function does is ad hoc to PPC. |
clang/lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
1015 | Adding default: llvm_unreachable would be nice. |
I believe that your failing test case is because you are attempting to emit code for these builtins in the target independent code and the values just happen to match up.
clang/lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
997 | +1 | |
1000 | Nit: please follow naming conventions. Variables are to be capitalized. | |
1015 | +1 | |
5146 | Why? That doesn't really make sense to me. | |
llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-LoadReserve-StoreCond-64bit-only.ll | ||
18–19 | This is not the asm that the front end generates. Why would you generate one thing in the front end and then test a different thing in the back end? |
llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-LoadReserve-StoreCond-64bit-only.ll | ||
---|---|---|
18–19 | I'm not quite sure what you mean by this; the IR output is taken from the .c test case above. |
llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-LoadReserve-StoreCond-64bit-only.ll | ||
---|---|---|
18–19 | I don't think that is the case. |
llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-LoadReserve-StoreCond-64bit-only.ll | ||
---|---|---|
18–19 | ah I see I missed the modifier. |
Maybe rename to emitPPCLoadReserveIntrinsic should be more appropriate, since other targets also have similar load-reserve/load-linked/load-acquire notions and what this function does is ad hoc to PPC.