The vector being loaded might not be in address space 0. In this case the vldn/vstn call creation will trigger an assert in CallInst:Create, because the argument type and the parameter are in different address spaces. Resolve this by adding an appropriate address space cast.
Details
Diff Detail
Event Timeline
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11827 | This is interesting.. We ideally wouldn't want to introduce address space casts when not necessary (I can imagine some users using the address space qualifier to help with alias analysis). We should be able to get a variant of the vldn intrinsic that operates on the correct address space. |
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11827 | Maybe. I would need some pointers on how to change the code in that case, because this would require changes to IntrinsicsARM.td, which I'm not very familiar with. I'm also wondering: is alias analysis information still being used during instruction lowering or later? |
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11827 | I've looked into this and changing IntrinisicsARM.td should be fairly simple, all that's needed is to replace in definitions like int_arm_neon_vld2 the llvm_ptr_ty argument with a llvm_anyptr_ty. However it turns out that this changes the intrinisic name mangling (which might be a problem with backward compatibility) and also requires clang changes. This is of course a bit complicated.. |
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11827 | Thanks for the pointer regarding llvm_anyptr_ty. Maybe map, say, __builtin_neon_vld3_v to @llvm.arm.neon.vld3.v2i32.p0i8 instead of @llvm.arm.neon.vld3.v2i32 and keep that as the only exposed version? Or are there objections to that solution too? |
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11827 | Yes, we would have to do that and also add auto-upgrade support for the old forms of the intrinsics. Perhaps an email should be sent to llvm-dev as well. |
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11827 | I'm not sure what you're saying. Sticking with vld3 Do you mean you both want to expose @llvm.arm.neon.vld3.v2i32.p0i8 and @llvm.arm.neon.vld3.v2i32? Or do you want more exposed? I'm also not quite sure what you mean by auto-upgrade support? |
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11827 | I think we should make the change to IntrinsicARM.td (replace llvm_ptr_ty with llvm_anyptr_ty for the vldN intrinsics). At that point we will only have the new style intrinsics in llvm IR (like @llvm.arm.neon.vld3.v2i32.p0i8) and the old ones ( like @llvm.arm.neon.vld3.v2i32) will not be supported anymore. We therefore need to add a case in lib/IR/AutoUpgrade.cpp for the old style vldN intrinsics so that we can convert old style intrinsics to the new one and not break backwards compatibility. Regarding the clang changes (__builtin_neon_vld*..), yes, these should be emitting @llvm.arm.neon.vld3.v2i32.p0i8. |
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11827 | Got it, thanks for the explanation. I was not aware of lib/IR/AutoUpgrade.cpp. I'll update this patch to take your comments into account and whip up an associated patch for clang. Once that's done I'll also send a message to llvm-dev to ask for more feedback. |
This now upgrades the vld[234] and vst[234] intrinsics to take into account an address space, as discussed.
Open issues as far as I'm concerned:
- Should vld1 and vst1 also be upgraded for uniformity? Similarly for the vld[234]lane and vst[234]lane instructions?
- Should all the tests referring to vld[234] and vst[234] be updated to use the new intrinsic names, or are they allowed to depend on the auto-upgrade code?
Thanks, Jeroen!
They seem to have the same issues, so I would say yes.
- Should all the tests referring to vld[234] and vst[234] be updated to use the new intrinsic names, or are they allowed to depend on the auto-upgrade code?
I think they should use the new names, and have different tests to check that the old ones get auto-upgraded.
This extends the patch to cover all of vld[1234], vls[234]lane, vst[1234], and vst[234]lane. All tests now use the updated intrinsic names, and new tests have been added to test auto-upgrading.
Should I still email llvm-dev about this? If so, what exaclty should I ask about.
Forgot to say in my previous comment. The related clang patch is http://reviews.llvm.org/D13127
Both patches look good!
The purpose of the email to llvm-dev would be to announce the interface change and to ask if this would be a problem for anyone (it shouldn't be since you've added the auto-upgrade).
This is interesting.. We ideally wouldn't want to introduce address space casts when not necessary (I can imagine some users using the address space qualifier to help with alias analysis).
We should be able to get a variant of the vldn intrinsic that operates on the correct address space.