This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Take into account address spaces in interleaved access vectorization
ClosedPublic

Authored by jketema on Sep 18 2015, 2:31 PM.

Details

Summary

The vector being loaded might not be in address space 0. In this case the vldn/vstn call creation will trigger an assert in CallInst:Create, because the argument type and the parameter are in different address spaces. Resolve this by adding an appropriate address space cast.

Diff Detail

Event Timeline

jketema updated this revision to Diff 35137.Sep 18 2015, 2:31 PM
jketema retitled this revision from to [ARM] Take into account address spaces in interleaved access vectorization.
jketema updated this object.
jketema added reviewers: sbaranga, rengolin.
jketema added a subscriber: llvm-commits.
sbaranga added inline comments.Sep 21 2015, 2:40 AM
lib/Target/ARM/ARMISelLowering.cpp
11708

This is interesting.. We ideally wouldn't want to introduce address space casts when not necessary (I can imagine some users using the address space qualifier to help with alias analysis).

We should be able to get a variant of the vldn intrinsic that operates on the correct address space.

jketema added inline comments.Sep 21 2015, 9:30 AM
lib/Target/ARM/ARMISelLowering.cpp
11708

Maybe. I would need some pointers on how to change the code in that case, because this would require changes to IntrinsicsARM.td, which I'm not very familiar with.

I'm also wondering: is alias analysis information still being used during instruction lowering or later?

sbaranga added inline comments.Sep 22 2015, 2:37 AM
lib/Target/ARM/ARMISelLowering.cpp
11708

I've looked into this and changing IntrinisicsARM.td should be fairly simple, all that's needed is to replace in definitions like int_arm_neon_vld2 the llvm_ptr_ty argument with a llvm_anyptr_ty.

However it turns out that this changes the intrinisic name mangling (which might be a problem with backward compatibility) and also requires clang changes. This is of course a bit complicated..

jketema added inline comments.Sep 22 2015, 4:42 AM
lib/Target/ARM/ARMISelLowering.cpp
11708

Thanks for the pointer regarding llvm_anyptr_ty.

Maybe map, say, __builtin_neon_vld3_v to @llvm.arm.neon.vld3.v2i32.p0i8 instead of @llvm.arm.neon.vld3.v2i32 and keep that as the only exposed version? Or are there objections to that solution too?

sbaranga added inline comments.Sep 22 2015, 4:49 AM
lib/Target/ARM/ARMISelLowering.cpp
11708

Yes, we would have to do that and also add auto-upgrade support for the old forms of the intrinsics. Perhaps an email should be sent to llvm-dev as well.

jketema added inline comments.Sep 22 2015, 6:16 AM
lib/Target/ARM/ARMISelLowering.cpp
11708

I'm not sure what you're saying. Sticking with vld3 Do you mean you both want to expose @llvm.arm.neon.vld3.v2i32.p0i8 and @llvm.arm.neon.vld3.v2i32? Or do you want more exposed?

I'm also not quite sure what you mean by auto-upgrade support?

sbaranga added inline comments.Sep 23 2015, 2:07 AM
lib/Target/ARM/ARMISelLowering.cpp
11708

I think we should make the change to IntrinsicARM.td (replace llvm_ptr_ty with llvm_anyptr_ty for the vldN intrinsics). At that point we will only have the new style intrinsics in llvm IR (like @llvm.arm.neon.vld3.v2i32.p0i8) and the old ones ( like @llvm.arm.neon.vld3.v2i32) will not be supported anymore.

We therefore need to add a case in lib/IR/AutoUpgrade.cpp for the old style vldN intrinsics so that we can convert old style intrinsics to the new one and not break backwards compatibility.

Regarding the clang changes (__builtin_neon_vld*..), yes, these should be emitting @llvm.arm.neon.vld3.v2i32.p0i8.

jketema added inline comments.Sep 23 2015, 2:21 AM
lib/Target/ARM/ARMISelLowering.cpp
11708

Got it, thanks for the explanation. I was not aware of lib/IR/AutoUpgrade.cpp.

I'll update this patch to take your comments into account and whip up an associated patch for clang. Once that's done I'll also send a message to llvm-dev to ask for more feedback.

jketema updated this revision to Diff 35602.Sep 24 2015, 3:47 AM

This now upgrades the vld[234] and vst[234] intrinsics to take into account an address space, as discussed.

Open issues as far as I'm concerned:

  • Should vld1 and vst1 also be upgraded for uniformity? Similarly for the vld[234]lane and vst[234]lane instructions?
  • Should all the tests referring to vld[234] and vst[234] be updated to use the new intrinsic names, or are they allowed to depend on the auto-upgrade code?
sbaranga edited edge metadata.Sep 24 2015, 4:11 AM

Thanks, Jeroen!

This now upgrades the vld[234] and vst[234] intrinsics to take into account an address space, as discussed.

Open issues as far as I'm concerned:

  • Should vld1 and vst1 also be upgraded for uniformity? Similarly for the vld[234]lane and vst[234]lane instructions?

They seem to have the same issues, so I would say yes.

  • Should all the tests referring to vld[234] and vst[234] be updated to use the new intrinsic names, or are they allowed to depend on the auto-upgrade code?

I think they should use the new names, and have different tests to check that the old ones get auto-upgraded.

jketema updated this revision to Diff 35870.Sep 28 2015, 8:03 AM
jketema edited edge metadata.

This extends the patch to cover all of vld[1234], vls[234]lane, vst[1234], and vst[234]lane. All tests now use the updated intrinsic names, and new tests have been added to test auto-upgrading.

Should I still email llvm-dev about this? If so, what exaclty should I ask about.

Forgot to say in my previous comment. The related clang patch is http://reviews.llvm.org/D13127

This extends the patch to cover all of vld[1234], vls[234]lane, vst[1234], and vst[234]lane. All tests now use the updated intrinsic names, and new tests have been added to test auto-upgrading.

Should I still email llvm-dev about this? If so, what exaclty should I ask about.

Forgot to say in my previous comment. The related clang patch is http://reviews.llvm.org/D13127

Both patches look good!

The purpose of the email to llvm-dev would be to announce the interface change and to ask if this would be a problem for anyone (it shouldn't be since you've added the auto-upgrade).

sbaranga accepted this revision.Sep 30 2015, 2:37 AM
sbaranga edited edge metadata.

LGTM

This revision is now accepted and ready to land.Sep 30 2015, 2:37 AM
This revision was automatically updated to reflect the committed changes.

Committed. Thanks for all the feedback!

Thanks for all the work! :)