This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsics
ClosedPublic

Authored by wschmidt on Nov 5 2014, 2:30 PM.

Details

Summary

This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for PowerPC, which provide programmer access to the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions.
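As a minimal sketch of how these intrinsics might be used from C once the companion Clang patch is in place: the helper name copy4_unaligned is hypothetical, and the memcpy branch is only a stand-in so the example also builds on targets without VSX. The integer argument is assumed to be a byte offset added to the pointer, following the convention of the existing vec_ld and vec_st.

```c
#include <assert.h>
#include <string.h>

#if defined(__VSX__)
#include <altivec.h>
#endif

/* Hypothetical helper: copy four floats from src to dst.
 * Neither pointer is assumed to be 16-byte aligned. */
static void copy4_unaligned(const float *src, float *dst) {
    assert(src != NULL && dst != NULL);
#if defined(__VSX__)
    /* Byte offset 0 from src; expected to select lxvw4x or lxvd2x. */
    vector float v = vec_vsx_ld(0, src);
    /* Byte offset 0 from dst; expected to select stxvw4x or stxvd2x. */
    vec_vsx_st(v, 0, dst);
#else
    /* Portable stand-in for targets without VSX. */
    memcpy(dst, src, 4 * sizeof(float));
#endif
}
```

Calling copy4_unaligned(in + 1, out) on a float array is the kind of access that motivates these intrinsics, since in + 1 is generally not 16-byte aligned.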

New LLVM intrinsics are provided to represent these four instructions in IntrinsicsPowerPC.td. These are patterned after the similar intrinsics for lvx and stvx (Altivec). In PPCInstrVSX.td, these intrinsics are tied to the code gen patterns, with additional patterns to allow plain vanilla loads and stores to still generate these instructions.

At -O1 and higher the intrinsics are immediately converted to loads and stores in InstCombineCalls.cpp. This will open up more optimization opportunities while still allowing the correct instructions to be generated. (Similar code exists for aligned Altivec loads and stores.)

The new intrinsics are added to the code that checks for consecutive loads and stores in PPCISelLowering.cpp, as well as to PPCTargetLowering::getTgtMemIntrinsic().

There's a new test to verify the correct instructions are generated. The loads and stores tend to be reordered, so the test just counts their number. It runs at -O2, as it's not very effective to test this at -O0, when many unnecessary loads and stores are generated.

I ended up having to modify vsx-fma-m.ll. It turns out this test case is slightly unreliable, but I don't know a good way to prevent problems with it. The xvmaddmdp instructions read and write the same register, which is one of the multiplicands. Commutativity allows either multiplicand to be chosen. If the FMAs are scheduled in a different order than the test expects, the register assignment can differ as a result. Hopefully this doesn't change often, but my patch appears to have been sufficient to trigger a different schedule for some unknown reason. Ideas for a better way to deal with this are welcome.
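The register-assignment sensitivity can be illustrated with a small model. This is only a sketch: the vreg type, the register-file size, and model_xvmaddmdp are hypothetical names, and the single rounding of a real fused multiply-add is not modelled. It assumes the M-form semantics XT <- XA * XT + XB, in which the target register is also a multiplicand.

```c
#include <assert.h>

enum { LANES = 2, NREGS = 4 };

/* Hypothetical model of a two-lane double-precision vector register. */
typedef struct { double lane[LANES]; } vreg;

/* Model of xvmaddmdp XT, XA, XB: XT <- XA * XT + XB in each lane.
 * The target register is read as a multiplicand and then overwritten. */
static void model_xvmaddmdp(vreg regs[NREGS], int xt, int xa, int xb) {
    assert(xt >= 0 && xt < NREGS);
    assert(xa >= 0 && xa < NREGS);
    assert(xb >= 0 && xb < NREGS);
    for (int i = 0; i < LANES; ++i)
        regs[xt].lane[i] = regs[xa].lane[i] * regs[xt].lane[i]
                         + regs[xb].lane[i];
}
```

With x in register 0, y in register 1, and z in register 2, both model_xvmaddmdp(regs, 0, 1, 2) and model_xvmaddmdp(regs, 1, 0, 2) compute x*y + z. Only the register that receives the result differs, so a test that matches specific register numbers can break when the choice changes.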

There is a companion patch for Clang, to be reviewed separately.

Event Timeline

wschmidt updated this revision to Diff 15830. Nov 5 2014, 2:30 PM
wschmidt retitled this revision to [PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsics.
wschmidt updated this object.
wschmidt edited the test plan for this revision.
wschmidt added reviewers: hfinkel, seurer, willschm.
wschmidt added a subscriber: Unknown Object (MLST).
hfinkel accepted this revision. Nov 11 2014, 12:29 AM
hfinkel edited edge metadata.

With updates as noted, LGTM.

lib/Target/PowerPC/PPCInstrVSX.td
58

FYI: Changing these, I believe, is what is changing the schedule. The matching pattern order treats patterns attached to instructions in a different order from free patterns (and the inferred attributes could also be different -- although they shouldn't be in this case).

lib/Transforms/InstCombine/InstCombineCalls.cpp
621

This will create a load with the default alignment, but that's not right. These need to be align 1.

Also, you're missing tests for these.

639

(Same comments as above)
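On the alignment point raised for lines 621 and 639: the Altivec lvx instruction ignores the low four bits of the effective address, whereas lxvd2x and lxvw4x access the address as given. The sketch below models only that addressing difference at byte level. The function names are hypothetical, and element ordering within the register (for example on little-endian targets) is deliberately not modelled.

```c
#include <stdint.h>
#include <string.h>

/* Model of lvx addressing: the low four bits of the address are dropped,
 * so the 16 bytes come from the enclosing 16-byte-aligned block.
 * The integer/pointer round trip is implementation-defined but works on
 * common flat-address targets. */
static void model_lvx_load(const unsigned char *addr, unsigned char out[16]) {
    uintptr_t ea = (uintptr_t)addr & ~(uintptr_t)15;
    memcpy(out, (const unsigned char *)ea, 16);
}

/* Model of lxvd2x/lxvw4x addressing: 16 bytes starting exactly at addr,
 * with no alignment requirement. */
static void model_vsx_load(const unsigned char *addr, unsigned char out[16]) {
    memcpy(out, addr, 16);
}
```

A plain IR load that replaces an lvx intrinsic can claim 16-byte alignment because the hardware enforces it anyway. A load that replaces one of the VSX intrinsics cannot, which is why it has to be marked align 1 rather than given the type's default alignment.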

This revision is now accepted and ready to land. Nov 11 2014, 12:29 AM

Corrected alignment issue and added a test case for InstCombine. r221767. Thanks for the review!

wschmidt closed this revision. Nov 11 2014, 8:12 PM