This is an archive of the discontinued LLVM Phabricator instance.

[PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync.
ClosedPublic

Authored by timshen on May 2 2017, 2:58 PM.

Details

Summary

This fixes pr32392.

The lowering pipeline is:
llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo.

The reason why expandPostRAPseudo is chosen is because previous passes are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)).

Diff Detail

Repository
rL LLVM

Event Timeline

timshen created this revision.May 2 2017, 2:58 PM
kbarton added inline comments.May 4 2017, 7:39 PM
llvm/lib/Target/PowerPC/PPCInstr64Bit.td
967 ↗(On Diff #97508)

Do we also need to mark Uses = [CR7]?

llvm/test/CodeGen/PowerPC/atomic-2.ll
112 ↗(On Diff #97508)

Please add a reg expression here to ensure the registers used by the compare are the same as the ones in the ld above (this is necessary for correctness of the isync).

114 ↗(On Diff #97508)

Please add CHECK-NEXT for the bne- and isync instructions. We want to ensure the cmp, bc, isync instructions stay together.

llvm/test/CodeGen/PowerPC/atomics-indexed.ll
15 ↗(On Diff #97508)

Same comments as above for reg expression and CHECK-NEXT

26 ↗(On Diff #97508)

And here as well

llvm/test/CodeGen/PowerPC/atomics.ll
31 ↗(On Diff #97508)

And here as well.

44 ↗(On Diff #97508)

And last but not least... :)

timshen added inline comments.May 5 2017, 8:53 AM
llvm/test/CodeGen/PowerPC/atomic-2.ll
114 ↗(On Diff #97508)

I'm not sure about that.

LLVM post-RA scheduler may stick instructions in between the cmp, bc, isync sequence, but the dependencies introduced by cmp and bc are perfectly preserved.

timshen added inline comments.May 5 2017, 11:45 AM
llvm/lib/Target/PowerPC/PPCInstr64Bit.td
967 ↗(On Diff #97508)

I'm not sure - what does Uses exactly mean?

I didn't add, because a search for "clobbers" in PPCInstr64Bit.td gives me examples that clobbers are added through "Defs" without "Uses".

timshen updated this revision to Diff 98032.May 5 2017, 3:40 PM

Rebase onto D32762.

Add associated register numbers in CHECK lines, but still doesn't use CHECK-NEXT as I commented.

timshen marked an inline comment as done.May 5 2017, 3:41 PM
timshen updated this revision to Diff 98302.May 9 2017, 9:15 AM

Rebase onto the AtomicExpandPass change.

kbarton added inline comments.May 12 2017, 12:55 PM
llvm/test/CodeGen/PowerPC/atomic-2.ll
114 ↗(On Diff #97508)

OK, that's fair. I've verified with the hardware designers that this sequence can have unrelated instructions interleaved.
However, we also want to check that any operations we are trying to order using the sync are not moved across this sequence. Can you add a use of VAL to the test and a check to ensure it doesn't get moved up past the isync?

timshen updated this revision to Diff 98842.May 12 2017, 2:11 PM

Add test case for ordering requirement.

timshen added inline comments.May 12 2017, 2:11 PM
llvm/test/CodeGen/PowerPC/atomic-2.ll
114 ↗(On Diff #97508)

Added I "test_ordering" into atomics-regression.ll.

I noticed that, according to the C++ load-acquire semantics (I assume that it's similar to LLVM's load-acquire), a succeeding load should never be scheduled across the previous load-acquire.

timshen updated this revision to Diff 99065.May 15 2017, 2:41 PM

Add a test case for store to an unrelated location after load.

kbarton accepted this revision.May 16 2017, 12:58 PM

LGTM.
Thanks for adding the additional test case!

llvm/lib/Target/PowerPC/PPCInstr64Bit.td
967 ↗(On Diff #97508)

Yeah, I just verified this with Nemanja. We want to use the Def to mark it as clobbered. The Use of CR7 is internal to this pseudo instruction, but there is no way to indicate that. So if we mark this with an explicit use of CR7, the machinery may try to find the definition of that but won't consider the fact that this instruction is defining it also.

This revision is now accepted and ready to land.May 16 2017, 12:58 PM
This revision was automatically updated to reflect the committed changes.