This is an archive of the discontinued LLVM Phabricator instance.

Add support for part-word atomics for PPC
ClosedPublic

Authored by nemanjai on Mar 5 2015, 2:12 PM.

Details

Summary

This patch does two things:

  1. Adds support for part-word atomic instructions for Power7 (ISA 2.06) and Power8 (ISA 2.07)
  2. Adds the "lock" versions of all of the loads. These are the loads with the Exclusive Access hint bit set to 1

Atomic instructions in the IR will now use the part-word versions on platforms that support them (for types i8 and i16). The lock versions are only available through asm for now.

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai updated this revision to Diff 21305.Mar 5 2015, 2:12 PM
nemanjai retitled this revision from to Add support for part-word atomics for PPC.
nemanjai updated this object.
nemanjai edited the test plan for this revision. (Show Details)
nemanjai added reviewers: wschmidt, hfinkel, kbarton.
nemanjai set the repository for this revision to rL LLVM.
nemanjai added a subscriber: Unknown Object (MLST).
nemanjai updated this revision to Diff 21316.Mar 5 2015, 2:38 PM

It would appear that in applying the patch to my repository from an anonymous one has somehow lost some of my changes. Moved the position of the stwcx definition and added the check prefix for all the Power7 and up checks in the IR test case.

hfinkel added inline comments.Mar 5 2015, 6:44 PM
lib/Target/PowerPC/PPCSubtarget.h
117

You need to initialize this in PPCSubtarget.cpp with the others.

nemanjai added inline comments.Mar 6 2015, 4:09 AM
lib/Target/PowerPC/PPCSubtarget.h
117

Ah yes, thank you. Doing so certainly helps with readability and consistency. Sorry I missed that. Added to the next patch that I'll upload here.

nemanjai updated this revision to Diff 21344.Mar 6 2015, 5:58 AM

Added the initialization for the bool variable to address Hal's comment.

One thing to note is that no Clang option is added for this (i.e. -mpartword-atomics). As a follow-on task for this, the quadword version will be implemented which does have a corresponding option in GCC and I plan to implement the same option in Clang. I don't think that the inconsistency is an issue since part word atomics can be implemented without the part word atomic instructions whereas I imagine that the quad word version would require locks if there is no instruction for it in the ISA. Furthermore, the quadword version is new for ISA 2.07 whereas the part word versions existed in ISA 2.06.

hfinkel added inline comments.Mar 6 2015, 6:43 AM
lib/Target/PowerPC/PPCInstrInfo.td
1452

This pattern is identical to the pattern for LBARX above; can they both really be selected?

(and the same as the LWARX pattern below)

Maybe you need PPClarx to carry a type node operand, like the extload nodes?

1496

Duplicate pattern here too.

nemanjai added inline comments.Mar 6 2015, 7:54 PM
lib/Target/PowerPC/PPCInstrInfo.td
1452

Please correct any of the below if my understanding of how this works is incorrect (I do not yet fully understand how we go from IR to SDAGs to actual instructions).

This is a good point, there would be no way to disambiguate which instruction to select if a selection DAG matched this pattern. However, I don't see anything in the PPC target implementation that would produce such a DAG node, so I am not sure how useful having this pattern is at all. As far as I can see, these instructions are selected only through the Pseudo instructions and are all emitted through

PPCTargetLowering::EmitInstrWithCustomInserter

And with my limited understanding of how things work, it would make sense that this can only be emitted through a custom code sequence since a load with reservation makes little sense without a subsequent store conditional (and likely the associated conditional branch). On the other hand, something in the PPC back end may decide to produce the PPClarx SD node so it would be good if we can match a likely pattern.
I understand what you mean about the SD Node carrying the type (much like all the versions of the [sign|zero] extend loads that you mention).
What I propose is the following (please comment on validity of the proposal):

  • Provide PPClarx8, PPClarx16, PPClarx32 and PPClarx64 SD nodes (PPClarx128 to follow with implementation of quadword version)
  • Provide PPCstcx8, PPCstcx16, PPCstcx32 and PPCstcx64 SD nodes (PPCstcx128 to follow)
  • Modify the patterns to use the respective version
  • Remove the existing PPClarx and PPCstcx SD nodes

In a subsequent patch, provide test coverage for all of the atomic instructions in the IR (currently we seem to only have testing for some of them and only for i32 and i64 types). Maybe even add another test case that contains the IR from existing tests of all of the gcc __sync_* builtins to ensure we produce expected code for all of those.

hfinkel added inline comments.Mar 8 2015, 12:28 PM
lib/Target/PowerPC/PPCInstrInfo.td
1452

This is a good point, there would be no way to disambiguate which instruction to select if a selection DAG matched this pattern.

Right, and we should not add useless patterns, SDAG nodes, etc. Relative to LLVM itself, the PPC backend is pretty old, and has accumulated some dead constructs no one has yet cleaned up. It seems that you have stumbled upon one...

PPCISD::LARX seems to be dead, as is PPCISD::STCX. We should remove them, and any associated patterns and nodes. The one thing to be careful about here is to make sure that any instruction properties (mayLoad, etc.) that are being implied by the default patterns may need to be explicitly added once the patterns are removed. The definitive way to do this is to check that lib/Target/PowerPC/PPCGenInstrInfo.inc in your build directory has no relevant changes from removing the patterns.

nemanjai updated this revision to Diff 21522.Mar 9 2015, 3:20 PM

Removed the SD nodes and patterns for the atomic instructions. These instructions are never emitted directly from a Selection DAG since there is a very close relationship between the load and the store. As a result, they are only emitted from the atomic Pseudo instructions using a custom inserter. It is therefore not necessary to maintain patterns and SD nodes for the new or existing ones.

hfinkel accepted this revision.Mar 9 2015, 4:33 PM
hfinkel edited edge metadata.

LGTM (aside from some minor formatting nits).

lib/Target/PowerPC/PPC.td
122

Indentation is off here.

lib/Target/PowerPC/PPCInstr64Bit.td
240

Indentation is off here and below (the " should be under the 3).

lib/Target/PowerPC/PPCInstrInfo.td
1447

Indentation looks off-by-one here too.

This revision is now accepted and ready to land.Mar 9 2015, 4:33 PM
nemanjai closed this revision.Mar 10 2015, 2:50 PM

Committed revision 231843.