Page MenuHomePhabricator

[RISCV] Implement codegen for cmpxchg on RV32IA

Authored by asb on Jun 13 2018, 8:21 AM.



Utilise a similar lowering strategy to D47882.

All cmpxchg are lowered as 'strong' currently and failure ordering is ignored. I believe this is conservative but correct.

Diff Detail


Event Timeline

asb created this revision.Jun 13 2018, 8:21 AM

You can introduce a target-specific SelectionDAG node without adding a corresponding IR intrinsic. See ISD::FIRST_TARGET_MEMORY_OPCODE.

I obviously haven't been following RISC-V enough to give final reviews, but this mostly looked sensible to me. Just one question:

1744 ↗(On Diff #151176)

Is this an intentional choice not to support 64-bit pointers? Even if so, an assert might be good.

asb added inline comments.Jun 14 2018, 6:11 AM
1744 ↗(On Diff #151176)

RV64I isn't upstreamed yet. Most other parts of the atomics handling have asserts, and I should definitely add one here. I could try and have it do the right thing, but I think asserting is better than committing code that hasn't been tested. Good catch, thanks.

t.p.northover added inline comments.Jun 14 2018, 6:20 AM
1744 ↗(On Diff #151176)

Sounds good to me. A quick grep gave a handful of riscv64 uses so I wasn't entirely sure.

asb updated this revision to Diff 165974.Sep 18 2018, 7:23 AM
asb marked 3 inline comments as done.

Updated so common code for generating masks etc is reused through AtomicExpandPass.

asb added a comment.Sep 18 2018, 7:26 AM

You can introduce a target-specific SelectionDAG node without adding a corresponding IR intrinsic. See ISD::FIRST_TARGET_MEMORY_OPCODE.

(Sorry, just noticed I had an unsubmitted comment here).

I think in this case I do need an intrinsic, as I want mask calculation and so on to be produced as early as possible (i.e. in IR with the help of AtomicExpandPass), and then want to pass this through to my target pseudo-instruction. The only way of doing so is via an intrinsic. If the mask generation were taking place when creating or transforming the SelectionDAG it seems ISD::FIRST_TARGET_MEMORY_OPCODE would be the best approach.

t.p.northover added inline comments.Sep 20 2018, 4:50 AM
485–489 ↗(On Diff #165974)

Might be better to keep the loophead/looptail labels? I was a little confused when I saw that what looks at first glance like a linear sequence of 4 instructions went into two different basic blocks.

asb updated this revision to Diff 166415.Sep 21 2018, 2:07 AM
asb marked an inline comment as done.

Update to consistently use loophead/looptail labels in comments.

rogfer01 added inline comments.Oct 3 2018, 6:34 AM
1705 ↗(On Diff #166415)

@asb, I have a minor question here:

At some point we will want to implement this for rv64a. I presume that for rv64a we still want to have setMinCmpXchgSizeInBits(32) so the i32 and the i64 cases are handled in RISCVExpandPseudoInsts.cpp as non-masked lr/sc loops. Currently AtomicExpand.cpp will choose to use the smaller size supported (i32) and the intrinsic here will be using i32 values. This will require some form of legalisation in 64-bit, won't it? Will we end custom legalising this intrinsic call for the 64-bit case to a riscv_masked_cmpxchg_i64? Or perhaps there is a neater way to achieve this?


asb added inline comments.Oct 3 2018, 7:10 AM
1705 ↗(On Diff #166415)

Hi Roger. If i64 is the only legal type, I don't see a way around having an i64 form of the intrinsic. You then need to promote the operands to i64.

So you want to do something like

unsigned XLen = Subtarget.getXLen();
Value *Ordering = Builder.getIntN(XLen, static_cast<uint64_t>(Ord));

and use Builder.CreateSExt on the other operands.

rogfer01 added inline comments.Oct 3 2018, 7:27 AM
1705 ↗(On Diff #166415)

Hi Alex,

If we have an i64 intrinsic, do we expect it to return i64 too? I'd say yes but then AtomicExpand.cpp (line 964), which has been working on the minimum supported CAS type of i32 will attempt to build a shr mixing an i64 and i32 and the builder complains.

Maybe I have give up the idea of having i32 as the minimum cas type in rv64a and set it to i64, and then devise a way to use a non-masked LL/SC loop later.

Anyways I don't want to bother you too much with rv64 at this point.

Thanks a lot for the prompt answer.

asb updated this revision to Diff 169192.Oct 11 2018, 4:51 AM
asb retitled this revision from [RISCV] Implement codegen for cmpxchg on RV32I to [RISCV] Implement codegen for cmpxchg on RV32IA.

Rebased patch. No changes, but I had to re-generate the new atomic-cmpxchg.ll lines as regalloc is slightly different after the recent enableMultipleCopyHints change.

asb added a comment.Nov 12 2018, 6:47 AM

I've re-checked this patch still applies cleanly against current HEAD.

Ping? I'd really like to get this landed to complete this atomics lowering work.

lewis-revill added inline comments.
263 ↗(On Diff #169192)

Is there a reason why this isn't first defined as a class independent of width, like PseudoCmpXchg is (and the other pseudos in this file for that matter)? I'm just thinking of what might be used in the RV64IA implementation.

asb added inline comments.Nov 13 2018, 5:10 AM
263 ↗(On Diff #169192)

The masked cmpxchg and atomicrmw are only useful for 1 and 2-byte accesses, so there's no need to ever produce a 64-bit masked cmpxchg or atomicrmw.

lewis-revill added inline comments.Nov 13 2018, 5:18 AM
263 ↗(On Diff #169192)

Understood, thanks!

Jim added a subscriber: Jim.Nov 27 2018, 7:15 PM
asb added a comment.Nov 29 2018, 6:25 AM

This patch still applies with minimal fuzz and all tests still pass.

I'd really appreciate a review. Ping?

jyknight accepted this revision.Nov 29 2018, 11:53 AM

Looks good, ship it.

This revision is now accepted and ready to land.Nov 29 2018, 11:53 AM
This revision was automatically updated to reflect the committed changes.