Utilise a similar lowering strategy to D47882.
All cmpxchg are lowered as 'strong' currently and failure ordering is ignored. I believe this is conservative but correct.
RV64I isn't upstreamed yet. Most other parts of the atomics handling have asserts, and I should definitely add one here. I could try and have it do the right thing, but I think asserting is better than committing code that hasn't been tested. Good catch, thanks.
(Sorry, just noticed I had an unsubmitted comment here).
I think in this case I do need an intrinsic, as I want mask calculation and so on to be produced as early as possible (i.e. in IR with the help of AtomicExpandPass), and then want to pass this through to my target pseudo-instruction. The only way of doing so is via an intrinsic. If the mask generation were taking place when creating or transforming the SelectionDAG it seems ISD::FIRST_TARGET_MEMORY_OPCODE would be the best approach.
Might be better to keep the loophead/looptail labels? I was a little confused when I saw that what looks at first glance like a linear sequence of 4 instructions went into two different basic blocks.
@asb, I have a minor question here:
At some point we will want to implement this for rv64a. I presume that for rv64a we still want to have setMinCmpXchgSizeInBits(32) so the i32 and the i64 cases are handled in RISCVExpandPseudoInsts.cpp as non-masked lr/sc loops. Currently AtomicExpand.cpp will choose to use the smaller size supported (i32) and the intrinsic here will be using i32 values. This will require some form of legalisation in 64-bit, won't it? Will we end custom legalising this intrinsic call for the 64-bit case to a riscv_masked_cmpxchg_i64? Or perhaps there is a neater way to achieve this?
Hi Roger. If i64 is the only legal type, I don't see a way around having an i64 form of the intrinsic. You then need to promote the operands to i64.
So you want to do something like
unsigned XLen = Subtarget.getXLen(); Value *Ordering = Builder.getIntN(XLen, static_cast<uint64_t>(Ord));
and use Builder.CreateSExt on the other operands.
If we have an i64 intrinsic, do we expect it to return i64 too? I'd say yes but then AtomicExpand.cpp (line 964), which has been working on the minimum supported CAS type of i32 will attempt to build a shr mixing an i64 and i32 and the builder complains.
Maybe I have give up the idea of having i32 as the minimum cas type in rv64a and set it to i64, and then devise a way to use a non-masked LL/SC loop later.
Anyways I don't want to bother you too much with rv64 at this point.
Thanks a lot for the prompt answer.