This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Teach ComputeNumSignBitsForTargetNode about masked atomic intrinsics
ClosedPublic

Authored by asb on Jul 20 2022, 11:50 AM.

Details

Summary

[RISCV] Teach ComputeNumSignBitsForTargetNode about masked atomic intrinsics

An unnecessary sext.w is generated when masking the result of the
riscv_masked_cmpxchg_i64 intrinsic. Implementing handling of the
intrinsic in ComputeNumSignBitsForTargetNode allows it to be removed.

Although this isn't a particularly important optimisation, removing the
sext.w simplifies implementation of an additional cmpxchg-related
optimisation in D130192.

Although I can't produce a test with different codegen for the other
atomics intrinsics, these are added as well for completeness.

Diff Detail

Event Timeline

asb created this revision.Jul 20 2022, 11:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2022, 11:50 AM
asb requested review of this revision.Jul 20 2022, 11:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2022, 11:50 AM

Do we care about other masked i64 atomic intrinsics, and do the i32 ones ever see a benefit from implementing an equivalent?

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9890

Should have extra { } so it's easy to add other cases

reames added inline comments.Jul 21 2022, 7:58 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9893

I don't follow this comment. Is there some documentation you can point to for what this intrinsic does? Or a good pointer in code to understand it?

If I'm gathering this correctly, the and-by-mask is to handle the zext of the type less than XLEN? Not following where the sign bit below that is coming from.

asb added inline comments.Jul 21 2022, 10:08 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9893

The intrinsics are underdocumented - I'll loop back round and fix that. For general context on our atomics lowering approach, see here.

As for this specific code comment: it may be because it's the end of the day, but I started trying to better explain and confused myself. So it may be I have a mistake, the code comment needs improving, or both. Just leaving this comment now, as it may be it's best for you to hold off on taking a proper look until I've had a chance to re-review.

craig.topper added inline comments.Jul 24 2022, 3:56 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9893

Isn't the output of the intrinsic only the result of the LR_W? There are ANDs in the expansion, but they are to the scratch register aren't they?

llvm/test/CodeGen/RISCV/atomic-cmpxchg-branch-on-result.ll
132

I don't think this AND is part of the intrinsic. It's a separate operation in the IR and DAG. The output of the intrinsic is a4, the lr.w.aqrl result.

asb updated this revision to Diff 449237.Aug 2 2022, 2:55 AM
asb edited the summary of this revision. (Show Details)

Update to fix logic error regarding the sign bits (it's always just 33, the intrinsic doesn't to any masking itself. I improved the documentation on the intrinsics in 85c6fab which should help avoid such mistakes.

This update also lists all masked intrinsics, even though I've been unable to produce test cases for the others that lead to different codegen.

asb marked an inline comment as done.Aug 2 2022, 2:56 AM
asb added inline comments.
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9893

I've improved docs for intrinsics in 85c6fab.

You're both completely right that I made a logic error here initially - the intrinsic doesn't mask its return value at all, so the number of sign bits is always just 33.

asb added a comment.Aug 2 2022, 2:57 AM

Do we care about other masked i64 atomic intrinsics, and do the i32 ones ever see a benefit from implementing an equivalent?

I've added in the other masked i64 atomics, though was unable to produce examples where it makes a difference. Now I've fixed the logic error on the number of sign bits, there's no potential for benefit for the i32 ones (there would just be 1 mask bit after LR_W or atomicrmw_W of a native width value).

asb retitled this revision from [RISCV] Teach ComputeNumSignBitsForTargetNode about Intrinsic::riscv_masked_cmpxchg_i64 to [RISCV] Teach ComputeNumSignBitsForTargetNode about masked atomic intrinsics.Aug 2 2022, 2:58 AM
reames accepted this revision.Aug 2 2022, 1:08 PM

LGTM w/requested changes.

As an optional follow up, I find myself wondering if the 64 bit versions should actually have a 64 bit return type. Having them instead have i32 operands and return types - to model the emulated access actually being done - and then explicitly sign extended afterwards would get the same effect here with less code and less confusion.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9899

Ok, I finally wrapped my head around all of the machinery here, and this is correct. However, I want to suggest some changes.

First, the comment:
// riscv_masked_atomicrmw_* represents an emulated unaligned atomicrmw operation at the minimum supported atomicrmw width whose result is then sign extended to XLEN. With +A, the minimum width is 32 for both 64 and 32.

Second, please add the asserts...
assert(XLenVT == 64)
assert(getMinCmpXchgSizeInBits() == 32);

This revision is now accepted and ready to land.Aug 2 2022, 1:08 PM

LGTM w/requested changes.

As an optional follow up, I find myself wondering if the 64 bit versions should actually have a 64 bit return type. Having them instead have i32 operands and return types - to model the emulated access actually being done - and then explicitly sign extended afterwards would get the same effect here with less code and less confusion.

Wouldn't that require the intrinsics to go through type legalization to be promoted to a new RISCVISD node with i64 types? And then we'd need to teach computeKnownSignBits about the new RISCVISD node so we could remove in sext_inreg that got created from the sign_extend.

Unless I'm misunderstanding what you're proposing?

asb added a comment.Aug 2 2022, 1:34 PM

Yes, my recollection is that the need to handle legalisation of the intrinsics is the main barrier to defining i32 versions on RV64. I'll have another think though.

Thanks for the review! I'll tweak as suggested and land tomorrow morning.

asb updated this revision to Diff 449642.Aug 3 2022, 5:38 AM
asb marked an inline comment as done.

Address review comments.

asb added inline comments.Aug 3 2022, 5:41 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9899

Hi Philip, I've drafted a description that is very similar to yours, but made a few small alterations:

  • Referencing cmpxchg as well at atomicrmw
  • unaligned => narrow.

I think this is close enough to your suggestion that I'm going to go ahead and commit, but obviously just shout if you think that didn't match your intent and we can resolve in a follow-up patch.

Thanks.

This revision was landed with ongoing or failed builds.Aug 3 2022, 5:46 AM
This revision was automatically updated to reflect the committed changes.
reames added a comment.Aug 3 2022, 7:28 AM

Yes, my recollection is that the need to handle legalisation of the intrinsics is the main barrier to defining i32 versions on RV64. I'll have another think though.

Yeah, you and Craig are correct here. I was thinking we had a legal i32 type on rv64 which we don't. So yeah, ignore me.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9899

Your tweak was totally fine.