This is an archive of the discontinued LLVM Phabricator instance.

[SVE][AArch64] Use WHILEWR to check write-after-read conflicts
Abandoned · Public

Authored by Allen on Nov 24 2022, 4:06 AM.

Details

Summary

In the AArch64 target, the WHILEWR instruction is enabled under SVE2.

Diff Detail

Event Timeline

Allen created this revision.Nov 24 2022, 4:06 AM
Herald added a project: Restricted Project.
Allen requested review of this revision.Nov 24 2022, 4:06 AM
Matt added a subscriber: Matt.Nov 30 2022, 5:41 PM

Hi

I think the idea of using whilewr is a nice one. I think how this should work is similar to the active.lane.mask intrinsics.

  • We define a generic intrinsic, argue about the name and the exact semantics, and add the details to the language ref.
  • Add a target hook from the vectorizer to opt into using them for runtime checks.
  • Lower them generically to a series of compares and whatnot in DAG (this may be difficult depending on the exact semantics)
  • Under AArch64 we expand it to a whilewr and a csel last (I think), which can then hopefully be optimized to use b.last.
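
As a scalar sketch of the generic lowering suggested in the third step: a hypothetical intrinsic that takes the read and write pointers (as integers of pointer size) plus the number of bytes one vector iteration touches, and produces the i1. The name, signature, and exact comparison are illustrative assumptions, not an existing LLVM interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar model of the proposed intrinsic: returns true
   when a vector iteration covering vf_bytes bytes is free of
   write-after-read conflicts between read_p and write_p.
   Names and semantics are illustrative assumptions. */
static bool no_war_conflict(uint64_t read_p, uint64_t write_p,
                            uint64_t vf_bytes) {
    uint64_t diff = write_p - read_p; /* wraps modulo 2^64 */
    /* Safe when the write is at or behind the read, or at least a
       full vector ahead of it. */
    return (int64_t)diff <= 0 || diff >= vf_bytes;
}
```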

At least that is how I think I would expect it to work, with an intrinsic that accepts two pointers or integers of pointer size and produces an i1. The alternative would be to just match it in the backend. Unfortunately the semantics of whilewr don't seem super obvious. I think the b variant performs ((VL - 1) < zext(B) - zext(A)) | ((zext(B) - zext(A)) > 0) for the last lane, which is a little odd for values where A+VL wraps around 0 and probably makes direct matching difficult.
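
Modelled per lane, my reading of the byte-element variant is that a lane is active when the pointer distance is non-positive or strictly greater than the lane index. The helper below is a scalar sketch of that reading; treat the exact comparison as an assumption and check it against the Arm pseudocode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scalar model of one whilewr.b predicate lane: active when
   diff = zext(B) - zext(A) is non-positive (the write does not
   trail within the vector) or strictly greater than the lane index.
   This is one reading of the architecture, not official pseudocode. */
static bool whilewr_b_lane(uint64_t a, uint64_t b, uint64_t lane) {
    int64_t diff = (int64_t)(b - a); /* wraps modulo 2^64, hence the
                                        oddity when A + VL wraps 0 */
    return diff <= 0 || (int64_t)lane < diff;
}
```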

We would also need to account for UF correctly, which might be possible using a different element size.

Allen added a comment.Feb 9 2023, 5:01 PM

Thanks very much for the detailed idea.

Allen abandoned this revision.Sep 20 2023, 7:19 PM