This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Add support for invariant stores of ordered reductions
ClosedPublic

Authored by malharJ on Jun 1 2022, 4:50 AM.

Diff Detail

Event Timeline

malharJ created this revision.Jun 1 2022, 4:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2022, 4:50 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
malharJ requested review of this revision.Jun 1 2022, 4:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2022, 4:50 AM
mgabka added a subscriber: mgabka.Jun 1 2022, 5:59 AM
david-arm added inline comments.
llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
1364–1375

Hi @malharJ, can you add some proper CHECK lines for this test please similar to the other positive test cases above, e.g. @fadd_strict?

malharJ updated this revision to Diff 434486.Jun 6 2022, 8:26 AM
  • Updated LIT test with more detailed CHECKs.

The main difference between this and "fadd_strict" is the presence of
a store (to the invariant destination) in the middle.block

malharJ updated this revision to Diff 434487.Jun 6 2022, 8:28 AM
  • Removed redundant blank lines from test
fhahn added a comment.Jun 6 2022, 8:48 AM

@igor.kirillov do you remember why this was restriction was added originally?

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
1020

this comment needs updating

malharJ updated this revision to Diff 434508.Jun 6 2022, 9:29 AM
  • Updated comment in source

@igor.kirillov do you remember why this was restriction was added originally?

I don't remember actually. I know that it didn't work properly half a year ago but it does work now.

david-arm accepted this revision.Jun 16 2022, 6:50 AM

LGTM! From private discussions I also understand you've run the LLVM test suite using scalable vectorisation on SVE hardware and all tests passed. Can you address my test comment before merging please?

llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
1371

I think this should be something like

store float %[[RDX]], float* %[[DEST_PTR]]

because otherwise the tests might break if the numbering changes in future.

This revision is now accepted and ready to land.Jun 16 2022, 6:50 AM
Matt added a subscriber: Matt.Jun 16 2022, 4:40 PM
This revision was landed with ongoing or failed builds.Jun 17 2022, 6:56 AM
This revision was automatically updated to reflect the committed changes.
malharJ added inline comments.Jun 17 2022, 7:13 AM
llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
1371

good spot, somehow I seem to have missed it.
I've corrected this in the submitted patch.