This is an archive of the discontinued LLVM Phabricator instance.

[LoopIdiom] Transform memmove-like loop into memmove (PR46179)
ClosedPublic

Authored by yurai007 on Jun 17 2021, 7:43 AM.

Details

Summary

The purpose of patch is to learn Loop idiom recognition pass how to recognize simple memmove patterns
in similar way like GCC does: https://godbolt.org/z/fh95e83od
LoopIdiomRecognize already has machinery for memset and memcpy recognition, patch tries to extend existing capabilities with minimal effort.

Diff Detail

Event Timeline

yurai007 created this revision.Jun 17 2021, 7:43 AM
yurai007 requested review of this revision.Jun 17 2021, 7:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2021, 7:43 AM
efriedma added inline comments.
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1287

Finally, someone trying to add this transform figured out this check... I think this is the fourth patch trying to add memmove to loopidiom.

Do you need to compare the offset to the size of the load/store operations? I think you might get some funny behavior if the operations overlap.

1343

Do we need to make sure we don't fall into this codepath?

yurai007 updated this revision to Diff 352979.Jun 18 2021, 5:09 AM
yurai007 added inline comments.
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1343

Yes, now this branch is ommited. Added tests to catch such mistakes.

yurai007 added inline comments.Jun 18 2021, 5:58 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1287

Finally, someone trying to add this transform figured out this check... I think this is the fourth patch trying to add memmove to loopidiom.

Well, hopefully there won't be fifth one :)

Do you need to compare the offset to the size of the load/store operations? I think you might get some funny behavior if the operations overlap.

Yes, it would be nice to prevent overlapping. I'm going to add protection against it and UT reproducer.

yurai007 updated this revision to Diff 353948.Jun 23 2021, 6:36 AM
yurai007 marked an inline comment as done.

Added protection against overlapping.

yurai007 added inline comments.Jun 23 2021, 6:52 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1287

Now load and store can't overlap each other and their sizes must be equal. However I'm not sure how reproducer should look like. Currently isLegalStore reject everything which has abs(Stride) != StoreSize so my understanding is that only one load and one store in loop are allowed. I ended up with following reproducer:

define void @do_not_form_memmove_for_overlapped_access(i32* %s, i64 %size) {
entry:
  %end.idx = add i64 %size, -1
  %end.ptr = getelementptr inbounds i32, i32* %s, i64 %end.idx
  br label %while.body

while.body:
  %phi.ptr = phi i32* [ %s, %entry ], [ %next.ptr, %while.body ]
  %next = bitcast i32* %phi.ptr to i16*
  %src.ptr = getelementptr i16, i16* %next, i64 1
  %src.ptr2 = bitcast i16* %src.ptr to i32*
  ; below misaligned load is overlapped with store.
  %val = load i32, i32* %src.ptr2, align 4
  %dst.ptr = getelementptr i32, i32* %phi.ptr, i64 0
  store i32 %val, i32* %dst.ptr, align 4
  %next.ptr = getelementptr i32, i32* %phi.ptr, i64 1
  %cmp = icmp eq i32* %next.ptr, %end.ptr
  br i1 %cmp, label %exit, label %while.body

exit:
  ret void
}

The thing is that I'm not sure whether or not above snippet is correct IR since we do misaligned load.

yurai007 updated this revision to Diff 353981.Jun 23 2021, 7:56 AM
yurai007 added inline comments.Jun 23 2021, 7:59 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1287

Ok, I checked documentation and mailing list for better understanding. Changing load just to "%val = load i32, i32* %src.ptr2, align 2" should be fine and give legal IR with underaligned access. Added improved overlapping access UT reproducer.

yurai007 updated this revision to Diff 354184.Jun 24 2021, 2:55 AM

@efriedma : Thank you for review! Now all comments are addressed and change pass on CI.

yurai007 added inline comments.Jun 29 2021, 6:59 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1378

Don't touch this counter when memmove is emitted.

yurai007 updated this revision to Diff 355223.Jun 29 2021, 7:02 AM

Fixed last comment.

xbolva00 added inline comments.Jun 29 2021, 7:42 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
110

Also you can add NumMemMove..

1378

… and increment it here

yurai007 added inline comments.Jun 29 2021, 7:53 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1378

Ok, I will add this counter.

Can you add a testcase for a case where both IsMemCpy and UseMemMove are true? I think you'll end up bailing out due to the alias check, but I'd like to be sure.

yurai007 updated this revision to Diff 355503.Jun 30 2021, 4:17 AM
yurai007 marked 2 inline comments as done.
yurai007 updated this revision to Diff 355631.Jun 30 2021, 11:03 AM

Added test covering IsMemCpy + UseMemMove path.

yurai007 added inline comments.Jun 30 2021, 11:06 AM
llvm/test/Transforms/LoopIdiom/basic.ll
1269

Fix test to emit memmove.

yurai007 marked an inline comment as not done.Jun 30 2021, 11:06 AM

Can you add a testcase for a case where both IsMemCpy and UseMemMove are true? I think you'll end up bailing out due to the alias check, but I'd like to be sure.

Wouldn't you mind if I added memcpy handling in follow-up change? While working on memcpy transformation I ended up with many unit tests (almost every memmove load-store UT has memcpy equivalent) and my change still gets bigger.
Current patch bailing out on memcpy should be enough to satisfy PR. I don't insist but if you agree I will leave fixme on do_not_form_memmove4 UT and will open follow-up review transforming memcpy.

efriedma accepted this revision.Jul 18 2021, 2:35 PM

LGTM

Can you add a testcase for a case where both IsMemCpy and UseMemMove are true? I think you'll end up bailing out due to the alias check, but I'd like to be sure.

Wouldn't you mind if I added memcpy handling in follow-up change?

That's fine; I just wanted to make sure it didn't crash or miscompile.

This revision is now accepted and ready to land.Jul 18 2021, 2:35 PM
yurai007 updated this revision to Diff 359775.Jul 19 2021, 6:51 AM
yurai007 marked an inline comment as done.

Updated just with short FIXME information added to do_not_form_memmove4 UT.

This revision was landed with ongoing or failed builds.Jul 22 2021, 4:06 AM
This revision was automatically updated to reflect the committed changes.