Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Avoid migrating existing patches. Phabricator shutdown timeline

[LoopIdiom] let the pass deal with runtime memset size
ClosedPublic

Authored by eopXD on Aug 3 2021, 7:46 AM.

Details

Summary

The current LIR does not deal with runtime-determined memset-size. This patch
utilizes SCEV and check if the PointerStrideSCEV and the MemsetSizeSCEV are equal.
Before comparison the pass would try to fold the expression that is already
protected by the loop guard.

Testcase file memset-runtime.ll, memset-runtime-debug.ll added.

This patch deals with proper loop-idiom. Proceeding patch wants to deal with SCEV-s
that are inequal after folding with the loop guards.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
eopXD added inline comments.Aug 5 2021, 7:32 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
923

NFC patch: D107570.

eopXD updated this revision to Diff 364491.Aug 5 2021, 8:19 AM
eopXD marked an inline comment as done.

Landed D107570, rebase.

eopXD updated this revision to Diff 364508.Aug 5 2021, 9:27 AM

Rebase.

eopXD updated this revision to Diff 364848.Aug 6 2021, 11:17 AM
eopXD marked 5 inline comments as done.

Address comments.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
954

isLegalStore already blocks non-zero address spaces. So this guard will never trigger and should be deleted.

958

Added memset-runtime-negative-cases.ll - @MemsetSize_LoopVariant.

959

isLegalStore already blocks non-affine SCEV for the pointer in the memset instruction. So this guard will never trigger and should be deleted.

966

Added memset-runtime.ll - @For_NegativeStride

973

Added memset-runtime-negative-cases.ll - @MemsetSize_Stride_Mismatch.

eopXD updated this revision to Diff 364934.Aug 6 2021, 11:24 PM

isLegalStore should be kept the same.

eopXD updated this revision to Diff 364966.EditedAug 7 2021, 9:18 AM

processLoopMemset does not goes through isLegalStore.
Checks and test cases added.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
954

Added memset-runtime-debug.ll - @NonZeroAddressSpace.

959

Added memset-runtime-debug.ll - @NonAffinePointer.

eopXD edited the summary of this revision. (Show Details)Aug 7 2021, 9:33 AM
Whitney added inline comments.Aug 9 2021, 6:53 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
925–978

initialize the variable?

929

Add a period at the end of a sentence.

948

Add a period at the end of a sentence.

955
// If the original StrideSCEV and MemsetSizeSCEV does not match, the pass
// will fold expressions that is covered by the loop guard at loop entry.
// The pass will compare again after the folding and proceed if equal.

That should be added in the future patch that implement that.

llvm/test/Transforms/LoopIdiom/memset-runtime-debug.ll
2

is that true? Or you manually generated this test case?
If not, please remove this line.

48

That's not ar[i]

llvm/test/Transforms/LoopIdiom/memset-runtime.ll
3

Why do we need assert for this test case?

Whitney added a subscriber: bmahjour.
bmahjour added inline comments.Aug 9 2021, 10:09 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
950

isn't -> aren't

951

handle -> handles

954

Is this part of the comment talking about what you plan to do in the future? If so please either move it to the TODO section on line 977, or put a TODO marker here.

llvm/test/Transforms/LoopIdiom/memset-runtime-debug.ll
44

int *ar is actually the first parameter in the IR below.

115

Would be useful to have the psudo-C comment like other tests.

llvm/test/Transforms/LoopIdiom/memset-runtime.ll
3

This one doesn't require asserts.

52

ar is the first parameter in the IR below

60

could you rename %1 and %2 to %n and %m and update the CHECKS?

eopXD updated this revision to Diff 365409.Aug 10 2021, 3:14 AM
eopXD marked 15 inline comments as done.

Address comments.

This revision is now accepted and ready to land.Aug 12 2021, 11:07 AM

Added some minor NFC comments.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
898–899

"with a constant size" can be removed, since the new code is trying to handle a non-constant size.

965

"MemsizeSizeSCEV" -> "MemsetSizeSCEV"

eopXD edited the summary of this revision. (Show Details)Aug 13 2021, 9:48 AM
eopXD updated this revision to Diff 366304.Aug 13 2021, 9:52 AM
eopXD marked 2 inline comments as done.

Address comments.

Thank you @lebedev.ri for guiding my huge code chunk in D104636 into this patch that is more concise.
Also thank you @Whitney, @bmahjour, @qianzhen for reviewing my code.
I am now going to commit it.

Next I would work on the case when MemsetSizeSCEV != PointerStrideSCEV. In this case, the SCEV-s can be folded given the loop entry is already protected by loop guard.

Thank you, i think this is about right for this chunk.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
930–934

Since non-constant case has to deal with this problem also, i suspect this should go away

llvm/test/Transforms/LoopIdiom/memset-runtime-debug.ll
4

It's not great that we don't check what the IR is.
At the very least i think you want FileCheck --implicit-check-not=memset %s

eopXD updated this revision to Diff 366313.Aug 13 2021, 11:00 AM
eopXD marked 2 inline comments as done.

Address comments.

Put debug output checks at very front, then IRs are checked.

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.
Can you point me at the code that does that?

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1075–1087

This doesn't look right.

  1. There absolutely should not be two almost identical functions
  2. Shouldn't this be saturating unsigned multiplication?

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.

May I ask what does IIRC mean?

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.

May I ask what does IIRC mean?

If I Remember Correctly

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.
Can you point me at the code that does that?

No code for saturation here.
I thought restricting to address space zero prevents the need of saturation.

eopXD updated this revision to Diff 366318.Aug 13 2021, 11:24 AM
eopXD marked an inline comment as done.

Address comment.
There is only one getNumBytes now.

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.
Can you point me at the code that does that?

No code for saturation here.
I thought restricting to address space zero prevents the need of saturation.

@efriedma does that sound right?

No code for saturation here.
I thought restricting to address space zero prevents the need of saturation.

@efriedma does that sound right?

I think I mentioned this on the other bug... but sounds right to me; overflow implies undefined behavior. If the amount does overflow, that means the memset is writing at least once to every byte in the address space. Among other things, that implies a store to null.

lebedev.ri accepted this revision.Aug 13 2021, 12:28 PM

No code for saturation here.
I thought restricting to address space zero prevents the need of saturation.

@efriedma does that sound right?

I think I mentioned this on the other bug... but sounds right to me; overflow implies undefined behavior. If the amount does overflow, that means the memset is writing at least once to every byte in the address space. Among other things, that implies a store to null.

Well, alrighy then.
@efriedma feel like stamping this too then?

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
930

Comment needs updating?

eopXD updated this revision to Diff 366408.Aug 14 2021, 1:00 AM

Rebase to main.

eopXD updated this revision to Diff 366412.Aug 14 2021, 1:41 AM
eopXD marked an inline comment as done.

Address comments.