This is an archive of the discontinued LLVM Phabricator instance.

[LoopIdiom] let the pass deal with runtime memset size
ClosedPublic

Authored by eopXD on Aug 3 2021, 7:46 AM.

Details

Summary

The current LIR does not deal with runtime-determined memset-size. This patch
utilizes SCEV and check if the PointerStrideSCEV and the MemsetSizeSCEV are equal.
Before comparison the pass would try to fold the expression that is already
protected by the loop guard.

Testcase file memset-runtime.ll, memset-runtime-debug.ll added.

This patch deals with proper loop-idiom. Proceeding patch wants to deal with SCEV-s
that are inequal after folding with the loop guards.

Diff Detail

Event Timeline

eopXD created this revision.Aug 3 2021, 7:46 AM
eopXD requested review of this revision.Aug 3 2021, 7:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2021, 7:46 AM
eopXD updated this revision to Diff 363752.Aug 3 2021, 8:24 AM

Fix clang-format warnings.

eopXD updated this revision to Diff 363764.Aug 3 2021, 9:07 AM

Fix more clang-format warning.

eopXD edited the summary of this revision. (Show Details)Aug 3 2021, 9:45 AM

Thanks.
I would also like to ask to split this up further.
In particular, SCEVExprConverter looks scary.
Can we handle PositiveStrideSCEV != MemsetSizeSCEV case later?

eopXD updated this revision to Diff 363787.Aug 3 2021, 9:59 AM

Address comments.

Abort optimization if PositiveStrideSCEV != MemsetSizeSCEV.
Added TODO and will deal with it later.

eopXD edited the summary of this revision. (Show Details)Aug 3 2021, 10:03 AM
eopXD added a comment.Aug 3 2021, 10:17 AM

In particular, SCEVExprConverter looks scary.

I agree that adding a bug chunk is quite scary.
Its purpose is to reduce SCEV-s for comparison (convert sext to zext or fold smax constants)
I think most expression cases in the converter is not touched (as LIR is dealing with loops here).

However I currently can't think of a more simpler structure though, do you have any suggestions?
(Or is it acceptable since its a local structure in LIR?)

It might be good do change the function signatures (to take the ScalarEvolution argument, to take a SCEV instead of unsigned) in a preparatory NFC patch.
Some more catch-all inline comments..

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
220

Can this be changed in a preparatory NFC patch?

785–787

Can this be changed in a preparatory NFC patch?

917–918

Can you make this change globally trouough the file in a preparatory NFC patch?

(Though, somehow it feels conceptually wrong to pass it into processLoopStridedStore().)

949

Negative test missing

953

Negative test missing

954

Negative test missing

961

Test missing

968

Test missing

993

Can this be changed in a preparatory NFC patch?

eopXD added a comment.Aug 4 2021, 2:12 AM

NFC patch for changing StoreSize from unsigned to SCEV: D104595

eopXD updated this revision to Diff 364156.Aug 4 2021, 9:22 AM

Rebase to main HEAD.

eopXD updated this revision to Diff 364380.Aug 5 2021, 1:33 AM

Landed D104595, rebase.

eopXD marked 3 inline comments as done.Aug 5 2021, 1:34 AM
eopXD added inline comments.Aug 5 2021, 7:32 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
917–918

NFC patch: D107570.

eopXD updated this revision to Diff 364491.Aug 5 2021, 8:19 AM
eopXD marked an inline comment as done.

Landed D107570, rebase.

eopXD updated this revision to Diff 364508.Aug 5 2021, 9:27 AM

Rebase.

eopXD updated this revision to Diff 364848.Aug 6 2021, 11:17 AM
eopXD marked 5 inline comments as done.

Address comments.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
949

isLegalStore already blocks non-zero address spaces. So this guard will never trigger and should be deleted.

953

Added memset-runtime-negative-cases.ll - @MemsetSize_LoopVariant.

954

isLegalStore already blocks non-affine SCEV for the pointer in the memset instruction. So this guard will never trigger and should be deleted.

961

Added memset-runtime.ll - @For_NegativeStride

968

Added memset-runtime-negative-cases.ll - @MemsetSize_Stride_Mismatch.

eopXD updated this revision to Diff 364934.Aug 6 2021, 11:24 PM

isLegalStore should be kept the same.

eopXD updated this revision to Diff 364966.EditedAug 7 2021, 9:18 AM

processLoopMemset does not goes through isLegalStore.
Checks and test cases added.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
949

Added memset-runtime-debug.ll - @NonZeroAddressSpace.

954

Added memset-runtime-debug.ll - @NonAffinePointer.

eopXD edited the summary of this revision. (Show Details)Aug 7 2021, 9:33 AM
Whitney added inline comments.Aug 9 2021, 6:53 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
920–973

initialize the variable?

924

Add a period at the end of a sentence.

943

Add a period at the end of a sentence.

950
// If the original StrideSCEV and MemsetSizeSCEV does not match, the pass
// will fold expressions that is covered by the loop guard at loop entry.
// The pass will compare again after the folding and proceed if equal.

That should be added in the future patch that implement that.

llvm/test/Transforms/LoopIdiom/memset-runtime-debug.ll
1 ↗(On Diff #364966)

is that true? Or you manually generated this test case?
If not, please remove this line.

47 ↗(On Diff #364966)

That's not ar[i]

llvm/test/Transforms/LoopIdiom/memset-runtime.ll
3

Why do we need assert for this test case?

Whitney added a subscriber: bmahjour.
bmahjour added inline comments.Aug 9 2021, 10:09 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
945

isn't -> aren't

946

handle -> handles

949

Is this part of the comment talking about what you plan to do in the future? If so please either move it to the TODO section on line 977, or put a TODO marker here.

llvm/test/Transforms/LoopIdiom/memset-runtime-debug.ll
43 ↗(On Diff #364966)

int *ar is actually the first parameter in the IR below.

114 ↗(On Diff #364966)

Would be useful to have the psudo-C comment like other tests.

llvm/test/Transforms/LoopIdiom/memset-runtime.ll
3

This one doesn't require asserts.

52

ar is the first parameter in the IR below

60

could you rename %1 and %2 to %n and %m and update the CHECKS?

eopXD updated this revision to Diff 365409.Aug 10 2021, 3:14 AM
eopXD marked 15 inline comments as done.

Address comments.

This revision is now accepted and ready to land.Aug 12 2021, 11:07 AM

Added some minor NFC comments.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
895

"with a constant size" can be removed, since the new code is trying to handle a non-constant size.

960

"MemsizeSizeSCEV" -> "MemsetSizeSCEV"

eopXD edited the summary of this revision. (Show Details)Aug 13 2021, 9:48 AM
eopXD updated this revision to Diff 366304.Aug 13 2021, 9:52 AM
eopXD marked 2 inline comments as done.

Address comments.

Thank you @lebedev.ri for guiding my huge code chunk in D104636 into this patch that is more concise.
Also thank you @Whitney, @bmahjour, @qianzhen for reviewing my code.
I am now going to commit it.

Next I would work on the case when MemsetSizeSCEV != PointerStrideSCEV. In this case, the SCEV-s can be folded given the loop entry is already protected by loop guard.

Thank you, i think this is about right for this chunk.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
925–929

Since non-constant case has to deal with this problem also, i suspect this should go away

llvm/test/Transforms/LoopIdiom/memset-runtime-debug.ll
3 ↗(On Diff #366304)

It's not great that we don't check what the IR is.
At the very least i think you want FileCheck --implicit-check-not=memset %s

eopXD updated this revision to Diff 366313.Aug 13 2021, 11:00 AM
eopXD marked 2 inline comments as done.

Address comments.

Put debug output checks at very front, then IRs are checked.

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.
Can you point me at the code that does that?

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
1069–1095

This doesn't look right.

  1. There absolutely should not be two almost identical functions
  2. Shouldn't this be saturating unsigned multiplication?

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.

May I ask what does IIRC mean?

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.

May I ask what does IIRC mean?

If I Remember Correctly

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.
Can you point me at the code that does that?

No code for saturation here.
I thought restricting to address space zero prevents the need of saturation.

eopXD updated this revision to Diff 366318.Aug 13 2021, 11:24 AM
eopXD marked an inline comment as done.

Address comment.
There is only one getNumBytes now.

Actually, we've talked about not needing the overflow check, but IIRC just saturating the number of bytes.
Can you point me at the code that does that?

No code for saturation here.
I thought restricting to address space zero prevents the need of saturation.

@efriedma does that sound right?

No code for saturation here.
I thought restricting to address space zero prevents the need of saturation.

@efriedma does that sound right?

I think I mentioned this on the other bug... but sounds right to me; overflow implies undefined behavior. If the amount does overflow, that means the memset is writing at least once to every byte in the address space. Among other things, that implies a store to null.

lebedev.ri accepted this revision.Aug 13 2021, 12:28 PM

No code for saturation here.
I thought restricting to address space zero prevents the need of saturation.

@efriedma does that sound right?

I think I mentioned this on the other bug... but sounds right to me; overflow implies undefined behavior. If the amount does overflow, that means the memset is writing at least once to every byte in the address space. Among other things, that implies a store to null.

Well, alrighy then.
@efriedma feel like stamping this too then?

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
925

Comment needs updating?

eopXD updated this revision to Diff 366408.Aug 14 2021, 1:00 AM

Rebase to main.

eopXD updated this revision to Diff 366412.Aug 14 2021, 1:41 AM
eopXD marked an inline comment as done.

Address comments.