This is an archive of the discontinued LLVM Phabricator instance.

[DSE] Add value type checks for masked store candidates in Dead Store Elimination
ClosedPublic

Authored by mcberg2021 on Aug 25 2022, 12:53 PM.

Details

Summary

The types of the store values can diverge when checking for valid mask store candidates to eliminate via DSE.

Diff Detail

Event Timeline

mcberg2021 created this revision.Aug 25 2022, 12:53 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 25 2022, 12:53 PM
mcberg2021 requested review of this revision.Aug 25 2022, 12:53 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 25 2022, 12:53 PM
mcberg2021 retitled this revision from Add value type checks for masked store candidates in DSE to [DSE] Add value type checks for masked store candidates in Dead Store Elimination.Aug 25 2022, 8:20 PM
StephenFan added inline comments.Aug 27 2022, 1:38 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
250

Is it possible to relax the comparison from type to type size?

rui.zhang accepted this revision.Aug 29 2022, 11:33 AM

Your change looks great to me. Thanks for improving this case.

This revision is now accepted and ready to land.Aug 29 2022, 11:33 AM
nikic added a subscriber: nikic.Aug 29 2022, 11:36 AM
nikic added inline comments.
llvm/test/Transforms/DeadStoreElimination/masked-dead-store-no-merge.ll
3 ↗(On Diff #455690)

-tbaa and datalayout should not be necessary here.

fhahn added a subscriber: fhahn.Aug 30 2022, 1:03 PM
fhahn added inline comments.
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
245

I am not sure I understand the meaning of the comment. Could you re-phrase?

250

Yeah it would be good to check the stored sizes instead of requiring the types to match. I think we should remove one store if both masked stores are flipped.

llvm/test/Transforms/DeadStoreElimination/masked-dead-store-no-merge.ll
11 ↗(On Diff #455690)

Could you instead extend the existing masked store test? llvm/test/Transforms/DeadStoreElimination/masked-dead-store.ll

Also, it would be good to pre-commit the test separately and only include the changed lines caused by the patch in this diff.

mcberg2021 added inline comments.Aug 30 2022, 2:23 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
250

Well, we have an ulterior motive, VLA uses VL with type to constrain size.

fhahn added inline comments.Aug 31 2022, 1:08 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
250

Right, so it sounds like this should also 1) have tests with scalable vectors and 2) different handling for scalable vs non-scalable vectors?

mcberg2021 added inline comments.Aug 31 2022, 3:18 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
250

We would like to keep this patch to just the fixed non-scalable vectors like in this example. I can add the type size and I think element count would also be useful, as it is within context of this intrinsic.

craig.topper added inline comments.Aug 31 2022, 6:24 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
250

@fhahn what does "flipped" mean in your comment. "I think we should remove one store if both masked stores are flipped."

I think there was a confusion about was meant by scalable here. When @mcberg2021 said scalable that was referring to a different intrinsic, vp_store, that we will also need to handle here eventually. When @fhahn said scalable that was referring to this intrinsic with a scalable vector type.

My suggestion is to check getElementCount and that getScalarSizeInBits for the type match. That will cover both fixed and scalable.

Alternatively, since the masks are the same, the element count is guaranteed the same. So we could check only that getPrimitiveSizeInBits() is the same.

mcberg2021 added inline comments.Aug 31 2022, 6:26 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
250

Query: What does "flipped" mean?

Right now I think we cover both fixed and scalable for masked_store with a type size check and ElementCount check being the same for both stores as is it precludes mixing fixed and scalable and will work for either scenario.

mcberg2021 added inline comments.Sep 1 2022, 11:45 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
250

I'm going to venture a reasonable guess for flipped: It's the test example turned around from:

tail call void @llvm.masked.store.v4i32.p0(<4 x i32> %v2, ptr %a, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
tail call void @llvm.masked.store.v4i8.p0(<4 x i8> %v1, ptr %a, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)

to

tail call void @llvm.masked.store.v4i8.p0(<4 x i8> %v1, ptr %a, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
tail call void @llvm.masked.store.v4i32.p0(<4 x i32> %v2, ptr %a, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)

where if you applied a mask to supply the superset for the second store it would be : 1111 1111 1111 1111
and the first store would accordingly be a well formed subset that fits under it: 0001 0001 0001 0001

meaning the store is fully contained in a constant mask known at compile time.

fhahn added inline comments.Sep 2 2022, 1:10 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
250

Yes that's exactly what I had in mind!

mcberg2021 updated this revision to Diff 458806.EditedSep 8 2022, 11:15 AM

Updated with pre-commit test and changes we discussed.

fhahn added inline comments.Sep 8 2022, 1:20 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
252

This should be fine if KillingTy->getScalarSizeInBits >= DeadTy->getScalarSizeInBits(), right? I might have missed it, but t looks like there's no test case like the one @mcberg2021 mentioned above?

mcberg2021 added inline comments.Sep 8 2022, 1:51 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
252

That too should work. I will update for the subsumption case. However I would like to leave the mask decomposition support for a later change.

craig.topper added inline comments.Sep 8 2022, 1:57 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
252

If the mask is unknown then the store is not writing contiguous bytes. There can be gaps. So it is not sufficient to check that scalar size is larger.

fhahn accepted this revision.Sep 9 2022, 2:10 AM

LGTM, thanks! Please add the extra test case as discussed separately.

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
252

Yeah good point, we will need to combine this with the mask analysis. Still good to add the extra test coverage. The new test seems to have constant masks, it would be good to also have variants where the mask is unknown (e.g. arg to function)

mcberg2021 updated this revision to Diff 459216.Sep 9 2022, 3:55 PM

Test added as requested.