This is an archive of the discontinued LLVM Phabricator instance.

[SROA] Try harder to find a vector promotion viable type when rewriting
ClosedPublic

Authored by vangthao on Jun 17 2022, 10:47 AM.

Details

Summary

We are seeing significant performance loss when an alloca fails to get promoted
to register. I have observed that this is due to the common type found when
attempting to rewrite partition users being unviable for promotion. While if we
would have continue looking for a type, we would have found a subtype in the
original allocated type that would have enabled promotion. Thus first check if
the initial common type found is promotion viable and if not then continue
looking instead of stopping with the initial common type found.

Diff Detail

Event Timeline

vangthao created this revision.Jun 17 2022, 10:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2022, 10:47 AM
vangthao requested review of this revision.Jun 17 2022, 10:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2022, 10:48 AM
arsenm added inline comments.Jun 17 2022, 11:03 AM
llvm/test/CodeGen/AMDGPU/sroa-common-type-fail-promotion.ll
2 ↗(On Diff #437969)

test should go in test/Transforms/SROA, not CodeGen

vangthao updated this revision to Diff 437975.Jun 17 2022, 11:09 AM

Move test to test/Transforms/SROA

arsenm added inline comments.Jun 27 2022, 9:31 AM
llvm/lib/Transforms/Scalar/SROA.cpp
4268–4269

Using isVectorTy + VectorType is ugly. Use dyn_cast for the type check too?

nikic added a comment.Jun 27 2022, 9:57 AM

I'm not sure I understand how this change works. isVectorPromotionViable() doesn't seem to use SliceTy at all, it computes its own candidate types. Why does the picked slice type matter for vector promotion?

llvm/test/Transforms/SROA/sroa-common-type-fail-promotion.ll
3

-mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 is likely unnecessary here. -opaque-pointers can also be dropped (it's the default).

21

The address spaces don't look essential for the test and add a lot of noise -- can probably be dropped.

I'm not sure I understand how this change works. isVectorPromotionViable() doesn't seem to use SliceTy at all, it computes its own candidate types. Why does the picked slice type matter for vector promotion?

The picked slice type matters because rewritePartition() will attempt to convert the slice users to that type. This won't matter for isVectorPromotionViable() during the first pass over the alloca partition but during the second pass. The second pass over the alloca partition, isVectorPromotionViable() will check the type of any loads or stores that are the exact size of the partition. If we were able to convert those loads and stores into our picked slice type of the first iteration, then it will be used here.

vangthao updated this revision to Diff 442654.Jul 6 2022, 11:54 AM

Addressed review comments.

arsenm added inline comments.Jul 11 2022, 12:20 PM
llvm/lib/Transforms/Scalar/SROA.cpp
4268

You didn't re-use the dyn_cast value

vangthao updated this revision to Diff 445138.Jul 15 2022, 2:47 PM

Reuse dyn_cast<VectorType> value for SliceTy and TypePartionTy.

arsenm added inline comments.Jul 16 2022, 7:10 AM
llvm/lib/Transforms/Scalar/SROA.cpp
1854

Lowercase Check

llvm/test/Transforms/SROA/sroa-common-type-fail-promotion.ll
25

If you're dropping the target and address spaces, should also replace the amdgcn intrinsics with something else (maybe just regular loads should suffice)

arsenm added inline comments.Jul 16 2022, 7:24 AM
llvm/lib/Transforms/Scalar/SROA.cpp
4268–4278

From the flow here, it looks like it's trying to split vectors before arrays. Would this work better if moved after the array handling? Can you add some additional tests with structs and arrays of vectors?

vangthao updated this revision to Diff 446888.Jul 22 2022, 10:05 AM

Add more test cases and fix function name.

vangthao updated this revision to Diff 447443.Jul 25 2022, 1:03 PM

Add more tests and change some test names to what they are testing.

vangthao updated this revision to Diff 447903.Jul 26 2022, 6:07 PM

Update a test and move checkVectorTypeForPromotion check after array handling.

arsenm added inline comments.Aug 1 2022, 12:18 PM
llvm/test/Transforms/SROA/sroa-common-type-fail-promotion.ll
339–343

I'm not seeing why these allocas were not eliminated

vangthao added inline comments.Aug 1 2022, 1:49 PM
llvm/test/Transforms/SROA/sroa-common-type-fail-promotion.ll
339–343

My change only works if there is a common type found and that common type happens to be a vectortype. In the other tests the common vectortype was found from a store <4 x float> ... instruction. If we remove such instruction then there is no common vectortype and my changes to look at the original allocated type and vector promotion check is not enabled.

From what I have observed, SROA fails to promote these allocas because it found common slicety of float while there are load halfs. This causes some offset issues and failure to promote when visitLoadInst() is called. If we change float type to half type by adjusting the stores to also be half type then we would have no issue promoting the allocas.

bcahoon added a subscriber: bcahoon.Aug 4 2022, 6:49 AM

Is anything else needed for this patch or any additional review comments?

arsenm accepted this revision.Aug 5 2022, 7:28 PM

LGTM

llvm/test/Transforms/SROA/sroa-common-type-fail-promotion.ll
339–343

Can you look into handling these cases in a follow on patch?

This revision is now accepted and ready to land.Aug 5 2022, 7:28 PM
This revision was landed with ongoing or failed builds.Aug 8 2022, 11:04 AM
This revision was automatically updated to reflect the committed changes.