This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Look through bitcast if possible
Needs RevisionPublic

Authored by ruiling on Nov 16 2022, 10:41 PM.

Details

Summary

When trying to simplify instruction based on demanded vector elements,
we are able to look through bitcast that does not change number of
vector elements.

Diff Detail

Event Timeline

ruiling created this revision.Nov 16 2022, 10:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2022, 10:41 PM
ruiling requested review of this revision.Nov 16 2022, 10:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2022, 10:41 PM
arsenm added inline comments.Nov 16 2022, 11:11 PM
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
576–578

Is there not a more convenient way in PatternMatch to check for matching fixed elements?

ruiling added inline comments.Nov 17 2022, 5:48 AM
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
576–578

Sorry I don't have a good idea to simplify the code. Please let me know if you have a good suggestion.

foad added a subscriber: spatel.Nov 22 2022, 6:21 AM
foad added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
369

It might be neater to handle BitCast in findDemandedEltsBySingleUser instead. In either case, you're introducing recursion, so perhaps it should have a depth limit.

571

It took me a long time to understand this part of the patch. It is handling cases like this:

%var = call ... // has multiple uses
%var1 = bitcast <4 x i32> %var to <4 x float> // has single use
%var2 = extractelement <4 x float> %var1, i64 0

If the operand of extractelement itself had multiple uses, then we would handle it correctly here by calling findDemandedEltsByAllUsers. But because it is wrapped inside a single-use bitcast, we call SimplifyDemandedVectorElts directly, which recurses through the bitcast but does not handle multiple uses at the recursive step.

Really all this code feels like a bit of a hack that we only call SimplifyDemandedVectorElts in certain places (when we think there's a good chance that not all elements are demanded) and we only handle multiple uses in certain places.

@spatel do you have any thoughts about this?

arsenm added inline comments.Nov 22 2022, 7:46 AM
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
576–578

dyn_cast instead of isa+cast, but There probably should be a helper to check this somewhere

spatel added a subscriber: RKSimon.Nov 22 2022, 7:58 AM
spatel added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
369

Improving findDemandedEltsBySingleUser() sounds like a good idea.

@RKSimon might have more to say about that. There may be a codegen equivalent that we can use as a guideline.

llvm/test/Transforms/InstCombine/AMDGPU/demanded-vector-elts-multi-user.ll
4

I can't tell what changes are expected from this patch. The patch is target-independent, so the tests should also be target-independent and minimized as much as possible.

arsenm requested changes to this revision.Jun 7 2023, 5:25 AM
arsenm added inline comments.
llvm/test/Transforms/InstCombine/AMDGPU/demanded-vector-elts-multi-user.ll
18

Can you post patch against pre-commited base tests?

This revision now requires changes to proceed.Jun 7 2023, 5:25 AM