This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold extractelt with select of constants
ClosedPublic

Authored by tsymalla on Nov 14 2022, 4:11 AM.

Details

Summary

An extractelt with a constant index which extracts an element from the
two vector operands of a select can be directly folded into a select.

extractelt (select %x, %vec1, %vec2), %const ->
select %x, %vec1[%const], %vec2[%const]

Note: the implementation currently only works for constant vector operands.

Diff Detail

Event Timeline

tsymalla created this revision.Nov 14 2022, 4:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 14 2022, 4:11 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
tsymalla requested review of this revision.Nov 14 2022, 4:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 14 2022, 4:11 AM
foad added a comment.Nov 14 2022, 6:22 AM

extractelt (select %cond, <vec1>, <vec2>), %c1, %c2 ->
select %cond, <vec1>[c1], <vec2>[c2]

What are %c1 and %c2? Surely these is only one index?

Also your description does not make it clear that you are only doing this for constant vectors. For general vectors it could be: extractelt (select %cond, <vec1>, <vec2>), %c -> select %cond, (extractelt <vec1>, %c), (extractelt <vec2>, %c)

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
396

Why do you need to check this? Generally it is OK when the top-level thing has multiple uses.

403

In most case you should use an early "return nullptr" to reduce indentation.

nikic added a subscriber: nikic.

Can we extend FoldOpIntoSelect to handle this instead?

Can we extend FoldOpIntoSelect to handle this instead?

Okay.

extractelt (select %cond, <vec1>, <vec2>), %c1, %c2 ->
select %cond, <vec1>[c1], <vec2>[c2]

What are %c1 and %c2? Surely these is only one index?

Also your description does not make it clear that you are only doing this for constant vectors. For general vectors it could be: extractelt (select %cond, <vec1>, <vec2>), %c -> select %cond, (extractelt <vec1>, %c), (extractelt <vec2>, %c)

You are right, there is only one constant. For now, I'm only handling constant vectors, therefore I'll adjust the comment.

tsymalla updated this revision to Diff 475151.Nov 14 2022, 7:55 AM
tsymalla marked 2 inline comments as done.
  • Move logic to FoldOpIntoSelect.
  • Remove superfluous use check for the extractelt instruction.
  • Make use of early returns.
  • Add more / fix existing comments.
tsymalla edited the summary of this revision. (Show Details)Nov 14 2022, 7:57 AM
foad added inline comments.Nov 14 2022, 8:08 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1111

Why do these have to be ConstantInt? Surely it should work with floats too (should have a test for this) or any other kind of Constant?

This is missing at least one critical pattern-matching constraint (and regression test - because this will crash):

define i32 @extelt_vecselect_const_operand_vector(<3 x i1> %c) {
  %s = select <3 x i1> %c, <3 x i32> <i32 2, i32 3, i32 4>, <3 x i32> <i32 5, i32 6, i32 7>
  %r = extractelement <3 x i32> %s, i32 2
  ret i32 %r
}

This is missing at least one critical pattern-matching constraint (and regression test - because this will crash):

define i32 @extelt_vecselect_const_operand_vector(<3 x i1> %c) {
  %s = select <3 x i1> %c, <3 x i32> <i32 2, i32 3, i32 4>, <3 x i32> <i32 5, i32 6, i32 7>
  %r = extractelement <3 x i32> %s, i32 2
  ret i32 %r
}

Thanks!

tsymalla updated this revision to Diff 475210.Nov 14 2022, 10:51 AM
tsymalla marked an inline comment as done.

Add additional test to show folding for other types.
Use Constants instead of ConstantInt.
Conditions must be Integer types, no vector types.

tsymalla updated this revision to Diff 475217.Nov 14 2022, 10:59 AM

Update lit tests.

arsenm added a subscriber: arsenm.Nov 14 2022, 11:16 AM
arsenm added inline comments.
llvm/test/Transforms/InstCombine/extractelement.ll
816

Needs a test with an out of bounds vector index

tsymalla added inline comments.Nov 14 2022, 11:27 AM
llvm/test/Transforms/InstCombine/extractelement.ll
816

Hmm, I don't know if that makes sense. A trivial case with an extractelement with OOB behavior will not be caught by my code, as far as I can see, so it will be folded to a poison value.
I'll double-check and probably remove the OOB check in my code.

tsymalla added inline comments.Nov 15 2022, 12:03 AM
llvm/test/Transforms/InstCombine/extractelement.ll
816

Found a possible different issue with the OOB behavior:

define i32 @extelt_select_const_operand_vector_oob(i1 %c) {
  %oob = select i1 %c, i32 3, i32 4
  %s = select i1 %c, <3 x i32> <i32 2, i32 3, i32 4>, <3 x i32> <i32 5, i32 6, i32 7>
  %r = extractelement <3 x i32> %s, i32 %oob
  ret i32 %r
}

will make opt crash during constant folding.

tsymalla added inline comments.Nov 15 2022, 12:06 AM
llvm/test/Transforms/InstCombine/extractelement.ll
816

Simple OOB extractelements get optimized away before InstCombine reaches that code, and above snippet is part of a different issue, so I'll remove the OOB logic for now.

tsymalla updated this revision to Diff 475358.Nov 15 2022, 12:13 AM

Remove OOB logic. Simple cases will be handled earlier, so the relevant code
path will not be visited.

foad added inline comments.Nov 15 2022, 1:08 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1105

Don't need a cast. getAggregateElement already returns Constant.

foad added a comment.Nov 15 2022, 1:40 AM

You don't need your new function FoldExtractElementSelectConstVector at all. All you should need to do is to call FoldOpIntoSelect from visitExtractElementInst, and add (trivial) support for ExtractElementInst in foldOperationIntoSelectOperand. For example: https://reviews.llvm.org/differential/diff/475368/

tsymalla updated this revision to Diff 475394.Nov 15 2022, 2:24 AM

Remove superfluous dyn_cast for getAggregateElement.

You don't need your new function FoldExtractElementSelectConstVector at all. All you should need to do is to call FoldOpIntoSelect from visitExtractElementInst, and add (trivial) support for ExtractElementInst in foldOperationIntoSelectOperand. For example: https://reviews.llvm.org/differential/diff/475368/

Thanks. It needs to be done earlier, in constantFoldOperationIntoSelectOperand, but should nevertheless work.

tsymalla updated this revision to Diff 475413.Nov 15 2022, 3:30 AM

Removed the whole custom implementation as InstCombine is already able
to do the constant folding, but the Visitor for extractelement needs an
additional check to prevent folding the constants into select operands
when the select uses a condition vector.

foad added a comment.Nov 15 2022, 3:45 AM

This version fails assertions during check-llvm-transforms-instcombine.

tsymalla updated this revision to Diff 475443.Nov 15 2022, 6:19 AM

Fix assertion failures

This patch is more general than where it started. I'm not opposed to that, but it needs more tests.

This crashes:

define i32 @extelt_select_const_operand_vector_var_index(i1 %c, i32 %e) {
  %s = select i1 %c, <3 x i32> <i32 2, i32 3, i32 4>, <3 x i32> <i32 5, i32 6, i32 7>
  %r = extractelement <3 x i32> %s, i32 %e
  ret i32 %r
}

We can fold with a single constant select operand (the srem diff show this, but we should have minimal tests too):

define i32 @extelt_select_var_const_operand_vector(i1 %c, <3 x i32> %v) {
  %s = select i1 %c, <3 x i32> %v, <3 x i32> <i32 5, i32 6, i32 7>
  %r = extractelement <3 x i32> %s, i32 1
  ret i32 %r
}

define i32 @extelt_select_const_var_operand_vector(i1 %c, <3 x i32> %v) {
  %s = select i1 %c, <3 x i32> <i32 5, i32 6, i32 7>, <3 x i32> %v
  %r = extractelement <3 x i32> %s, i32 0
  ret i32 %r
}

Does this behave as expected when the select has another use besides the extract?

declare void @use(<3 x i32>)

define i32 @extelt_select_const_var_operands_vector_extra_use(i1 %c, <3 x i32> %x) {
  %s = select i1 %c, <3 x i32> <i32 42, i32 5, i32 4>, <3 x i32> %x
  call void @use(<3 x i32> %s)
  %r = extractelement <3 x i32> %s, i64 0
  ret i32 %r
}

define i32 @extelt_select_const_operands_vector_extra_use(i1 %c, <3 x i32> %x) {
  %s = select i1 %c, <3 x i32> <i32 42, i32 5, i32 4>, <3 x i32> <i32 5, i32 6, i32 7>
  call void @use(<3 x i32> %s)
  %r = extractelement <3 x i32> %s, i64 0
  ret i32 %r
}

@spatel Thanks for adding these tests. These work in general, except the case where the second operand of the ExtractElement instruction is not a constant. The reason is that currently constantFoldOperationIntoSelectOperand expects both operands to be either the original select or a constant, but not an additional select, so a new extractelement instruction can be created to extract the actual value. If one of the operands is another select, it could be obvious that the sequence shows OOB behavior:

%e = select i1 %c, i32 3, i32 4
%s = select i1 %c, <3 x i32> <i32 2, i32 3, i32 4>, <3 x i32> <i32 5, i32 6, i32 7>
%r = extractelement <3 x i32> %s, i32 %e
ret i32 %r

Even if I know how to make the regular case work, I'll wrap my head around that particular issue.

@spatel Thanks for adding these tests. These work in general, except the case where the second operand of the ExtractElement instruction is not a constant. The reason is that currently constantFoldOperationIntoSelectOperand expects both operands to be either the original select or a constant, but not an additional select, so a new extractelement instruction can be created to extract the actual value. If one of the operands is another select, it could be obvious that the sequence shows OOB behavior:

%e = select i1 %c, i32 3, i32 4
%s = select i1 %c, <3 x i32> <i32 2, i32 3, i32 4>, <3 x i32> <i32 5, i32 6, i32 7>
%r = extractelement <3 x i32> %s, i32 %e
ret i32 %r

Even if I know how to make the regular case work, I'll wrap my head around that particular issue.

Just bail out if the extract index value is not an immediate constant? The case where the index value is a select-of-constants that can be reduced might be another patch, but that doesn't seem like a common pattern.

@spatel Thanks for adding these tests. These work in general, except the case where the second operand of the ExtractElement instruction is not a constant. The reason is that currently constantFoldOperationIntoSelectOperand expects both operands to be either the original select or a constant, but not an additional select, so a new extractelement instruction can be created to extract the actual value. If one of the operands is another select, it could be obvious that the sequence shows OOB behavior:

%e = select i1 %c, i32 3, i32 4
%s = select i1 %c, <3 x i32> <i32 2, i32 3, i32 4>, <3 x i32> <i32 5, i32 6, i32 7>
%r = extractelement <3 x i32> %s, i32 %e
ret i32 %r

Even if I know how to make the regular case work, I'll wrap my head around that particular issue.

Just bail out if the extract index value is not an immediate constant? The case where the index value is a select-of-constants that can be reduced might be another patch, but that doesn't seem like a common pattern.

Yeah, I have that working, but it's not going to optimize such cases like in the shufflevector tests for now. I'll go with bailing out for now.

tsymalla updated this revision to Diff 476846.Nov 21 2022, 3:42 AM

Don't access unreachable path when trying to constant-fold an extractelement with a select as index operand.

This seems close to complete now. Please pre-commit the tests with baseline CHECKs as an NFC patch (no pre-commit review needed). That way, we'll just show test diffs in this patch.

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
401–403

Omit this description of a negative pattern. The formula under this line describes what we expect to happen.
One minor adjustment - c generally refers to a constant value in this code, so use some other letter for the condition:

// extractelt (select %x, <vec1>, <vec2>), %const ->
// select %x, <vec1>[%const], <vec2>[%const]
604

Remove whitespace changes, so the patch is minimal.
You could do a more extensive formatting cleanup as an NFC patch if you want.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1073

Tab settings are off. Adjust your editor to conform with existing code or use clang-format.

1166

Avoid dyn_cast if we're not using the captured value:

if (!isa<Constant>(EI->getIndexOperand()))
tsymalla updated this revision to Diff 476907.Nov 21 2022, 8:00 AM
tsymalla marked 4 inline comments as done.

Addressed review comments.

foad added inline comments.Nov 21 2022, 8:41 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1166

This whole check belongs in visitExtractElementInst, not here.

tsymalla updated this revision to Diff 476966.Nov 21 2022, 12:04 PM

Moving constant check to visitExtractElementInst

foad accepted this revision.Nov 21 2022, 10:49 PM

The code LGTM.

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
401–402

Why do you write <vec1> instead of %vec1? I found it confusing to have two different syntaxes for variables in the same line.

403–406

I don't really understand this comment. "FIXME" suggests that something is broken. What is broken? What would you like to fold extractelt (select %c, %<vec1>, %<vec2>), (select %c, %c1, %c2) into?

This revision is now accepted and ready to land.Nov 21 2022, 10:49 PM
tsymalla added inline comments.Nov 21 2022, 11:33 PM
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
401–402

I wanted to use the vector-notation <v_1, ..., v_n, but I can adjust it.

403–406

Currently, cases like

extractelt (select %c, %<vec1>, %<vec2>), (select %c, 3, 4) are not handled. If such instruction sequence leads to OOB behavior at compile time, this could be caught here.
I'm going to adjust the comment.

tsymalla edited the summary of this revision. (Show Details)Nov 21 2022, 11:51 PM
tsymalla updated this revision to Diff 477083.Nov 22 2022, 12:09 AM

Made comments more concise.

spatel accepted this revision.Nov 22 2022, 4:55 AM

LGTM

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
409

nit: use "auto *" with dyn_cast to be consistent with the surrounding code.

This revision was automatically updated to reflect the committed changes.