Page MenuHomePhabricator

[InstCombine] use poison as placeholder for undemanded elems
ClosedPublic

Authored by aqjune on Dec 19 2020, 1:03 PM.

Details

Summary

Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement.
However, this is problematic because undef isn’t undefined enough.
Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison.
This makes a few straightforward optimizations incorrect, such as:

;  https://bugs.llvm.org/show_bug.cgi?id=44185 

define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %xv = insertelement <4 x float> %q, float %x, i32 2
  %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef }
  ret <4 x float> %r ; %r[3] is undef
}
=>
define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %r = insertelement <4 x float> %y, float %x, i32 1
  ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison
}

Transformation doesn't verify!
ERROR: Target is more poisonous than source

I’d like to suggest

  1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too)
  2. Updating shufflevector’s semantics to return poison element if mask is undef

Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay.
m_Undef() matches PoisonValue as well, so existing optimizations will still fire.

The only concern is hidden miscompilations that will go incorrect when poison constant is given.
A conservative way is copying all tests having insertelement undef & replacing it with insertelement poison & run Alive2 on it, but it will create many tests and people won’t like it. :(

Instead, I’ll simply locally maintain the tests and run Alive2.
If there is any bug found, I’ll report it.

Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html

Diff Detail

Event Timeline

aqjune created this revision.Dec 19 2020, 1:03 PM
aqjune requested review of this revision.Dec 19 2020, 1:03 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 19 2020, 1:03 PM

Thank you for working on this! Looking back at the bug comments (and adding reviewers based on those comments), this is a step towards killing undef that has been discussed for a long time now. :)

Besides changing IRBuilder and shufflevector's definition, I think we'll also need updates in the vectorizers, InstSimplify, and other places in InstCombine that use UndefValue for InsertElement and shuffle transforms.

A conservative way is copying all tests having insertelement undef & replacing it with insertelement poison & run Alive2 on it, but it will create many tests and people won’t like it. :(

Do you have an estimate of how many tests are out there? If it's a temporary increase and not huge, I don't think there is much downside. But if we think the transition will take a long time, then maybe we don't want the duplication.

Besides changing IRBuilder and shufflevector's definition, I think we'll also need updates in the vectorizers, InstSimplify, and other places in InstCombine that use UndefValue for InsertElement and shuffle transforms.

Thank you for the pointer!

Do you have an estimate of how many tests are out there? If it's a temporary increase and not huge, I don't think there is much downside. But if we think the transition will take a long time, then maybe we don't want the duplication.

Using this command (which counts insertelement <..> undef in non-checks):

grep -R -E '^[^;]*insertelement <.*> undef,' . | cut -d":" -f1 | uniq | wc -l

There are 792 files in llvm/test, most of which are in test/Transform (119) and test/CodeGen(655).
The transition will be swiftly done (if there's no other issue hopefully) by the next weekend.

My concern is that some tests aren't using utils/update_test_checks.py, and this makes things complicated. :(
Simply replacing insertelement undef at CHECK: isn't enough (ex: Transforms/SLPVectorizer/X86/alternate-cast.ll).
What do you think?

The transition will be swiftly done (if there's no other issue hopefully) by the next weekend.

Oops, I meant this weekend hopefully.

There are 792 files in llvm/test, most of which are in test/Transform (119) and test/CodeGen(655).
The transition will be swiftly done (if there's no other issue hopefully) by the next weekend.

Thinking about these again, do we need to make a poison copy for test/CodeGen? I don't think so, since the backend won't be changed anyway.

I'm sorry I've only just started looking at this - are you saying that you want to handle all "insertelement undef" cases in one go and not just a series of patcches after this one?

I'm sorry I've only just started looking at this - are you saying that you want to handle all "insertelement undef" cases in one go and not just a series of patcches after this one?

It will be treated in a series of patches. There are places other than SimplifyDemandedVectorElts that create insertelement undef. I'll create these and link them as children.

There are 792 files in llvm/test, most of which are in test/Transform (119) and test/CodeGen(655).
The transition will be swiftly done (if there's no other issue hopefully) by the next weekend.

Thinking about these again, do we need to make a poison copy for test/CodeGen? I don't think so, since the backend won't be changed anyway.

It would be good to update those for consistency; the codegen tests are supposed to be representative of what comes out of the IR optimizer. IIUC, we could do the substitution on those files already, and it would not change anything. But let's sort out the IR changes first?

My concern is that some tests aren't using utils/update_test_checks.py, and this makes things complicated. :(
Simply replacing insertelement undef at CHECK: isn't enough (ex: Transforms/SLPVectorizer/X86/alternate-cast.ll).

Yes, tests that don't have scripted CHECK lines require more work to understand. That SLP test file is scripted though. Is there another problem with that one?

aqjune updated this revision to Diff 313645.Dec 23 2020, 7:39 PM

Rebase after adding poison tests
(rGdb7a2f347f13 , rG303654724896)

It would be good to update those for consistency; the codegen tests are supposed to be representative of what comes out of the IR optimizer. IIUC, we could do the substitution on those files already, and it would not change anything. But let's sort out the IR changes first?

Yep, let's start with IR changes first. I made undef->poison test commits already.

Yes, tests that don't have scripted CHECK lines require more work to understand. That SLP test file is scripted though. Is there another problem with that one?

Oh, I simply meant that replacing "insertelement undef" with "insertelement poison" won't work when updating CHECK part of non-scripted tests.
It seems things are okay because there aren't too many tests needed to be fixed.

I'll make two more patches - the instsimplify/vectorizer/.... changes that make insertelement poison, and the langref update to shufflevector.

yubing added a subscriber: yubing.Dec 26 2020, 5:54 AM
nikic accepted this revision.Dec 27 2020, 1:15 PM

LGTM

llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
1072–1073

undef -> poison

This revision is now accepted and ready to land.Dec 27 2020, 1:15 PM

There are 3 more patches:

https://reviews.llvm.org/D93793 (IRBuilder's CreateVectorSplat)
https://reviews.llvm.org/D93817 (Other transformations)
https://reviews.llvm.org/D93818 (LangRef)

Would it be desirable if I land all of these at once as well as this (93586) when they are accepted, or is incrementally landing accepted patches okay?

nikic added a comment.Dec 27 2020, 2:11 PM

There are 3 more patches:

https://reviews.llvm.org/D93793 (IRBuilder's CreateVectorSplat)
https://reviews.llvm.org/D93817 (Other transformations)
https://reviews.llvm.org/D93818 (LangRef)

Would it be desirable if I land all of these at once as well as this (93586) when they are accepted, or is incrementally landing accepted patches okay?

Unless I'm missing something, this patch in particular is a pure optimization improvement, that does not change semantics, so I don't see a problem with landing it right away.

Okay, I'll gently land this.
I was just wondering whether insertelements having heterogenous placeholder would be problematic (this patch makes some of insertelement use poison, but not all), but it may not matter very much.

aqjune updated this revision to Diff 313807.Dec 27 2020, 3:57 PM

Address a comment

This revision was landed with ongoing or failed builds.Dec 27 2020, 3:58 PM
This revision was automatically updated to reflect the committed changes.