Page MenuHomePhabricator

DAG: Fold extract_vector_elt (scalar_to_vector), K to undef
Needs ReviewPublic

Authored by RKSimon on Sep 3 2018, 5:43 AM.

Details

Reviewers
arsenm
Summary

This was unconditionally folding this to the source operand,
even if the access was out of bounds. Use undef instead of
the extract is not the first element.

This helps with some cases where 3-vectors are legalized
and avoids processing the 4th component.

Avoids regressions in a future commit.

Diff Detail

Repository
rL LLVM

Event Timeline

arsenm created this revision.Sep 3 2018, 5:43 AM

Avoids regressions in a future commit.

Do you have the patch available to show the test case?

arsenm added a comment.Sep 9 2018, 6:54 PM

Avoids regressions in a future commit.

Do you have the patch available to show the test case?

I think this was for D51736, but I somehow ended up avoiding needing this to avoid regressions.

Avoids regressions in a future commit.

Do you have the patch available to show the test case?

I think this was for D51736, but I somehow ended up avoiding needing this to avoid regressions.

So would you prefer to abandon this or try to add a different test?

I think this should be done, but I have a hard time coming up with AMDGPU test cases that actually use SCALAR_TO_VECTOR

The patch has rotted and needs rewriting, but putting in an assert causes hits in these tests:

Failing Tests (16):
    LLVM :: CodeGen/AMDGPU/copy-illegal-type.ll
    LLVM :: CodeGen/AMDGPU/cvt_f32_ubyte.ll
    LLVM :: CodeGen/AMDGPU/flat_atomics_i64.ll
    LLVM :: CodeGen/AMDGPU/global_atomics_i64.ll
    LLVM :: CodeGen/AMDGPU/load-constant-i1.ll
    LLVM :: CodeGen/AMDGPU/load-constant-i16.ll
    LLVM :: CodeGen/AMDGPU/load-global-i1.ll
    LLVM :: CodeGen/AMDGPU/load-global-i16.ll
    LLVM :: CodeGen/AMDGPU/load-local-i1.ll
    LLVM :: CodeGen/AMDGPU/load-local-i16.ll
    LLVM :: CodeGen/AMDGPU/max.i16.ll
    LLVM :: CodeGen/AMDGPU/min.ll
    LLVM :: CodeGen/AMDGPU/select.f16.ll
    LLVM :: CodeGen/AMDGPU/selectcc.ll
    LLVM :: CodeGen/AMDGPU/v_madak_f16.ll
    LLVM :: CodeGen/X86/vector-intrinsics.ll

If possible please can you use one of these to show a codegen change? I tried the x86 test but didn't find anything.

TBH, I think this patch can be accepted anyhow as we know the codepath is being tested, its just annoying that it doesn't show up in test codegen....

@arsenm Are you still looking at this? I haven't been able to make a test case yet but I'm expecting PR41097 to require this as well.

@arsenm Are you still looking at this? I haven't been able to make a test case yet but I'm expecting PR41097 to require this as well.

No, I don’t have time for this now

RKSimon commandeered this revision.Mar 17 2019, 1:54 PM
RKSimon updated this revision to Diff 191043.
RKSimon edited reviewers, added: arsenm; removed: RKSimon.

Finally got a test change, and its an increase in instructions....

Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2019, 1:55 PM

I think this is a regression because at some point the high element unnecessarily goes from undef to 0:
Combining: t158: v2i16 = extract_subvector t188, Constant:i32<2>
... into: t212: v2i16 = BUILD_VECTOR t134, Constant:i16<0>

This was undef originally

RKSimon updated this revision to Diff 221327.Sep 23 2019, 7:33 AM

rebase - still need to fix the extract subvector regression