This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT
ClosedPublic

Authored by RKSimon on Aug 7 2019, 8:40 AM.

Details

Summary

This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract.

In particular this helps remove some unnecessary scalar->vector->scalar patterns.

@tlively The wasm shift patterns are annoying - we scalarize the vector masking of the shift amounts due to something to do with the promotion of the i16 elements to i32 - the extractions from the <8 x i16> create i32 scalars with undefined upper bits. A hack would be to only do this if we don't have implicit extension/truncation - what do you think?

Diff Detail

Event Timeline

RKSimon created this revision.Aug 7 2019, 8:40 AM

I should add - limiting this to EXTRACT_VECTOR_ELT that has no implicit extension loses the ARM/AARCH64 changes as well as avoiding the WEBASSEMBLY issues.

@tlively Another thing we could consider is adding WebAssemblyISD opcodes for zero extending extract vector element - similar to what X86ISD does with PEXTRW/PEXTRB - this would avoid the need for the (and, extract_element(v, c), 0xFFFF) pattern that is causing all the problems.

tlively accepted this revision.Aug 7 2019, 1:41 PM

@tlively Another thing we could consider is adding WebAssemblyISD opcodes for zero extending extract vector element - similar to what X86ISD does with PEXTRW/PEXTRB - this would avoid the need for the (and, extract_element(v, c), 0xFFFF) pattern that is causing all the problems.

Don't worry about bad shift codegen for now. We will actually be removing the scalarization of shifts entirely soon, since we were only ever scalarizing to get around a V8 bug with the shifts. I'll keep in mind the PEXTRW/PEXTRB example though, since I'm sure that pattern comes up elsewhere.

WebAssembly changes LGTM.

This revision is now accepted and ready to land.Aug 7 2019, 1:41 PM

Thanks @tlively - is everyone else OK with the arm/x86 changes?

X86 looks ok to me

This revision was automatically updated to reflect the committed changes.
RKSimon reopened this revision.Aug 13 2019, 3:48 AM

Reopening - rL368276 was reverted at rL368660 due to msan issue (PR42982)

This revision is now accepted and ready to land.Aug 13 2019, 3:48 AM
RKSimon planned changes to this revision.Aug 13 2019, 3:48 AM
RKSimon updated this revision to Diff 236171.Jan 4 2020, 4:54 AM

rebase now that the PR42982 issue has been fixed

This revision is now accepted and ready to land.Jan 4 2020, 4:54 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Jan 24 2020, 6:45 AM

Heads-up: We're again seeing msan reports after this landed, https://bugs.chromium.org/p/chromium/issues/detail?id=1045291

Not clear yet if they are false positives, but given the history of this patch it's possible. I'll look some more.

Heads-up: We're again seeing msan reports after this landed, https://bugs.chromium.org/p/chromium/issues/detail?id=1045291

Not clear yet if they are false positives, but given the history of this patch it's possible. I'll look some more.

More likely SimplifyDemandedBits is managing to expose an existing issue.

I made a reduced repro: https://bugs.llvm.org/show_bug.cgi?id=44652

As far as I can tell, there's no uninitialized read in that code.