This is an archive of the discontinued LLVM Phabricator instance.

[x86] Make vector legalization of extloads work more like the "normal" vector operation legalization with support for custom target lowering and fallback to expand when it fails, and use this to implement sext and anyext load lowering for x86 in a...
ClosedPublic

Authored by chandlerc on Jul 24 2014, 5:31 AM.

Details

Summary

...more principled way.

Previously, the x86 backend relied on a target DAG combine to "combine
away" sextload and extload nodes prior to legalization, or would expand
them during legalization with terrible code. This is particularly
problematic because the DAG combine relies on running over non-canonical
DAG nodes at just the right time to match several common and important
patterns. It used a combine rather than lowering because we didn't have
good lowering support, and to expose some tricks being employed to more
combine phases.

With this change it becomes a proper lowering operation, the backend
marks that it can lower these nodes, and I've added support for handling
the canonical forms that don't have direct legal representations such as
sextload of a v4i8 -> v4i64 on AVX1. With this change, our test cases
for this behavior continue to pass even after the DAG combiner beigns
running more systematically over every node.

There is some noise caused by this in the test suite where we actually
use vector extends instead of subregister extraction. This doesn't
really seem like the right code, but is unlikely to be a critical
regression. However, we do regress in one case where by lowering to the
target-specific patterns early we were able to combine away logic.
However, those regressions are completely addressed by switching to
a widening based legalization which is what I'm working toward anyways,
so I've just switched the test to that mode.

Diff Detail

Event Timeline

chandlerc updated this revision to Diff 11838.Jul 24 2014, 5:31 AM
chandlerc retitled this revision from to [x86] Make vector legalization of extloads work more like the "normal" vector operation legalization with support for custom target lowering and fallback to expand when it fails, and use this to implement sext and anyext load lowering for x86 in a....
chandlerc updated this object.
chandlerc added reviewers: filcab, grosbach, hfinkel.
chandlerc added a subscriber: Unknown Object (MLST).
hfinkel added inline comments.Jul 24 2014, 5:54 AM
lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
213

I think that an assert would be helpful here, something like:
assert(Result.getValue(1).use_empty() && "Custom lowering did not kill chain dependencies on the load?");

lib/Target/X86/X86ISelLowering.cpp
12884

I would quickly note what this canonical form is, "... the canonical form of sextload, where it is folded with any sex/sext_in_reg users, to persist..." (or whatever is actually correct).

12914

Why are you creating a TokenFactor with one operand? Can't you just use the chain directly?

chandlerc added inline comments.Jul 24 2014, 6:14 AM
lib/Target/X86/X86ISelLowering.cpp
12914

\meme{i have no idea what i'm doing}

Are you saying I should RAUW the old chain value with the chain value of the load? That makes sense if so, and I'll do so.

hfinkel added inline comments.Jul 24 2014, 6:19 AM
lib/Target/X86/X86ISelLowering.cpp
12914

(but you're learning ;) )

Yes, that's what I'm saying.

I've made the requested changes. I can post a patch if you'd like to see another version (but the delta is completely unsurprising). Any further comments here?

hfinkel edited edge metadata.Jul 24 2014, 2:54 PM

I don't need to see a follow-up patch based on my comments, and I have no further comments at this time ;)

grosbach accepted this revision.Jul 24 2014, 2:56 PM
grosbach edited edge metadata.

LGTM, too.

This revision is now accepted and ready to land.Jul 24 2014, 2:56 PM
chandlerc closed this revision.Jul 24 2014, 3:18 PM
chandlerc updated this revision to Diff 11857.

Closed by commit rL213897 (authored by @chandlerc).

Sorry for the late comment, but why we need

if (Subtarget->is64Bit()) {

check? As I remember, instructions like PMOVSXWD are supported even in 32-bit mode?

I'm asking because I have a significant regression on EEMBC1.1 for autocorrelation tests due to lots of insert-like generated instructions for SLM target under -m32. Removing this check fixes the regression.