This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner][x86] add transform/hook to vectorize: cast(extract V, Y)
Abandoned, Public

Authored by spatel on Jan 16 2019, 10:10 AM.

Details

Summary

This is a fix for PR39974:
https://bugs.llvm.org/show_bug.cgi?id=39974

I didn't see any existing TLI hooks that capture what we need to know to decide whether this is profitable, so I'm proposing a new hook that takes the source and destination types of the cast op. This is enabled for x86 only here, but any target that wants to avoid a register file back-and-forth may find this useful.

The known bits diffs suggest that we can do better at simplifying based on vector demanded elements, but I'm assuming those are not the typical patterns.
We would also likely improve things by moving shuffles ahead of the cast in the case where we are not extracting from element 0.
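
To illustrate the rewrite named in the title, here is a minimal standalone sketch of the equivalence it relies on: casting one extracted element gives the same value as casting the whole vector and then extracting that lane. This is a plain C++ model, not the DAGCombiner code from the patch. The function names, the 4 x i32 vector shape, and the int-to-float cast are assumptions chosen for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the scalar form: cast(extract V, Idx).
// On a real target the extracted element may have to move from a vector
// register to a scalar register before the conversion.
float castOfExtract(const std::array<std::int32_t, 4> &V, std::size_t Idx) {
  return static_cast<float>(V[Idx]);
}

// Hypothetical model of the vectorized form: extract(cast V, Idx).
// Every lane is converted, then the requested lane is read, so the value
// never has to leave the vector register file.
float extractOfCast(const std::array<std::int32_t, 4> &V, std::size_t Idx) {
  std::array<float, 4> Converted{};
  for (std::size_t I = 0; I < V.size(); ++I)
    Converted[I] = static_cast<float>(V[I]);
  return Converted[Idx];
}
```

Both functions return the same value for every in-range index, which is what makes the transform legal. Whether it is profitable depends on the target, and that is the question the proposed hook is meant to answer.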

Event Timeline

spatel created this revision. Jan 16 2019, 10:10 AM

Do any other backends want something like this?
@t.p.northover, @asb, @uweigand, others?

asb added a subscriber: rkruppe. Jan 24 2019, 2:27 AM

Do any other backends want something like this?
@t.p.northover, @asb, @uweigand, others?

For RISC-V, there's no vector support upstream currently (the spec is still in flux). @rkruppe may be able to comment on whether it's likely this hook would be useful.

xbolva00 added inline comments.
test/CodeGen/X86/known-signbits-vector.ll
158

Looks bad

A later, target-specific alternative to this patch is proposed in D56864.
As I mentioned in the summary, I'm not that concerned about the knownbits regression, but the other patch does sidestep that problem.

Thanks for the heads-up! This may indeed be interesting for SystemZ, but I think it's still probably preferable to do it in the back-end like your alternative approach does, that will allow us to handle some special instruction selection issues we'll likely run into ...

spatel abandoned this revision. Jan 28 2019, 6:10 AM

Thanks for the heads-up! This may indeed be interesting for SystemZ, but I think it's still probably preferable to do it in the back-end like your alternative approach does, that will allow us to handle some special instruction selection issues we'll likely run into ...

Yes, I think we'll proceed with a target-specific approach for x86; it also has some custom opcode requirements.
Abandoning.