This is an archive of the discontinued LLVM Phabricator instance.

[X86] canonicalize a v2f64 blendi of 2 registers
ClosedPublic

Authored by spatel on Feb 19 2015, 4:08 PM.

Details

Summary

This canonicalization step saves us 3 pattern matching possibilities * 4 math ops for scalar math that uses xmm regs.

The tests in llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll cover this scenario already, so I don't think we need to add any more tests.

Is it possible to add an assert to make sure we're not bypassing this canonicalization? Where would it go?

Also, I'm assuming that we don't turn this into a 'movsd' because of its partial register update, but should there be a 'FIXME' comment here when optimizing for size because 'movsd' is 2 bytes shorter?

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 20350.Feb 19 2015, 4:08 PM
spatel retitled this revision from to [X86] canonicalize a v2f64 blendi of 2 registers.
spatel updated this object.
spatel edited the test plan for this revision. (Show Details)
spatel added a subscriber: Unknown Object (MLST).
chandlerc edited edge metadata.Feb 19 2015, 4:38 PM

Yea, there is a better way to do this: Add a target-specific combine (or even post-legalize transform in X86ISelDAGToDAG which does this canonicalization. That will run right before pattern matching through the tables, and we won't need to think about it in any other place we might form BLENDI.

spatel updated this revision to Diff 20398.Feb 20 2015, 7:03 AM
spatel edited edge metadata.

I don't know how I blanked on the target-specific combine option. Sorry about that. Thanks for the pointers!

Patch updated to include a PerformBLENDICombine() that handles the canonicalization. I also added a comment about the size optimization potential of 'movsd'.

chandlerc accepted this revision.Feb 20 2015, 7:16 AM
chandlerc edited edge metadata.
chandlerc added inline comments.
lib/Target/X86/X86ISelLowering.cpp
26211 ↗(On Diff #20398)

Should also mention that the backend knows how to commute this instruction again as needed to match the register allocation.

This revision is now accepted and ready to land.Feb 20 2015, 7:16 AM
This revision was automatically updated to reflect the committed changes.