Page MenuHomePhabricator

[DAGCombiner][X86] When promoting loads don't use ZEXTLOAD even its legal

Authored by craig.topper on Apr 12 2018, 12:36 PM.



We were previously prefering ZEXTLOAD over EXTLOAD if it is legal. This triggers during X86's promotion of i16->i32. Not sure about other targets.

Using ZEXTLOAD can prevent folding it to SEXTLOAD later if we were to promote a sign extended operand like we would need for SRA. However, X86 doesn't currently promote i16 SRA. I was looking into doing that which is how I found this issue.

This is also blocking our ability to fold 4 byte aligned EXTLOADs with "loadi32". This is what caused most of the test changes here.

There are couple tests where the instruction counts went up. I need to look closer at those.

Diff Detail


Event Timeline

craig.topper created this revision.Apr 12 2018, 12:36 PM
craig.topper added inline comments.Apr 12 2018, 12:40 PM
16 ↗(On Diff #142232)

There's some oddly repeated code in here. It looks like we managed to duplicate the load.

14 ↗(On Diff #142232)

Somethig odd happened here. There are 2 identical ands.

craig.topper added inline comments.Apr 12 2018, 1:01 PM
16 ↗(On Diff #142232)

Nevermind, we didn't replicate it we just failed to combine the two logic cones.

It seems DAG combine doesn't know these properties which accounts for some of the redundancies seen in the test output. We can either improve DAG combine or just trust that InstCombine would have prevented this in the first place

(A&B)&A -> A & B
(A^B)^A -> B

craig.topper accepted this revision.Apr 24 2018, 1:48 PM

approving based on Zvi's LGTM. Assuming he forgot to Accept.

This revision is now accepted and ready to land.Apr 24 2018, 1:48 PM
RKSimon added inline comments.Apr 24 2018, 3:12 PM
75 ↗(On Diff #142232)

These are likely to be performance regressions on most targets - the special case x86 'MulConstantOptimization' patterns in combineMul should still be used where possible.

craig.topper added inline comments.Apr 24 2018, 3:20 PM
75 ↗(On Diff #142232)

Agreed. But that issue wasn't caused by this patch. It's pretty easy to hit with 32-bit instructions too.

This revision was automatically updated to reflect the committed changes.