Page MenuHomePhabricator

fix a logic bug in x86 vector codegen: sext (zext (x) ) != sext (x) (PR20472)

Authored by spatel on Aug 14 2014, 10:00 AM.



[resending as new patch because I forgot to add llvm-commits initially]

As shown in InstCombine: sext ( zext ( x ) ) -> zext ( x ).

The code in X86ISelLowering that this patch proposes to remove mistakenly implemented the transform as:
sext ( zext ( x ) ) -> sext ( x )

Ie, what should be a zext output was turned into a sext.

I want to believe that the logic in the original patch has some value as an optimization for some other case and it's just not in the right place here. But the test cases from:
don't provide any evidence.

We just have miscompile bugs: (please see this one for more of my analysis)

The testcases that I've added here confirm that we (1) don't remove a zext op that is necessary and (2) generate a pmovz instead of punpck if SSE4.1 is available. Although pmovz is 1 byte longer, it allows folding of the load, and so saves 3 bytes overall.

We don't need to call LowerVectorIntExtend() from LowerSIGN_EXTEND_INREG() to get this codegen either - that is already handled in NormalizeVectorShuffle().

Diff Detail


Event Timeline

spatel updated this revision to Diff 12516.Aug 14 2014, 10:00 AM
spatel retitled this revision from to fix a logic bug in x86 vector codegen: sext (zext (x) ) != sext (x) (PR20472).
spatel updated this object.
spatel edited the test plan for this revision. (Show Details)
spatel added reviewers: nadav, chandlerc, hliao, rafael.
spatel added a subscriber: Unknown Object (MLST).

Ping 2.

This is a miscompile that apparently occurs in the wild.

Easy review - the fix doesn't add any lines of code other than comments. :)

nadav edited edge metadata.Aug 28 2014, 11:06 AM


Sanjay, you can commit changes like this without waiting for pre-commit code review (See item #3 in


spatel closed this revision.Aug 28 2014, 12:08 PM
spatel updated this revision to Diff 13051.

Closed by commit rL216679 (authored by @spatel).

Thanks, Nadav!
Checked in at r216679.