Previously, when given (sub 0, load), we preferred to fold the load into the sub and materialize the 0 in a register. I think we should instead use neg and do the load as a separate instruction.
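For reference, a minimal C sketch of source that would produce a (sub 0, load) node. The function name `negate_loaded` is made up for illustration, and the assembly in the comments is hand-written (AT&T syntax, x86-64 SysV calling convention assumed), not output captured from any compiler.

```c
#include <stdint.h>

/* Hypothetical example: negating a value loaded from memory.
 * In the DAG this is (sub 0, (load p)).
 *
 * Folding the load (the previous preference) needs a zero in a register:
 *     xorl  %eax, %eax
 *     subl  (%rdi), %eax
 *
 * Using neg with a separate load (the proposal):
 *     movl  (%rdi), %eax
 *     negl  %eax
 */
int32_t negate_loaded(const int32_t *p) {
    return 0 - *p;
}
```

Both sequences are two instructions; the difference is whether the memory operand is folded into the arithmetic instruction or loaded on its own.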
Event Timeline
I'll run our benchmark list. This was more of an observation that our codegen here differs from icc, gcc, and msvc.
There's a related question: given the option of promoting an (i16 sub 0, load) to 32 bits, should we promote and use neg+movzwl, or keep it at 16 bits so we can fold the load?
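To make the 16-bit question concrete, here is a rough C sketch. The name `negate_loaded16` is hypothetical, and the assembly in the comments is again hand-written for illustration (AT&T syntax, x86-64 SysV assumed), not actual compiler output.

```c
#include <stdint.h>

/* Hypothetical example: (i16 sub 0, (load p)).
 *
 * Promote to 32 bits, zero-extending load plus neg:
 *     movzwl (%rdi), %eax
 *     negl   %eax
 *
 * Keep it at 16 bits and fold the load:
 *     xorl   %eax, %eax
 *     subw   (%rdi), %ax
 *
 * Only the low 16 bits of the result are used, and they are the same
 * in both sequences.
 */
int16_t negate_loaded16(const int16_t *p) {
    return (int16_t)(0 - *p);
}
```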