This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ::TTI] Improve costs for add, sub and mul i16 against memory
Closed, Public

Authored by jonpa on Nov 27 2018, 2:27 AM.

Details

Reviewers
uweigand
Summary

AH, SH and MH costs are already covered in the cases where LHS is 32 bits and RHS is 16 bits of memory sign-extended to i32.

As these instructions are also used when LHS is i16, this patch handles that case as well, by recognizing that the loads get folded there too.
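As a rough illustration of the two cases described above, the check can be sketched in standalone C++. This is a toy model only: `Opcode`, `MemOperand`, `isFoldedHalfwordLoad` and `halfwordLoadCost` are hypothetical names made up for this sketch and are not the LLVM TargetTransformInfo API or the code in this patch. It assumes a folded load costs 0 and an unfolded one costs 1, and it ignores every other kind of load folding.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not LLVM types.
enum class Opcode { Add, Sub, Mul, Other };

struct MemOperand {
  unsigned LoadedBits; // width of the value loaded from memory
  bool SignExtended;   // true if the loaded value is sign-extended to the operation width
};

// Returns true if the halfword load is assumed to fold into AH, SH or MH.
bool isFoldedHalfwordLoad(Opcode Op, unsigned OpBits, const MemOperand &Mem) {
  // Only add, sub and mul have the halfword memory forms discussed here.
  if (Op != Opcode::Add && Op != Opcode::Sub && Op != Opcode::Mul)
    return false;
  if (Mem.LoadedBits != 16)
    return false;
  // Case that was already covered: i32 LHS, i16 memory operand sign-extended to i32.
  if (OpBits == 32 && Mem.SignExtended)
    return true;
  // Case added by this patch: i16 LHS with a plain i16 load.
  if (OpBits == 16 && !Mem.SignExtended)
    return true;
  return false;
}

// Assumed toy cost: 0 for a folded halfword load, 1 otherwise.
unsigned halfwordLoadCost(Opcode Op, unsigned OpBits, const MemOperand &Mem) {
  return isFoldedHalfwordLoad(Op, OpBits, Mem) ? 0u : 1u;
}
```

With this model, `halfwordLoadCost(Opcode::Sub, 16, MemOperand{16, false})` is 0, which corresponds to the i16 case the summary says was previously not recognized.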

This is NFC on SPEC, but it silently affects the scalar loop cost estimates (in LoopVectorizer) 26 times.

I'm not 100% sure about the implications of LHS being just 16 bits, but this seems to at least match what CodeGen is doing.

Event Timeline

jonpa created this revision. Nov 27 2018, 2:27 AM
uweigand accepted this revision. Nov 27 2018, 11:18 AM

LGTM, thanks!

This revision is now accepted and ready to land. Nov 27 2018, 11:18 AM
jonpa closed this revision. Nov 28 2018, 12:36 AM

r347734