This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ::TTI] Improve cost for compare of i64 with extended i32 load
ClosedPublic

Authored by jonpa on Nov 27 2018, 4:09 AM.

Details

Reviewers
uweigand
Summary

CGF/CLGF compare an i64 register with a sign-/zero-extended i32 value loaded from memory.

This patch treats such a load as foldable, so it gets a cost of 0.

NFC on benchmarks. Only 8 instruction cost queries made by the loop vectorizer change, and none of its decisions are affected.
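
To illustrate the rule described in the summary, here is a minimal standalone sketch. It does not use LLVM's classes: `LoadUse`, `ExtKind`, `UserOp`, `foldsIntoCompare64` and `loadCost` are hypothetical names made up for this example, and the non-folded cost of 1 is an assumption, not the value the real cost model returns.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for IR concepts; not LLVM's real classes.
enum class ExtKind { None, SExt, ZExt };
enum class UserOp { ICmp, Add, Other };

struct LoadUse {
  unsigned LoadedBits;   // width of the value read from memory
  ExtKind Ext;           // extension applied to the loaded value
  unsigned ExtendedBits; // width after extension (ignored when Ext == None)
  UserOp User;           // instruction that consumes the extended value
};

// True if the load is assumed to fold into a 64-bit compare:
// CGF for a sign-extended i32, CLGF for a zero-extended i32.
constexpr bool foldsIntoCompare64(const LoadUse &U) {
  return U.User == UserOp::ICmp && U.LoadedBits == 32 &&
         U.Ext != ExtKind::None && U.ExtendedBits == 64;
}

// Toy cost: 0 when the load is folded, otherwise 1 for a separate load
// (the value 1 is an assumption made for this sketch).
constexpr unsigned loadCost(const LoadUse &U) {
  return foldsIntoCompare64(U) ? 0u : 1u;
}

// Compile-time usage checks.
static_assert(loadCost(LoadUse{32, ExtKind::SExt, 64, UserOp::ICmp}) == 0,
              "sign-extended i32 load feeding an i64 compare is free");
static_assert(loadCost(LoadUse{32, ExtKind::ZExt, 64, UserOp::ICmp}) == 0,
              "zero-extended i32 load feeding an i64 compare is free");
static_assert(loadCost(LoadUse{32, ExtKind::None, 0, UserOp::ICmp}) == 1,
              "this sketch only models the extended-load case");
```

The sketch covers only the case this patch adds; other foldable patterns in the real cost model are out of its scope.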

Event Timeline

jonpa created this revision. Nov 27 2018, 4:09 AM
uweigand accepted this revision. Nov 27 2018, 11:19 AM

LGTM, thanks!

This revision is now accepted and ready to land. Nov 27 2018, 11:19 AM
jonpa closed this revision. Nov 28 2018, 1:07 AM

r347735.

Had to change the patch slightly so that the ICmp check in the Mul case also covers the newly added lines, like:

+    if (UserI->getOpcode() != Instruction::ICmp) {
+      if (LoadedBits == 16 &&
+          (SExtBits == 32 ||
+           (SExtBits == 64 && ST->hasMiscellaneousExtensions2())))
+        return true;
+      if (LoadOrTruncBits == 16)
+        return true;
+    }
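
The guarded checks in the excerpt above can be modeled as a standalone predicate. This is only a sketch of the lines shown: `halfwordLoadFolds` is a hypothetical name, `UserIsICmp` stands in for `UserI->getOpcode() == Instruction::ICmp`, `HasMiscExt2` stands in for `ST->hasMiscellaneousExtensions2()`, and the final `return false` is an assumption, since the excerpt does not show what follows the guarded block.

```cpp
#include <cassert>

// Hypothetical standalone model of the guarded checks in the excerpt.
bool halfwordLoadFolds(bool UserIsICmp, unsigned LoadedBits, unsigned SExtBits,
                       unsigned LoadOrTruncBits, bool HasMiscExt2) {
  // The 16-bit cases are only considered when the user is not a compare.
  if (!UserIsICmp) {
    // A sign-extended 16-bit load folds for a 32-bit result, or for a
    // 64-bit result when the miscellaneous-extensions-2 facility is present.
    if (LoadedBits == 16 &&
        (SExtBits == 32 || (SExtBits == 64 && HasMiscExt2)))
      return true;
    if (LoadOrTruncBits == 16)
      return true;
  }
  // Assumption: the excerpt does not show the remaining code, so this sketch
  // reports "not foldable" for everything else.
  return false;
}
```

For example, `halfwordLoadFolds(false, 16, 64, 0, true)` is true, while the same query with `UserIsICmp` set to true is false because the guard skips both checks.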