[AArch64][GlobalISel] Don't reconvert to p0 in convertPtrAddToAdd().

Authored by aemerson on Feb 3 2020, 10:32 AM.

Description

convertPtrAddToAdd improved overall code size and quality by a significant amount,
but at -O0 we generate some cross-class copies because we emit G_PTRTOINT and
G_INTTOPTR around the G_ADD. Unfortunately, at -O0 we don't run any register
coalescing, so these cross-class copies end up escaping as moves, and we ended up
regressing 3 benchmarks on CTMark (though the change was still a win overall).

This patch changes the lowering to instead emit the G_ADD directly into the
destination register, and then forcibly changes the destination's LLT from p0
to s64. This should be ok, as all users of the register should already be
selected at this point, so the LLT doesn't matter to them. It does however
matter for the imported patterns, which will fail to select a G_ADD if its
result has a p0 LLT.
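
The rewrite described above can be sketched with a toy model. This is not the
patch itself and uses no LLVM API: ToyLLT, ToyInst, ToyFunction and
convertPtrAddToAddToy are hypothetical names invented for illustration, and the
G_PTR_ADD opcode name is assumed from the function name. It only shows the
shape of the output: a G_PTRTOINT on the source, a G_ADD defining the original
destination register, and the destination's type changed from p0 to s64.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for GlobalISel concepts; none of these are LLVM types.
enum class ToyLLT { P0, S64 };

struct ToyInst {
  std::string Opcode;    // generic opcode name, e.g. "G_PTR_ADD"
  int Dst;               // virtual register defined by the instruction
  std::vector<int> Srcs; // virtual registers read by the instruction
};

struct ToyFunction {
  std::vector<ToyLLT> RegTypes; // indexed by virtual register number
  std::vector<ToyInst> Insts;

  // Create a fresh virtual register of the given type and return its number.
  int createReg(ToyLLT Ty) {
    RegTypes.push_back(Ty);
    return static_cast<int>(RegTypes.size()) - 1;
  }
};

// Rewrite the pointer add at Insts[Idx] into an integer add that defines the
// original destination register, with no G_INTTOPTR afterwards.
void convertPtrAddToAddToy(ToyFunction &F, std::size_t Idx) {
  assert(Idx < F.Insts.size());
  const ToyInst PtrAdd = F.Insts[Idx]; // copy: the vector is modified below
  assert(PtrAdd.Opcode == "G_PTR_ADD" && PtrAdd.Srcs.size() == 2);
  const int Base = PtrAdd.Srcs[0];
  const int Offset = PtrAdd.Srcs[1];

  // The source side still needs a cast from p0 to s64.
  const int BaseInt = F.createReg(ToyLLT::S64);
  const ToyInst Cast{"G_PTRTOINT", BaseInt, {Base}};

  // Emit the add directly into the destination and retype that register.
  F.Insts[Idx] = ToyInst{"G_ADD", PtrAdd.Dst, {BaseInt, Offset}};
  F.RegTypes[PtrAdd.Dst] = ToyLLT::S64;

  // Place the cast immediately before the add.
  F.Insts.insert(F.Insts.begin() + static_cast<std::ptrdiff_t>(Idx), Cast);
}
```

In this model the base register keeps its p0 type, which mirrors why the
source-side cast cannot be dropped with the same trick.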

I'm not yet able to get rid of the G_PTRTOINT on the source, however. We can't
use the same trick of breaking the type system there, since that could break the
selection of the defining instruction. Thus at -O0 we still end up with a
cross-class copy on the source.

Code size improvements at -O0:
Program                                      baseline       new    diff
test-suite :: CTMark/Bullet/bullet.test        965520    949164   -1.7%
test-suite...TMark/7zip/7zip-benchmark.test   1069456   1052600   -1.6%
test-suite...ark/tramp3d-v4/tramp3d-v4.test   1213692   1199804   -1.1%
test-suite...:: CTMark/sqlite3/sqlite3.test    421680    419736   -0.5%
test-suite...-typeset/consumer-typeset.test    837076    833380   -0.4%
test-suite :: CTMark/lencod/lencod.test        799712    796976   -0.3%
test-suite...:: CTMark/ClamAV/clamscan.test    688264    686132   -0.3%
test-suite :: CTMark/kimwitu++/kc.test        1002344    999648   -0.3%
test-suite...Mark/mafft/pairlocalalign.test    422296    421768   -0.1%
test-suite :: CTMark/SPASS/SPASS.test          656792    656532   -0.0%
Geomean difference                                                -0.6%
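
As a sanity check on the summary row, the sketch below recomputes the geomean
from the ten (baseline, new) pairs in the table. It assumes "geomean
difference" means the geometric mean of the new/baseline ratios minus one; that
definition is an assumption, not something stated in this message. The helper
names geomeanPercentChange and ctmarkSizes are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Geometric mean of new/baseline ratios, returned as a percent change.
// Each pair is (baseline, new). Assumed definition, see the note above.
double geomeanPercentChange(
    const std::vector<std::pair<double, double>> &Sizes) {
  assert(!Sizes.empty());
  double LogSum = 0.0;
  for (const auto &S : Sizes) {
    assert(S.first > 0.0 && S.second > 0.0);
    LogSum += std::log(S.second / S.first);
  }
  const double Mean = std::exp(LogSum / static_cast<double>(Sizes.size()));
  return (Mean - 1.0) * 100.0;
}

// The ten (baseline, new) code sizes copied from the table, in table order.
std::vector<std::pair<double, double>> ctmarkSizes() {
  return {{965520, 949164},   {1069456, 1052600}, {1213692, 1199804},
          {421680, 419736},   {837076, 833380},   {799712, 796976},
          {688264, 686132},   {1002344, 999648},  {422296, 421768},
          {656792, 656532}};
}
```

Under that definition, geomeanPercentChange(ctmarkSizes()) comes out at roughly
-0.64, consistent with the -0.6% reported.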

Differential Revision: https://reviews.llvm.org/D73910