Page MenuHomePhabricator

[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only

Authored by aemerson on Apr 11 2019, 1:55 PM.



Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded.

This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it.

I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0.

Compile time:
Program                                        base   cse    diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test     9.04   9.12  0.8%
test-suite...Mark/mafft/pairlocalalign.test     2.68   2.66 -0.7%
test-suite...-typeset/consumer-typeset.test     5.53   5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test         5.30   5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test        25.82  25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test     6.92   6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test    34.24  34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test           6.25   6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test     1.66   1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test         13.61  13.60 -0.0%
Geomean difference                                          -0.2%

Code size:
Program                                        base     cse      diff
test-suite...-typeset/consumer-typeset.test    1315632  1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test    1313892  1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test        1439504  1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test    2936980  2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test        3478276  3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test    8082868  8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test         3870380  3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test          1434904  1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test    764528   764528   0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test    782092   782092   0.0%
Geomean difference                                              -0.9%

Diff Detail


Event Timeline

aemerson created this revision.Apr 11 2019, 1:55 PM

Looks mostly good - there are some minor comments from my side.

218 ↗(On Diff #194748)

Nit : {} not necessary

1228 ↗(On Diff #194748)

Maybe we should add a hook so targets can return the appropriate CSEConfigs for the opt level instead of the core passes making this decision for them?

18 ↗(On Diff #194748)

Did you look why these copies are generated?

307 ↗(On Diff #194748)

Similar here.

aemerson marked 3 inline comments as done.Apr 11 2019, 3:04 PM
aemerson added inline comments.
1228 ↗(On Diff #194748)

Given that we're only doing constants here, I'd like all targets to be consistent as much as possible.

18 ↗(On Diff #194748)

This was a custom legalization, I'll fix it.

307 ↗(On Diff #194748)

Another custom legalization.

aemerson updated this revision to Diff 194774.Apr 11 2019, 3:09 PM

Fixed the places where custom legalization was not using full CSE.

1228 ↗(On Diff #194748)

The reason I mentioned adding a hook is that there's no easy way for a target to override the opcodes that are CSEd for any opt level (the pass chooses it for them) and the only option would be to modify it out of tree. Instead I was wondering something like

std::unique_ptr<CSEConfig> TargetPassConfig::getCSEConfigForOptLevel() {
 return O0? make_unique<ConstantOnlyConfig>() : make_unique<CSEConfig>();

This would mean you wouldn't need to update both passes to check for opt level, and additionally this allows targets to override this with their own CSEConfig..

aemerson marked an inline comment as done.Apr 11 2019, 3:16 PM
aemerson added inline comments.
1228 ↗(On Diff #194748)

Ah, fair enough, I'll do that.

aemerson updated this revision to Diff 194980.Apr 12 2019, 4:26 PM

New patch uses a new CSEConfigBase class as the interface between TargetPassConfig and GlobalISel targets, since CodeGen can't depend on the GlobalISel lib.

Unfortunately it does mean that targets need to opt-in to CSE.

Looks reasonable. Could you please split out the CSEConfigBase into it's own commit?
Also I only see the includes for the CSEConfigBase.h but not the header file in this patch. Maybe you forgot to include it?

aemerson updated this revision to Diff 194981.Apr 12 2019, 4:40 PM

Yep, forgot to git add it. It's in this new diff, but I'll add it as a separate commit. Ok to proceed?

LGTM. Thanks - please address the minor nit and proceed with two commits.

14 ↗(On Diff #194981)

Could you please add a comment on why this class is defined here? Otherwise LGTM.

This revision is now accepted and ready to land.Apr 12 2019, 5:14 PM
This revision was automatically updated to reflect the committed changes.