This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Teach RISCVMatInt about cases where it can use LUI+SLLI to replace LUI+ADDI+SLLI for large constants.

Authored by craig.topper on Jul 4 2021, 7:44 PM.



If we need to shift left anyway we might be able to take advantage
of LUI implicitly shifting its immediate left by 12 to cover part
of the shift. This allows us to use more bits of the LUI immediate
to avoid an ADDI.

I believe this is the same or similar to one of the optimizations
from D79492.

Diff Detail

Event Timeline

craig.topper created this revision.Jul 4 2021, 7:44 PM
craig.topper requested review of this revision.Jul 4 2021, 7:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 4 2021, 7:44 PM
Herald added a subscriber: MaskRay. · View Herald Transcript
craig.topper added inline comments.Jul 4 2021, 7:50 PM

As noted in D79492 this is a regression for the C extension as lui a1, 4095 isn't compressible, but the lui 1 and addiw -1 were.

The cost model used by isDesirableToCommuteWithShift doesn't understand compressed instructions.


This is a regression. We lose out on coincidental CSE of part of the materialization of two similar constants 0x00F000F000F000F0 and 0x0F000F000F000F00. They have the same number of 1s but are shifted 4 bits relative to each other.

arcbbb accepted this revision.Jul 13 2021, 2:03 AM


This revision is now accepted and ready to land.Jul 13 2021, 2:03 AM

@luismarques were you able to find the change you were looking at to fix the add-before-shl.ll regression?

@luismarques were you able to find the change you were looking at to fix the add-before-shl.ll regression?

I just checked that I indeed have a commit where that isn't an issue, but it's not a localized fix for that regression. I was computing the cost of instruction sequences and backing off if they were more "costly" (based on instruction count and length). I was working part-time these last two weeks, so I didn't get around to properly dig that work up, refresh it and share it, but I'll try to do it by ~Sunday.

@luismarques were you able to find the change you were looking at to fix the add-before-shl.ll regression?

You can find it in D106318. If you find it useful just take it. Or let me know if you want me to elaborate more on some of the unexercised implementation details.

BTW, ignore my previous comment about it not being a localized fixed for the isDesirableToCommuteWithShift-related regression. When I wrote that I missed that (1) we did end up adopting a backtracking approach and that (2) the proposed cost function only needs to be used in getIntMatCost to address the regression, anyway.

craig.topper planned changes to this revision.Jul 19 2021, 3:43 PM

Use portions of D106318. I added a new flag to the cost function so that the call site in RISCVTargetTransformInfo for immediate cost isn't affected. This new cost model only seems useful for comparing two sequences, but that's not what immediate cost is doing.

I eft out the optsize since that wasn't needed for the current regression.
And I left out the parts for choosing different alternatives for the same constant
as it wasn't directly related to the goal of this change either.

This revision is now accepted and ready to land.Jul 19 2021, 5:00 PM
craig.topper added inline comments.Jul 19 2021, 5:02 PM

I added new RVC RUN lines so we can see that the code with compression enabled isn't affected.

luismarques accepted this revision.Jul 20 2021, 1:54 AM

You added RVC RUN lines to a test because you spotted an issue there *before* the changes were gated by HasRVC. The next time around, how will we detect such changes, without adding RVC RUN lines to more test files? Doesn't that mean that we should add RVC RUN lines in other potentially problematic test files (e.g. rv64zb*.ll)? Or are we confident that more targeted tests will always detect all such issues...?

Other than that, LGTM. Thanks.


Nit: wrap before column 80.