Page MenuHomePhabricator

[RISCV] Add pre-emit pass to make more instructions compressible
Needs ReviewPublic

Authored by craigblackmore on Nov 25 2020, 8:03 AM.



When optimizing for size, this pass searches for instructions that are
prevented from being compressed by one of the following:

  1. The use of a single uncompressed register.
  2. A base register + offset where the offset is too large to be compressed and the base register may or may not already be compressed.

In the first case, if there is a compressed register available, then the
uncompressed register is copied to the compressed register and its uses
replaced. This is only done if there are enough uses that code size
would be improved.

In the second case, if a compressed register is available, then the
original base register is copied and adjusted such that:

new_base_register = base_register + adjustment
base_register + large_offset = new_base_register + small_offset

and the uses of the base register are replaced with the new base
register. Again this is only done if there are enough uses for code size
to be improved.

This patch includes the logic for optimizing SW/LW instructions but the
logic for other instructions could also be added in the same manner.

This pass was authored by Lewis Revill, with large offset optimization
added by Craig Blackmore.

Diff Detail

Event Timeline

craigblackmore created this revision.Nov 25 2020, 8:03 AM
craigblackmore requested review of this revision.Nov 25 2020, 8:03 AM
jrtc27 added inline comments.Nov 25 2020, 8:08 AM
1 ↗(On Diff #307605)

This should be a MIR test running just the new pass so it's less fragile (and can also test more interesting cases where it's hard to coerce CodeGen into creating uncompressible instructions)

craigblackmore edited the summary of this revision. (Show Details)Nov 25 2020, 9:24 AM
frasercrmck added inline comments.Nov 25 2020, 9:28 AM

Is this a strict enough check for "definition"? What about super- and sub-registers or clobbers? Would MI.modifiesRegister work better (and be simpler)?


Same question about defined...ness here.


Nit: redundant parens.

khchen added a subscriber: khchen.Nov 25 2020, 5:24 PM

Updated to address feedback from @jrtc27 and @frasercrmck and fix lint warnings.

craigblackmore marked 4 inline comments as done.Dec 11 2020, 9:46 AM
craigblackmore added inline comments.

Thanks, MI.modifiesRegister is better and makes this check much simpler.


At the moment isDef is sufficient as we are only expecting LW or SW. Given we are post regalloc, that means only physical integer registers (which do not have subregisters). I have added an assert so that it is clear we are only expecting LW or SW and a comment to say that the definedness check may need to be strengthened if the pass is extended to support more instructions.

1 ↗(On Diff #307605)

Thanks, I have replaced this with a MIR test.

craigblackmore marked 3 inline comments as done and 2 inline comments as done.Dec 11 2020, 9:52 AM
jrtc27 added inline comments.Dec 11 2020, 9:57 AM

Please indent continuation lines with an additional 4 spaces.

jrtc27 added inline comments.Dec 11 2020, 9:59 AM

Or maybe the style is just 2. For | it seems to be generally 2, but for arguments that wrap it seems to be any of 2, 3 or 4 depending on where you look.

Indented continuation lines in the MIR test. I couldn't see a consistent style, so opted for an indent of 2.

craigblackmore marked 2 inline comments as done.Dec 17 2020, 2:32 AM
craig.topper added inline comments.

I think these were in alphabetical order


Use SmallVectorImpl so you don't need to repeat the 8 from the caller.

lenary resigned from this revision.Thu, Jan 14, 9:43 AM

Addressed comments from @craig.topper.

craigblackmore marked 2 inline comments as done.Mon, Jan 18, 8:59 AM

Some comments about current hard-coded assumptions. Whilst only supporting LW/SW at the moment is fine, I don't think it's a good idea to have such assumptions embedded in the design; instead it should be approximately general enough that other widths and types can be added at a later date without overhauling its core (i.e. just adding some more cases). Though at that point you might as well just support LD/SD and FLW/FSW/FLD/FSD given it'd then be trivial...


Compute this based on the size of the register being loaded/stored, as this approach does not generalise.


You'll need the type here.


Ditto; compute this based on the size of the register.




You need the type of the register (int or float) in order to support compressing floating-point loads/stores (and so we can support capabilities in our downstream CHERI fork, both for the base and for the source/destination).