This is an archive of the discontinued LLVM Phabricator instance.

[mips[microMIPS]] Adding code size reduction pass for MicroMIPS
ClosedPublic

Authored by milena.vujosevic.janicic on Dec 2 2015, 2:52 AM.

Details

Summary

The code implements size reduction pass for MicroMIPS.

Load and store instructions are examined and transformed, if possible.
lw32 instruction is transformed into 16-bit instruction lwsp
sw32 instruction is transformed into 16-bit instruction swsp

Arithmetic instrcutions are examined and transformed, if possible.
addu32 instruction is transformed into 16-bit instruction addu16
subu32 instruction is transformed into 16-bit instruction subu16

Diff Detail

Event Timeline

milena.vujosevic.janicic retitled this revision from to [mips[microMIPS]] Adding code size reduction pass for MicroMIPS.
milena.vujosevic.janicic updated this object.

New patch version rebased to revision 259635.
Any comments to this work?

New patch version rebased to revision 264141.
Any comments?

dsanders edited edge metadata.Mar 23 2016, 6:13 AM

Sorry, I was part way through writing them a few weeks ago but was distracted by other things.

There appears to be two optimizations in this pass with very different requirements at the moment. The first optimization is a simple substitution of an MI for an equivalent MI with a smaller encoding. This part is generally heading in the right direction. The second is a peephole optimization that reduces two or more MI's into a single MI and this is where most of my concerns are. I don't believe it's checking enough to be able to prove that this reduction is safe. For example, ReduceMIToLwpSwp checks for interfering register uses but fails to check for interfering register defs (including implicit defs and sub/super-registers), memory reads/writes (including aliases), volatile accesses, side effects, etc. I think we should remove this portion for now and proceed with the simple size reductions to begin with.

For the testing in general: We ought to make use of the MIR (http://llvm.org/docs/MIRLangRef.html) so that we're only testing this pass. However, I'm not going to make that a requirement for this patch because I haven't used it myself yet.

I haven't looked too closely at the test cases yet but they will need to check the operands since this is a key part of whether your optimization is working as intended. I'd also like the tests to be more focused than they currently are. They look like they were generated from C examples and as such have a lot of unnecessary noise.

The rest of the comments below are things I noted while reading the patch. I've included them because I've already written them but some will most likely be made moot by the above changes. If the line numbers seem odd it's because they were written for the previous diff.

lib/Target/Mips/MicroMips32SizeReduction.cpp
25

Do we really need std::vector or can we use one of the alternatives from http://llvm.org/docs/ProgrammersManual.html#sequential-containers-std-vector-std-list-etc?

For example, I see there's a std::vector<MachineOperand> below. This would probably be better as a SmallVector<MachineOperand, 4> or similar.

38

What do these enumerators mean?

Naming nit: Types and enumerators should begin with a capital. They should also have a prefix such as 'ON_'. More information can be found at http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

602

Is the double-N meaninful?

605

Please delete the commented out code.

dsanders added inline comments.Mar 23 2016, 6:13 AM
lib/Target/Mips/MicroMips32SizeReduction.cpp
40–43

We can make this explanation appear in the doxygen documentation by writing this with '/' and '/<' comments like so:

/// Reduction type:
enum ReduceType {
  SeveralInstr, ///< Several instructions into lwm/swm.
  TwoInstr, ///< Two instructions into one.
  OneInstr ///< 32-bit instruction into 16-bit instruction.
};

Similarly for the other description comments below.

I notice that our doxygen config currently has 'EXTRACT_ANON_NSPACES=NO' but I'm going to propose that we change that.

47

Opperand -> Operand. This typo appears in a few other places too

48

Variables should begin with a capital and should be descriptive. With this style of constructor it's not ambiguous to use the same name for both the argument and the member (e.g. 'Shift(Shift)').

Similarly for the other constructors below.

59

The snr argument is never used.

64

I think I know what you're trying to say but the comment isn't very clear. I think you're referring to the way LWM16 only allows a subset of the registers that LVM32 allows.

Can we describe this in terms of register classes?

73–75

We normally use 'unsigned' for opcodes.

Also, what's the purpose of the second instruction? It's not clear from the comment

83

Why 'void *'? It seems we always pass in a 'struct ReduceEntryFA *'. We also de-reference it and take a copy immediately without nullptr checks so a reference would be better to avoid the copy and explicitly say it can't be nullptr at the same time.

124–125

Formatting.

143

Naming nit: We should probably drop the '32' so that we can re-use it for microMIPS64 in the future.

158

New code shouldn't repeat the function name in the comments.

207–208

Given that this is a static table, we should define the table as a normal array and use ArrayRef in this class

212

Is this redundant?

214–269

This table should probably be tablegen-erated but we can leave that for now and address it in later patches.

272–288

This is equivalent to GPRMM16RegClass.contains(Reg)

290–306

Similarly, this is equivalent to GPRMM16ZeroRegClass.contains(Reg)

307–312

This is only correct for the o32 and n32 ABIs. If you check for Mips::SP64 as well then it will cover the n64 ABI too.

343–344

I'd expect this to be indicative of a bug somewhere else. Should it be an assertion?

353

Operand indices are 'unsigned' rather than uint8_t

354–361

Am I right in thinking this is to check the pointer registers of each instruction are the same? If so, this should be ok but the function name should indicate that it's only suitable for pointers.

If integers are a possibility then we will also need to handle the fact that V0 != V0_64 despite being the same register.

366–381

I haven't tested this but something like:

const auto &End = GPR32RegClass.end();
const auto &I = std::find(GPR32RegClass.begin(), End, Reg1);
if (I == End || *I != Reg1)
  return false;
I++;
if (*I == Reg2)
  return true;
return false;

should be the equivalent without duplicating our register classes.

We ought to account for the '*_64' versions of these registers too which can be handled using GPR64RegClass.

385–386

Line wrapping

387–393

MathExtras.h has isShiftedInt() and isShiftedUInt() templates that are equivalent to this function

474–485

This is equivalent to this function:

MI->readsRegister(reg1) || MI->readsRegister(reg2)

If you pass the TRI argument then it will check for reads that occur because of super-register reads too. I don't think it can check for reads caused by sub-register reads though.

492

We should use C++11's range based for loop

for (const auto &I : MI->operands())
507–508

According to the tablegen definition, it's not guaranteed to be operand 2 when variable_ops for the Lwm/Swm is non-empty. It will be operand NumOps-1

516–517

Similarly, it's not guaranteed to be operand 1 when variable_ops for the Lwm/Swm is non-empty. It will be operand NumOps-2

532–533

Likewise

539

At minimum we have two sources/results ($16 and $31) along with a base address and offset so shouldn't the lower bound be 3.

Similarly: At most, we have five sources/results ($16-$19, and $31) along with the base address and offset so shouldn't the upper bound be 7?

551–560

std::find using GPR16MMRegClass and std::distance should be equivalent to this.

564

Rather than sort at startup, can we just keep the table sorted and assert std::is_sorted()? If we do want to std::sort() then the best place to put it would be in tablegen when we start tablegen-erating the array.

760

Use range-based for loop

941–942

Could you add a comment explaining why instrs[9] is special? What does '9' correspond to?

1004–1053

I think this is just mutating one MachineInst into another similar one. Do we really need to build a new instruction and transfer everything or can we just call MI->setDesc()?

lib/Target/Mips/MicroMipsInstrInfo.td
542–568

If we have explicit operands for the variable-length portion, do we still want the reglist16 operands? I believe the variable length portion covers the same operands as the reglist16's.

test/CodeGen/Mips/micromips-lwm-swm-lwp-swp-sw16.ll
1

(filename) Could you move this into a subdirectory for testing this pass? I'm thinking that the number of tests is going to grow over time and we ought to make it easy to tell which tests cover this pass.

test/CodeGen/Mips/micromips-lwsp-swsp.ll
1

(filename) Could you move this into a subdirectory for testing this pass?

sdardis added a subscriber: sdardis.

I believe this work should be implemented in a similar manner to ARM's codesize reduction passes, Thumb2SizeReduction.cpp and ARMLoadStoreOptimizer.cpp.

Their load store optimizer should be modifiable to work for microMIPS. Reusing their logic should avoid the tricky issue of moving loads and stores past other instructions. I'd suggest dropping all the load/store bundling from this patch and focus on the replacing a instruction with a smaller form.

Some of my comments may overlap with Daniel's as we've both looked this but I've tried to delete any ones that overlapped.

lib/Target/Mips/MicroMips32SizeReduction.cpp
1

This file should be called MicroMipsSizeReduction.cpp. This patch is for microMIPS32 but should be sufficiently general that it can be trivially extended to microMIPS64. microMIPS64 support should be a separate patch.

8

Please include a description of this pass, any relevant deficiencies and restrictions. Such as the fact is does not supprt microMIPS64. That comment should be at the bottom of the description as a TODO:.
It should look like:

<Usual LLVM boiler plate.>
//===----------------------------------------------------------------------===//
/// \file
/// This pass is used to reduce the size of instructions where applicable ...
/// ....
/// TODO: implement microMIPS64 support.
//===----------------------------------------------------------------------===//
28

"MicroMips-reduce-size" should be "micromips-reduce-size".

30

'instrs' should be 'instructions', no need to abbreviate it.

32

Here too.

155

microMIPS is the preferred spelling.

400

This predicate is too lax. It has to check at least the same candidates as Filler:terminateSearch in MipsDelaySlotFiller.cpp, and also has to check it is not crossing control flow instructions such as wait, pause and branches or instructions such as sync which act as ordering barriers.

595

Don't use void * and casts. Instead take a pointer/reference to the relevant type.

668–681

All this post loop code should be integrated into the loop body. Rather than 'break'ing out of the loop, in case when you've identified a candidate instruction, I believe you should check the rest of your conditions and if you cannot continue, and immediately return false. If the instruction was an invalid candidate but you can continue the search, update the use set and continue, otherwise you can return ReplaceInstruction(...). Outside the loop body, you should have 'return false'.

946

This can be reduced to a unsigned Opcode = <nested ternary operator>; <newline>MIB = BuildMI(...MipsII->get(Opcode));

958

Rather than packing the operands into a vector before picking the opcode, pick the opcode then iterate over instrs structure and add the operands from that directly.

971

Rename flag to something like 'CopyOperandsForward'.

991

Check for illegal cases first before building an instruction.

test/CodeGen/Mips/micromips-lwm-swm-lwp-swp-sw16.ll
3

Can you add CHECK-LABEL: <function name> here to match the function and in all the others?

milena.vujosevic.janicic edited edge metadata.
milena.vujosevic.janicic marked an inline comment as done.

The code is simplified: everything about transforming several instructions into one instruction is removed (i.e. lwm/swm and lwp/swp). Therefore, most of the comments are not applicable for this code, but will be taken into account later. Test-cases are also simplified.

lib/Target/Mips/MicroMips32SizeReduction.cpp
389–395

These functions are similar but are not equivalent and cannot be used in this case. isShiftedInt is a template which should be instantiated with a constant shift value, while here the value of shift is a parameter to the function. Also, in this case, low bound and high bound does not necessary correspond to bit width.

sdardis edited edge metadata.Apr 14 2016, 3:27 AM

Comments inlined. Most of them are small issues, and an omission from the reduction table for LW16, which I think should go into this patch.

There is a second short form load instruction, lwgp. That should be done as a separate patch rather than including in this revision. It will be a small patch anyway.

Thanks.

lib/Target/Mips/MicroMipsSizeReduction.cpp
13 ↗(On Diff #52209)

Doesn't this patch do this? :)

14 ↗(On Diff #52209)

I think we should borrow ARM's load store optimise pass rather than implementing it here.

26–27 ↗(On Diff #52209)

Please avoid unnecessary includes.

30 ↗(On Diff #52209)

Unnecessary include.

41 ↗(On Diff #52209)

By convention, there should be a colon after 'TODO'. Also, spelling of extended.

151–153 ↗(On Diff #52209)

I'm not seeing this function used anywhere. Since there are predicates for stack relative accesses and short form memory accesses, is it required?

172–183 ↗(On Diff #52209)

LW16 is missing from this table.

212–213 ↗(On Diff #52209)

This should be an assert as calling this function with an out of range Op is an error.

MI->getOperand(Op)

in

if (!MI->getOperand(Op).isImm())

will assert that Op < MI->getNumOperands() anyway. Returning false covers a potential bug.

220–221 ↗(On Diff #52209)

Capitalise Value and Shift as they refer to arguments of this function.

330–334 ↗(On Diff #52209)

These two cases can be joined together for clarity.

The second case should use isTransient(). This catches cases where MI is a debug value and other pseudo operations like EHLABEL which do not correspond to physical instruction(s).

Put a comment this check saying something like 'Don't reduce bundled instructions or pseudo operations.' so the intention is obvious.

sdardis requested changes to this revision.Apr 14 2016, 3:51 AM
sdardis edited edge metadata.
This revision now requires changes to proceed.Apr 14 2016, 3:51 AM
milena.vujosevic.janicic edited edge metadata.

All the comments from the previous revision are taken into account.
lw16/sw16 support was excluded because it is not necessary in this moment.

The code implements size reduction pass for MicroMIPS.
Load and store instructions are examined and transformed, if possible.

lw32 instruction is transformed into 16-bit instruction lwsp
sw32 instruction is transformed into 16-bit instruction swsp

Arithmetic instrcutions are examined and transformed, if possible.

addu32 instruction is transformed into 16-bit instruction addu16
subu32 instruction is transformed into 16-bit instruction subu16
sdardis added inline comments.Feb 24 2017, 8:58 AM
lib/Target/Mips/MicroMipsSizeReduction.cpp
158 ↗(On Diff #79229)

Can this be reduced in size to 16 or 8 entries?

263–285 ↗(On Diff #79229)

These two functions can be folded together as ReduceXWtoXWSP with a comment stating it covers lwsp, swsp.

310–312 ↗(On Diff #79229)

Style: drop the '{' '} when it is a single line.

315–317 ↗(On Diff #79229)

Modifed |= Reduced(MI); is clearer.

327 ↗(On Diff #79229)

This should be dbgs() << ...

test/CodeGen/Mips/micromips-sizereduction/micromips-lwsp-swsp.ll
6 ↗(On Diff #79229)

Colon after the function name so that it matches properly.

milena.vujosevic.janicic edited the summary of this revision. (Show Details)

All the comments from the previous revision are taken into account.

sdardis accepted this revision.Apr 10 2017, 6:13 AM

LGTM. Some small nits inlined.

lib/Target/Mips/MicroMipsSizeReduction.cpp
18 ↗(On Diff #90327)

Spurious empty line.

185 ↗(On Diff #90327)

This shouldn't have SP_64 in the conditional as this pass doesn't support microMIPS in 64 bit mode yet..

test/CodeGen/Mips/llvm-ir/add.ll
27 ↗(On Diff #90327)

Add -verify-machineinstrs to the parameters here. We want to know early if the generated machine code is malformed.

test/CodeGen/Mips/llvm-ir/sub.ll
13 ↗(On Diff #90327)

Add -verify-machineinstrs to the parameters here. We want to know early if the generated machine code is malformed.

test/CodeGen/Mips/micromips-sizereduction/micromips-lwsp-swsp.ll
1 ↗(On Diff #90327)

Add -verify-machineinstrs to the parameters here. We want to know early if the generated machine code is malformed.

This revision is now accepted and ready to land.Apr 10 2017, 6:13 AM
This revision was automatically updated to reflect the committed changes.