This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Hooks for enabling instruction compression
ClosedPublic

Authored by sabuasal on Jan 10 2018, 6:20 PM.

Details

Summary

This patch inserts calls in RISCV backend to call instruction
compression\uncompression logic.

Calls were inserted in three locations:
1) RISCVAsmParser::MatchAndEmitInstruction:
    Inserted a call to compressInst() to compresses instructions
    parsed by llvm-mc coming from an ASM input.
 2) RISCVAsmPrinter::EmitInstruction:
     Inserted a call to compressInst() to compress instructions coming
     that were lowered from Machine Instructions (MachineInstr).
  3) RVInstPrinter::printInst:
      Inserted a call to uncompressInst() to print the expanded
      version of the instruction instead of the compressed one (e.g,
      add s0, s0, a5 instead of c.add s0, a5) when -riscv-no-aliases
      is not passed.

Diff Detail

Event Timeline

apazos created this revision.Jan 10 2018, 6:20 PM
niosHD added a subscriber: niosHD.Jan 11 2018, 8:37 AM
asb added a comment.Jan 11 2018, 8:41 AM

Thanks Ana for outlining your current thinking. Without reviewing the fine details, this seems in the direction we were all discussing.

Although in the current GNU tools control of compression/decompression is intrinsically linked to the -M no-aliases flag, we might want to add some more fine-grained control here (after all, it seems reasonable that I'd want to disable automatic compression but still use pseudoinstructions like mv). Something we can think about later though.

lib/Target/RISCV/InstPrinter/RISCVInstPrinter.cpp
47

I think the logic should actually be the other way - we uncompress if !NoAliases (i.e. by default if we decode a c.addi we print it as addi)

apazos updated this revision to Diff 130341.Jan 17 2018, 6:06 PM
apazos updated this revision to Diff 130358.Jan 17 2018, 8:39 PM
apazos updated this revision to Diff 131327.Jan 24 2018, 12:42 PM

Rebased - depends on Shiva's https://reviews.llvm.org/D41658

apazos updated this revision to Diff 132297.Jan 31 2018, 5:05 PM
sabuasal added inline comments.Feb 16 2018, 6:25 PM
lib/Target/RISCV/InstPrinter/RISCVInstPrinter.cpp
47

Hi Alex,

I think we are actually in agreement. If we have an encoded c.addi and -riscv-noaliases is passed, NoAlias will be true, uncompress will not be called so we print the c.addi.

apazos retitled this revision from [RISCV WIP] Hooks for compressing instructions from MCPat to [RISCV] Hooks enabling instruction compression.Feb 21 2018, 1:02 PM
apazos retitled this revision from [RISCV] Hooks enabling instruction compression to [RISCV] Hooks for enabling instruction compression.
efriedma added inline comments.
lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
52 ↗(On Diff #132297)

I'm not sure it makes sense to do compression here. Software using the MC layer should have full control over what instructions it's emitting, where possible. And to get the right output with "-riscv-no-aliases", you have to run compression before you output assembly.

If the backend is okay delaying compression until really late, maybe you could call it in LowerRISCVMachineInstrToMCInst() or something like that?

asb added a reviewer: asb.Feb 27 2018, 12:57 PM
asb added inline comments.
lib/Target/RISCV/InstPrinter/RISCVInstPrinter.cpp
47

We are now in agreement - the logic was the other way around in the original diff. I don't see a way to mark my comment on someone else's patch as 'done' - but it's fine now.

50–52

I think const_cast would be preferable here. Pehaps we could get rid of Res ? (const MCInst *)&NewMI : MI which makes it harder to read - maybe using a MCInst* that is initialised to MI and updated to &NewMI if necessary.

lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
52 ↗(On Diff #132297)

It's important that we reuse the same MCInst->MCInst mapping both for the codegen and assembly/disassembly path, but you're right - it seems like it would be viable to call this in LowerRISCVMachineInstrToMCInst - doing the usual lowering and then trying to transform OutMI. Do you see any issues with that Ana/Sameer?

apazos added inline comments.Feb 28 2018, 3:55 PM
lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
53 ↗(On Diff #132297)

The instruction will only be transformed if compression extension is enabled.
If the user enables compression, but does not want a particular instruction to be compressed, RISCV has no assembler directive to allow that (does it Alex? Maybe we can request that feature?). As Eli knows ARM supports "adds.w" to allow that kind of fine grain control.
We need to also check if LowerRISCVMachineInstrToMCInst is invoked when coming from assembly/disassembly path.

sabuasal added inline comments.Feb 28 2018, 4:01 PM
lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
52 ↗(On Diff #132297)

@efriedma @asb

"LowerRISCVMachineInstrToMCInst" Is not on the path for the Assembler\Disassembler (llvm-mc). Adding compression logic there will result in llvm-mc not compressing instructions passed from an ASM file. The reason compression changes where added to the streamer is because it is the onlt location where all paths (Assembler. MI lowering) meet.

apazos updated this revision to Diff 136448.Feb 28 2018, 5:28 PM

Addressed formatting comments.

sabuasal added inline comments.Feb 28 2018, 5:36 PM
lib/Target/RISCV/InstPrinter/RISCVInstPrinter.cpp
50–52

Done.

Adding llvm-commits as a subscriber. A couple more comments inline

lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
52 ↗(On Diff #132297)

I recognise LowerRISCVMachineInstrToMCInst won't be called on the llvm-mc path, but this patch does add calls to the asm parser and instprinter, which at first glance seems sufficient for the llvm-mc case. Is that not so? It's been a while since I looked at this part of the instruction compression discussion, so I might be missing/forgetting something.

Updating this diff with a patch summary briefly explaining the locations compressInst is called would be helpful.

53 ↗(On Diff #132297)

There is no current directive to control automatic compression separately from support for the C extension. If you want to compile an assembly file with support for the C extension but have complete control over when compressed instructions are chosen you currently need to use .option rvc and .option norvc (in gas - not yet implemented in llvm-mc). e.g.

.option norvc
addi x1, x1, 1
.option rvc
c.addi x1, 1
asb added inline comments.Feb 28 2018, 6:19 PM
lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
53 ↗(On Diff #132297)

I do think it would be good to have an option for this. I identified this lack in https://github.com/riscv/riscv-toolchain-conventions#issues-for-consideration-4, and it would probably be worth trying again to engage the GCC devs and RISC-V community in trying to address some of the holes detailed in my toolchain-conventions document.

apazos planned changes to this revision.Feb 28 2018, 6:43 PM

We need to put in the debugger to double check the flow, I recall this flow:
.s ->.o
AsmParser.cpp ->...-> RISCVAsmParser::MatchAndEmitInstruction-> MatchInstructionImpl-> Out.EmitInstruction…
.o->.s
Disassemble.cpp -> ... ->Streamer.EmitInstruction(Inst, STI);-> InstPrinter->printInst(&Inst, OS, "", STI);-> printInstruction(MI, STI, O) in RISCVInstrPrinter

The goal was to try to compress the instr before emitting the instr. And uncompress the instr when printing, depending on the alias flag.

sabuasal edited the summary of this revision. (Show Details)Mar 19 2018, 5:58 PM
sabuasal edited the summary of this revision. (Show Details)
sabuasal commandeered this revision.Mar 21 2018, 12:01 PM
sabuasal added a reviewer: apazos.
sabuasal added inline comments.Mar 21 2018, 1:24 PM
lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
52 ↗(On Diff #132297)

I added a summary for why every hook was added to the patch.

For the llvm-mc and llvm-objdump path having the hooks in AsmParser and in Instr Printer is sufficient for what we are trying to achieve.

The hook to the Elfstreamer was added for the path on 1.c ---> 1.o. This could be moved earlier like @efriedma suggested (I tried it by calling compression in RISCVAsmPrinter::EmitInstruction right after it calls LowerRISCVMachineInstrToMCInst). This also has another advantage of saving a call to compressInst for the path (1.s ---> 1.o); as is compressInst will be called both at the AsmParster and at the ELfStreamer.

However, the only possible side effect is that we will no longer be matching the gcc fronted in this. going from a (1.c ---> 1.s) gcc never emits a compressed asm instruction. While clang can do that if we pass -riscv-no-aliases. The gcc equivilent flag for this (-M -no-aliases) only works with gcc-objdump and not the fronted.

If we don't forsee this as an issue then we can move the hook with no problem.

asb added inline comments.Mar 22 2018, 3:50 AM
lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
52 ↗(On Diff #132297)

Thanks for confirming Sameer. Moving earlier, to e.g. LowerRISCVMachineInstrToMCInst seems preferable.

I think the fact we can have -riscv-no-aliases affect .s emission is an advantage vs GCC.

sabuasal updated this revision to Diff 139685.Mar 23 2018, 5:48 PM
sabuasal edited the summary of this revision. (Show Details)

Removed the call to EmitInstr in ELFStream.
Updated test cases.

asb accepted this revision.Mar 29 2018, 1:02 AM

This diff seems to have accidentally picked up changes to a number of tests.

Other than that, this looks good to me. I expect that in the future we'll want to have the option of having the C extension enabled but not performing automatic compression, but that can always be added later.

This revision is now accepted and ready to land.Mar 29 2018, 1:02 AM

Hi Alex,

Thanks for the review,

Yes, the changes in the test cases are caused by the fact that I rebased on the latest llvm (with mc relaxation enabled).