This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] AsmParser support for the li pseudo instruction
ClosedPublic

Authored by niosHD on Apr 26 2018, 6:20 AM.

Details

Summary

The implementation follows the MIPS backend and expands the pseudo instruction directly during asm parsing. As the result, only real MC instructions are emitted to the MCStreamer. The actual expansion to real instructions is similar to the expansion performed by the GNU Assembler.

This patch supersedes D41949.

Note that I marked this patch as WIP because I am currently traveling but plan to add compression tests to this patch in an update next week. Comments on the current state are appreciated though.

Diff Detail

Repository
rL LLVM

Event Timeline

niosHD created this revision.Apr 26 2018, 6:20 AM
asb added a comment.Apr 30 2018, 5:38 AM

I'll wait for the version with compression tests for a full review, but at a first read-through I think this is looking good. Thanks!

niosHD updated this revision to Diff 146330.May 11 2018, 8:44 AM
niosHD retitled this revision from [RISCV] [WIP] AsmParser support for the li pseudo instruction to [RISCV] AsmParser support for the li pseudo instruction.

I finally found the time to work on this patch, sorry for the delay.

This revision contains two notable updates. First, compression tests have been added. Second, I replaced the ADDIW, which was emitted for the first ADDI (similar to gas) in 64-bit mode, through a regular ADDI. Doing so provides more compression opportunities and also makes the code in 64-bit mode more similar to the one in 32-bit mode (i.e., enables shared tests in rvi-aliases-valid.s and rvc-aliases-valid.s).

With these updates, this patch is now ready for review.

PS Even though I am confident that replacing the first ADDIW with ADDI is correct, a second opinion is still appreciated. ;)

Ping! Some feedback would be nice before I rebase this patch (which is less fun as it should be due to the recent improvements on the ASM parser)...

asb added a comment.May 22 2018, 2:47 AM

The approach looks good. I'm out of office today, but will get a full review for you tomorrow.

niosHD updated this revision to Diff 148175.May 23 2018, 2:30 AM

Rebased the patch. Luckily, it was not as bad as I anticipated. :)

There is no rush. I just wanted to make sure that the patch is not forgotten.

asb added a comment.May 24 2018, 3:21 AM

By way of update: the only thing stopping me from committing this right now is convincing myself that there's never a reason to use addiw rather than addi for the first constant when expanding li for RV64. I'm just going to step through some more examples...

In case it helps, my reasoning is as follows. There are basically two situations which can occur:

  1. ADDI(W) is the first instruction of the expanded li and the source register is x0. The ADDIW instruction, as used before, would calculate the result as SignExtend64<32>(x0 + SignExtend64<12>(imm)). Considering that x0 is always 0, the result is therefore simply `SignExtend64<32>(SignExtend64<12>(imm)) = SignExtend64<12>(imm) which is also what the ADDI instruction gives us. (i.e., x0+SignExtend64<12>(imm) = SignExtend64<12>(imm) )
  2. The ADDI(W) instruction follows a LUI instruction. The LUI instruction itself already performs sign extension to 64-bit meaning that the input register to the ADDI(W) instruction is a proper 64-bit signed value where the upper 32 bits are a copy of bit 31 (the MSB of the LUI immediate). Using LUI+ADDIW therefore calculates SignExtend64<32>(SignExtend64<32>(lui_imm << 12) + SignExtend64<12>(addi_imm))). Using LUI+ADDI gets us SignExtend64<32>(lui_imm << 12) + SignExtend64<12>(addi_imm)) which should be the same result in the end.

HTH,
Mario

asb added a comment.May 29 2018, 5:23 AM

Ok, there _is_ a problem using addi rather than addiw. Consider loading the immediate 0x7fffffff.

# GCC approach:
lui a0, 524288
addiw a0, a0, -1
# Approach in this pass
lui a1, 524288
addi a1, a1, -1

Using addiw means that the value 0x000000007fffffff ends up in the register, but using addi we end up with the incorrect 0xffffffff7fffffff in register a1.

We could probably have some logic that determines whether for a particular constant using addi is safe, but addiw seems like a safe general approach.

Thank you for finding the flaw in my thinking. I obviously missed that everything falls apart due to the sign extension of the LUI instruction...

I'll update the patch to use the ADDIW instruction again.

sabuasal added inline comments.May 29 2018, 3:17 PM
lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
1188 ↗(On Diff #148175)

Nit: why is this a switch, are you expecting to be handling more Pseudo instructions?

niosHD added inline comments.May 30 2018, 12:23 AM
lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
1188 ↗(On Diff #148175)

Yes, I expected that other pseudo instructions that expand to multiple instructions (e.g., la, symbol versions of lb, sb, ...) will be expanded here in the future too. Anyway, since we currently do not have the code and given that it is a small change anyway, I will go with the if instead.

Thank you for the comment!

niosHD updated this revision to Diff 149090.May 30 2018, 5:01 AM

Updated the patch to emit ADDIW again and rebased it.

Especially for RV64, there are definitely a several missed opportunities regarding compression as can be seen by the check lines of the rv64c-aliases-valid.s test file. However, having a correct baseline is definitely more important than performing aggressive optimisation at the moment. Thanks again to Alex for finding the flaw in my reasoning.

niosHD added a comment.Jun 7 2018, 8:24 AM

@asb I just missed my opportunity in the call to ping you regarding this patch. ;) Are there any remaining issues with it that I missed?

asb accepted this revision.Jun 7 2018, 8:38 AM

Looks good to me. Many thanks for your work on this - I'm committing now.

This revision is now accepted and ready to land.Jun 7 2018, 8:38 AM
This revision was automatically updated to reflect the committed changes.