This is an archive of the discontinued LLVM Phabricator instance.

[mips] Accept 32-bit offsets for lb and lbu commands
ClosedPublic

Authored by atanasyan on Mar 29 2018, 5:48 AM.

Details

Summary

lb and lbu commands accepts 16-bit signed offsets. But GAS accepts larger offsets for these commands. If an offset does not fit in 16-bit range, lb command is translated into lui/lb or lui/addu/lb series. It's interesting that initially LLVM assembler supported this feature, but later it was broken.

This patch restores support for 32-bit offsets. It replaces mem_simm16 operand for LB and LBu definitions by the new mem_simmptr operand. This operand is intended to check that offset fits to the same size as using for pointers. Later we will be able to extend this rule and accepts 64-bit offsets when it is possible.

Some issues remain:

  • The regression also affects LD, SD, LH, LHU commands. I'm going to fix them by a separate patch.
  • GAS accepts any 32-bit values as an offset. Now LLVM accepts signed 16-bit values and this patch extends the range to signed 32-bit offsets. In other words, the following code accepted by GAS and still triggers an error by LLVM:
lb      $4, 0x80000004

# gas
lui     a0, 0x8000
lb      a0, 4(a0)
  • In case of 64-bit pointers GAS accepts a 64-bit offset and translates it to the li/dsll/lb series of commands. LLVM still rejects it. Probably this feature has never been implemented in LVVM. This issue is for a separate patch.
lb      $4, 0x800000001

# gas
li      a0, 0x8000
dsll    a0, a0, 0x14
lb      a0, 4(a0)

Diff Detail

Repository
rL LLVM

Event Timeline

atanasyan created this revision.Mar 29 2018, 5:48 AM

I don't think this is quite the correct approach, especially if you're going to be implementing support for 64-bit pointers with those instructions, as we would need to modify the existing lb64 and friends to be available to the AsmParser with a different set of operands.

Instead, we should use the 'mem' operand for the lb and lbu instructions. That will allow the parser to match larger offsets than those instructions permit.

Then in MipsAsmParser during the expansion of the instruction, we can check the size of the offset and error out there if >32 bits. There's currently a bug there in that offsets are truncated to signed 32bit values before checking if they're out of range.

The other instructions in the base MIPS architecture which appear to have this issue are: ld, lh, lhu and sd.

I don't think this is quite the correct approach, especially if you're going to be implementing support for 64-bit pointers with those instructions, as we would need to modify the existing lb64 and friends to be available to the AsmParser with a different set of operands.

Instead, we should use the 'mem' operand for the lb and lbu instructions. That will allow the parser to match larger offsets than those instructions permit.

Thanks for review. I will provide another patch soon.

we should use the 'mem' operand for the lb and lbu instructions. That will allow the parser to match larger offsets than those instructions permit.

BTW do you suggest to escape sub-classing the 'mem' operand at all? If so, we won't be able to diagnose that an operand does not fit 32 or 64 bit range.

we should use the 'mem' operand for the lb and lbu instructions. That will allow the parser to match larger offsets than those instructions permit.

BTW do you suggest to escape sub-classing the 'mem' operand at all? If so, we won't be able to diagnose that an operand does not fit 32 or 64 bit range.

We can diagnose the case if the operand does not fit a 32 or 64 bit range though the chain of processInstruction() -> expandMemInst() -> expand(Load|Store)Inst(). If we accept a full 64bit offset during parsing, we can determine if we should reject it after parsing while processing the instructions.

atanasyan edited the summary of this revision. (Show Details)

I created an alternative patch D46105 which replaces mem_simm16 by mem and performs range checking in the expandLoadInst routine. But there is a problem with that patch - in the expandLoadInst routine we do not know position of invalid operand, so an error message points to the beginning of the incorrect instructions. This patch attempts to use new mem_simmptr operand for lb and lbu instructions and performs range checking in a uniform way with other instructions which use ParserMatchClass.

atanasyan edited the summary of this revision. (Show Details)Apr 25 2018, 11:09 PM
sdardis accepted this revision.Apr 26 2018, 8:32 AM

LGTM.

In comparison to the other patch you submitted, this is clearly the better option as it points to the problematic operand.

Nit: You should correct the spelling of LLVM in the last sentence in the summary.

This revision is now accepted and ready to land.Apr 26 2018, 8:32 AM

Thanks for review.

This revision was automatically updated to reflect the committed changes.