This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Fix 8-bit immediate overflow in the instruction of segmented stack prologue.
ClosedPublic

Authored by ZhiyaoMa98 on Jan 29 2022, 4:04 PM.

Details

Summary

It fixes the overflow of 8-bit immediate field in the emitted instruction that allocates large stacklet.

For thumb2 targets, load large immediate by a pair of movw and movt instruction. For thumb1 and ARM targets, load large immediate by reading from literal pool.

Diff Detail

Event Timeline

ZhiyaoMa98 created this revision.Jan 29 2022, 4:04 PM
ZhiyaoMa98 requested review of this revision.Jan 29 2022, 4:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 29 2022, 4:04 PM

Please add a test for this. llvm/test/CodeGen/ARM/segmented-stacks.ll already has some tests related to this, so it looks like you could add a couple of RUN lines for thumb and thumb2 targets, and add test_very_large that's like test_large but with an integer that's too large for ARM::MOVi.

llvm/lib/Target/ARM/ARMFrameLowering.cpp
2459

We already have pseudo-instructions for mov of 32-bit immediate in arm and thumb2, so instead of having a function to emit the code sequence here it would be better I think to create a tMOVi32imm pseudo-instruction and generate the instruction sequence when expanding it.

2642

This isn't a conditional move, so this should be t2MOVi32imm.

2656–2657

We'll also have incorrect code generation when AlignedStackSize is larger than the max immediate of ARM::SUBri (and similar with ARM::MOVi down below), so we should similarly be making use of ARM::MOVi32imm to generate the immediate here.

For thumb1, the number of emitted instructions varies with the value of the 32-bit immediate. How should we set the cost for the new pseudo-instruction tMOVi32imm? For the existing t2MOVi32imm, the cost is set as IIC_iMOVix2. I understand it comes from one movw and one movt. But for variable cost, I am not sure what to do.

I see the logic of expanding t2MOVi32imm in ARMExpandPseudo::ExpandMOV32BitImm, but I feel hard to understand them fully. Is there any tutorial that focuses on this part?

Thank you for your review and comments, and help.

Not sure about the discussion of MOVi32imm etc. In ARM mode, the existing MOVi32imm can't be used for arbitrary constants without armv6t2. For the general case, we emit a constant pool load. Similarly, on Thumb, we just never bothered to implement tMOVi32imm.

Can we use emitThumbRegPlusImmediate/emitT2RegPlusImmediate/emitARMRegPlusImmediate here, instead of reimplementing it?

Yes, emitThumbRegPlusImmediate works. Thank you for the suggestion. @efriedma

I am looking into adding more tests, but existing test strikes me as strange. test_large() in llvm/test/CodeGen/Thumb/segmented-stacks.ll already tests for large stacklet allocation, but the test case is vain.

The test case expects the following output:

...
push 	{r4, r5}
mov 	r5, sp
sub  	r5, #40192
...

Even if without this patch, the output machine assembly code matches the expected output string and passes the test. But sub r5, #40192 is not a valid instruction on thumb. After passing through the assembler, the sub instruction becomes subs r5, #0:

00000078 <test_large>:
  78:   b430            push    {r4, r5}
  7a:   466d            mov     r5, sp
  7c:   3d00            subs    r5, #0
...

How can I perform the test effectively?

You'll want to have the RUN lines for the thumb tests to have a --check-prefix argument to FileCheck that's unique, and use that to check for the thumb-specific codegen, e.g.

; RUN: llc < %s -mtriple=arm-linux-androideabi -mattr=+v4t -verify-machineinstrs | FileCheck %s -check-prefix=ARM-android
; RUN: llc < %s -mtriple=arm-linux-unknown-gnueabi -mattr=+v4t  -verify-machineinstrs | FileCheck %s -check-prefix=ARM-linux
; RUN: llc < %s -mtriple=thumb-linux-androideabi -mattr=+v4t -verify-machineinstrs | FileCheck %s -check-prefix=THUMB-android
; RUN: llc < %s -mtriple=thumb-linux-unknown-gnueabi -mattr=+v4t  -verify-machineinstrs | FileCheck %s -check-prefix=THUMB-linux

then later

; THUMB-android:      test_large:

; THUMB-android:      push    {r4, r5}
; THUMB-android-NEXT: whatever the expected sp-setting instruction sequence is
ZhiyaoMa98 added a comment.EditedFeb 7 2022, 12:41 PM

Let me clarify the problem. The test compares the output at the machine assembly level, i.e. ARM (thumb) assembly instruction. However, some syntactically correct ARM (thumb) assembly does not generate binary instruction with the same semantic.

For example, sub r5, #40192 is a syntactically correct ARM (thumb) assembly, but #40192 does not fit into 8-bit immediate field. Surprisingly the assembler accepts the assembly instruction and emits the binary instruction subs r5, #0, without reporting an error.

The following command tests at the machine assembly level. It sees sub r5, #40192 and considers it correct.
; RUN: llc < %s -mtriple=thumb-linux-androideabi -verify-machineinstrs | FileCheck %s -check-prefix=Thumb-android

The following command generates binary object. It should trigger an error when it tries to assemble sub r5, #40192 in thumb1, but does not.
; RUN: llc < %s -mtriple=thumb-linux-androideabi -filetype=obj

So my point is the current test code has already included the large stack frame test case, but the bug is not detected and we are unable to detect it in this way.

We should rewrite the test, but I do not know how the test can be written to reveal the bug I am trying to resolve.

When the integrated assembler is used directly by the compiler, as opposed to parsing textual assembler, it uses a different codepath which passes the operands directly, which skips converting them to/from text. This codepath ends up skipping some kinds of operand validation; we don't expect to be generating invalid instructions.

We do have a method ARMBaseInstrInfo::verifyInstruction which is used by the machine verifier to validate target-specific properties of instructions. It currently only verifies immediates for load/store instructions, not arithmetic, so it doesn't currently catch this. Improvements are definitely welcome, though.

To clarify, if you try to assemble a file containing the line sub r5, #40192 on a Thumb target, it will print an error. The issue you're seeing with an impossible instruction only shows up for instructions generated internally by the compiler itself.

ARMBaseInstrInfo::verifyInstruction can be tested using a MachineIR (MIR) test. See llvm/test/CodeGen/ARM/machine-verifier.mir .

The pseudo-instruction ARM::MOVi32imm seems to be broken.
BuildMI(McrMBB, DL, TII.get(ARM::MOVi32imm), ARM::R4).addImm(0xa0000); generates the following.

c0:   e3e04fff        mvn     r4, #1020       ; 0x3fc
c4:   e24444ff        sub     r4, r4, #-16777216      ; 0xff000000

No existing code is using ARM::MOVi32imm. Is the pseudo-instruction really broken or I misunderstood something?

When we don't have movw/movt, there is no two-instruction sequence to materialize an arbitrary immediate, so we only use ARM::MOVi32imm in very limited cases. See https://github.com/llvm/llvm-project/blob/807e2f12fab52c6abf3e89c02eec0f585b3b8f22/llvm/lib/Target/ARM/ARMInstrInfo.td#L836 and https://github.com/llvm/llvm-project/blob/807e2f12fab52c6abf3e89c02eec0f585b3b8f22/llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp#L970 . (Maybe ARMExpandPseudo::ExpandMOV32BitImm should have an assertion to make this clear.)

ZhiyaoMa98 marked 3 inline comments as done.
ZhiyaoMa98 edited the summary of this revision. (Show Details)
  • Changed t2MOVCCi32imm to unconditional t2MOVi32imm .
  • Updated thumb1 and ARM to read from literal pool.
  • Added tests.

Ping. Is there anything else I should fix?

efriedma accepted this revision.Mar 8 2022, 12:03 PM

LGTM

@john.brawn Any further comment?

@ZhiyaoMa98 I assume you don't have commit access; how do you want to be credited in the "author" line of the git commit?

This revision is now accepted and ready to land.Mar 8 2022, 12:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2022, 12:03 PM

@efriedma
Thanks. Please set my name as "Zhiyao Ma" and my email as "zhiyao.ma.98@gmail.com".

LGTM as well.

This revision was landed with ongoing or failed builds.Mar 10 2022, 3:17 PM
This revision was automatically updated to reflect the committed changes.