ARM Constant Islands would not detect values that
could not be represented by SO immediates,
breaking only during the emission of object
files.
Details
Diff Detail
Event Timeline
llvm/test/CodeGen/Thumb2/LowOverheadLoops/fast-fp-loops.ll | ||
---|---|---|
326 ↗ | (On Diff #281843) | I've not looked into any of the details yet, but what's happening with these changes? I wouldn't expect any of them to be hitting the limit on a adr. |
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
1791–1792 | I guess, should this be true? |
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
1169 | IsSoImm and NegOk arguments are in the wrong order. |
Fix order of arguments in 1 call to isCPEntryInRange()
Supply the optional final argument in another instance of call to isCPEntryInRange(),
for readability.
Thanks. That does fix the problems I was seeing in thumb2 codesize test cases.
I am still unsure about the PCOffset change. Is that for correctness or an optimisation? Or correctness specific to SoImm due to it's non-uniform nature? I would be fairly surprised if we were always getting the limit of a t2ADR wrong and it had not caused problems up to this point. Either way, can we add some tests for it? And if it is just an independent optimization, consider moving it to a separate patch.
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
1010–1013 | long int -> int64_t |
After review feedback, I removed the change to pc offset estimates from this patch.
This patch is now just about fixing the implementation of SO immediates.
Rewrite long int as int64_t.
Constant island pass is notorious for subtle bugs and being difficult to test.
I created this test case that breaks, by iterating over all values of SPACE.
I think this is why Diogo would have added the PCOffset handling. The pass is usually conservative about the ends of the offset ranges. Now there is a point where we need to be exact where the SOImm range changes from being 2byte addressable to 4byte addressable.
Dave
# RUN: llc --filetype=obj -start-before=arm-cp-islands -o - %s | \ # RUN: llvm-objdump --arch=armv8a --disassemble - | FileCheck %s --- | target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64" target triple = "armv8.2a-arm-none-eabi" define dso_local i32 @main() #0 { ret i32 0 } attributes #0 = { "frame-pointer"="all" } !4 = !{i32 210} ... --- name: main alignment: 4 tracksRegLiveness: true constants: - id: 0 value: half 0xH5440 alignment: 2 machineFunctionInfo: {} body: | bb.0 (%ir-block.0): liveins: $lr $sp = frame-setup STMDB_UPD $sp, 14, $noreg, killed $r11, killed $lr frame-setup CFI_INSTRUCTION def_cfa_offset 8 frame-setup CFI_INSTRUCTION offset $lr, -4 frame-setup CFI_INSTRUCTION offset $r11, -8 $r11 = frame-setup MOVr killed $sp, 14, $noreg, $noreg frame-setup CFI_INSTRUCTION def_cfa_register $r11 $sp = frame-setup SUBri killed $sp, 80, 14, $noreg, $noreg renamable $r1 = LEApcrel %const.0, 14, $noreg renamable $r1 = LDRH killed renamable $r1, $noreg, 0, 14, $noreg :: (load 2 from constant-pool) INLINEASM &nop, 1, !4 renamable $r0 = SPACE 1024, undef renamable $r0 $sp = frame-destroy MOVr $r11, 14, $noreg, $noreg $sp = frame-destroy LDMIA_RET $sp, 14, $noreg, def $r11, def $pc, implicit killed $r0 ...
Reinstated the change to pc offset.
This is necessary for ARM-state SO immediates.
Added a test case which tries to place a constant pool after a gap of a specified count of bytes,
or when that is not a valid offset it falls back to inserting the constant pool locally.
This test is invoked several times at significant offsets to demonstrate that the limits are correct.
Thanks. Nice test.
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
286 | We should also include ARM::LEApcrelJT, as they can use SoImm's too. | |
287 | Why 4 and not 8? I think it's meant to be "2 instructions widths" | |
288 | You needn't add t2LEApcrel here. There are lots of opcodes that could be pc-relative (as in initializeFunctionInfo). We only need to include the ones we are being careful about the adjustment - the SoImm codes. |
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
287 | Value of 4 was determined empirically. |
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
287 | You fill me with confidence. And would 8 empirically have problems too? |
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
287 | Values of 0 or 8 cause: error: out of range pc-relative fixup value Why 4 and not 8? |
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
287 | From the ARM reference manual: • When executing an ARM instruction, PC reads as the address of the current instruction plus 8. So it sort of makes sense that we'd need to distinguish between ARM-mode and Thumb-mode... but it's suspicious that this is only getting applied to LEApcrel, and not all ARM-mode instructions. | |
1015 | I'm a little concerned that using getSOImmVal is going to throw off the algorithm. The algorithm is based around there being some range of legal addresses, and moving constants into that range. The current code doesn't generate discontinuities like this, and I'm worried adding them will cause problems for convergence. Can we split this change off from whatever changes are necessary for correctness? It should be conservatively correct to treat an soimm offset as "a multiple of 4 between -1020 and 1020". |
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
1015 |
I take on board your observations about a range of legal addresses. | |
1015 |
The current treatment of an SoImm offset as "a multiple of 4 between -1020 and 1020" is part of the cause of this bug. I will split this patch into smaller parts and resubmit. |
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
287 | Just as insight, I only applied to the one instruction there was a problem for the moment. |
We should also include ARM::LEApcrelJT, as they can use SoImm's too.