Currently ARM backend validates the range of branch targets before the
layout of fragments is finalized. This causes build failure if symbolic
expressions are used, with the exception of a single symbolic value.
For example, "b.w ." works but "b.w . + 2" currently fails to
assemble. This fixes the issue by delaying this check (in
ARMAsmParser::validateInstruction) of b.w instructions until the symbol
expressions are resolved (in ARMAsmBackend::adjustFixupValue).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This is created based on the discussion at https://reviews.llvm.org/D97501#inline-914358.
llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
---|---|---|
7955 | if you don't use the return value from dyn_cast, you should use isa. or rewrite this as two nested ifs: if (auto foo = dyn_cast<...): if (!foo.isSignedOffset<...): return Error(...); |
I made a very similar patch for RISC-V: D92293.
Patching isSignedOffset is fine since it fixes more instructions. It is also fine to restrict the lift on t2B currently as you don't have tests for others.
llvm/test/MC/ARM/thumb2-b.w-target.s | ||
---|---|---|
1 | Use consistent comment marker in a test. | |
5 | No need to say GNU objdump behavior. The specific llvm-objdump tests for the symbolization will do it. | |
11 | start+0xfffff is undeeded detail. There should be specific llvm-objdump tests for the symbolization. | |
llvm/test/MC/ARM/thumb2-branch-ranges.s | ||
101 | [[@LINE]] is deprecated FileCheck syntax. [[#@LINE+1]]:[[#]]: error: ... |
llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
---|---|---|
7954 | Add a period after a complete sentence. |
llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
---|---|---|
7967 | It's definitely not great seeing this problem across multiple different instructions; in particular I suspect that t2Bcc is the same operation as t2B (branch), just with a condition code (cc) and smaller immediate. I still think the callers should defer checks for MCBinaryExpr (as this patch does), but then seeing things like isThumbBranchTarget/isARMBranchTarget makes me nervous that we can't validate MCBinaryExpr operands until after adjusting fixups (very late) for many other instructions. If MCSymbolRefExpr is just the . operator in assembler, and MCBinaryExpr is for expressions, I wonder if isUnsignedOffset, isSignedOffset, isLEOffset, and isThumbMemPC should all return true for MCBinaryExpr? isARMBranchTarget/isThumbBranchTarget default to true (for example for MCSymbolRefExpr and MCBinaryExpr, but the list above do not. And that's curious to me. We can't know if these immediates satisfy some range constraints until after fixup; right now these methods are a mix of what they permit when expressions involving . are used. |
llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
---|---|---|
7967 | And part of that is that these isSignedOffset and friends all return bool, when for MCSymbolRefExpr and MCBinaryExpr operands the answer has a third possible state which is "maybe, but I won't know until adjusting fixups" which is one of the last things the assembler does, without rerunning instruction validation. isUnsignedOffset, isSignedOffset, isLEOffset, and isThumbMemPC all respond "no" conservatively; isARMBranchTarget/isThumbBranchTarget respond "sure" perhaps too eagerly. I suspect given a MCSymbolRefExpr or MCBinaryExpr, isARMBranchTarget/isThumbBranchTarget both respond true, when after fixups the answer perhaps should have been "no." |
llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
---|---|---|
7967 | Agreed, I will take a look at the code and see if we could refactor it for a more permanent solution. Also . gets converted into a symbol like any regular label, so MCSymbolRefExpr covers any expressions with a single label or .. |
Add a period after a complete sentence.