This is an archive of the discontinued LLVM Phabricator instance.

Avoid cross-section branches in AArch64 inline asm
AbandonedPublic

Authored by dhoekwater on Jun 5 2023, 6:17 PM.

Details

Summary

Because inline assembly can define labels and branch to them,
placing basic blocks in a different section may cause the linker
to move the branch out of range when performing section layout.
If the branch is out of range, the binary can fail to link or worse,
the linker will try to relax the branch and clobber the X16 register.

Diff Detail

Event Timeline

dhoekwater created this revision.Jun 5 2023, 6:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2023, 6:17 PM
dhoekwater requested review of this revision.Jun 5 2023, 6:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2023, 6:17 PM
dhoekwater updated this revision to Diff 528658.Jun 5 2023, 6:18 PM

Don't format code we're not touching

dhoekwater updated this revision to Diff 528659.Jun 5 2023, 6:21 PM

Rebase onto main

dhoekwater planned changes to this revision.Jun 5 2023, 6:22 PM

Will add tests tomorrow before sending for review.

efriedma added a subscriber: efriedma.

I'm not sure I understand what you mean by "inline assembly can define labels and branch to them". Any given inline asm can only be in one section, and branching from one inline asm to another is undefined behavior.

If there's some interaction with "asm goto", I'd prefer to check specifically for that.

I'm adding this so that, once enabled, function splitting and basic block sections won't create link failures for AArch64.

If we have two basic blocks: BB1 and BB2 where BB1 contains an inline assembly label and BB2 contains inline assembly that branches to that label, function splitting and fbasic-block-sections=all can both place BB1 and BB2 in different text sections. Ex: https://godbolt.org/z/z8KareGn7. With function splitting, if the beq label1 instruction is rarely taken, label1 will be placed in the .text.split section, while beq label1 remains in .text. If the linker places the two sections more than 1MB away from each other, the program will fail to link because the offset doesn't fit in beq machine code instruction.

gcc documentation explicitly states "asm statements may not perform jumps into other asm statements". I don't think there's an equivalent statement in LangRef, but there are obvious reasons we have to restrict it: even if it appears to work in simple cases, in general it will blow up because the compiler assumes control can't flow that way. For example, the compiler can hoist and sink operations across the inline asm. (This is fundamental to the way we currently represent inline asm in LLVM IR.)

So if a user has code like your example, they should fix their code; it's a ticking time bomb with or without function splitting.

I'm adding this so that, once enabled, function splitting and basic block sections won't create link failures for AArch64.

If we have two basic blocks: BB1 and BB2 where BB1 contains an inline assembly label and BB2 contains inline assembly that branches to that label, function splitting and fbasic-block-sections=all can both place BB1 and BB2 in different text sections. Ex: https://godbolt.org/z/z8KareGn7. With function splitting, if the beq label1 instruction is rarely taken, label1 will be placed in the .text.split section, while beq label1 remains in .text. If the linker places the two sections more than 1MB away from each other, the program will fail to link because the offset doesn't fit in beq machine code instruction.

Perhaps the label should be defined in C, then asm goto should be used for the first inline asm statement: https://godbolt.org/z/4fMT4cWPG

--- /tmp/x.c	2023-06-06 13:29:37.883592223 -0700
+++ /tmp/y.c	2023-06-06 13:29:44.555615299 -0700
@@ -3,13 +3,14 @@
 __attribute__((noinline))
 int abs_diff(volatile int a, volatile int b) {
     int result = -1;
-    asm volatile (
+    asm goto (
             "cmp %w0, %w1\n"      // Compare the values of a and b
-            "beq label1\n"         // Branch if a == b
+            "beq %l3\n"           // Branch if a == b
             "sub %w2, %w0, %w1\n" // Calculates a - b
             : "=r" (result)
             : "r" (a), "r" (b)
             : "cc"
+            : label1
         );
     if (a > b) {
         result = a - b;
@@ -17,8 +18,8 @@
     if (a < b) {
         result = b - a;
     }
+label1: // Provide a target for the other inline asm
     asm volatile (
-        "label1:\n"   // Provide a target for the other inline asm
         "mov %w0, #0\n"
         : "=r" (result) 
         :

Then it's obvious to the compiler that label1 has its address taken, and probably shouldn't be moved around (by whatever basic-block-sections=all is)?

nickdesaulniers requested changes to this revision.Jun 6 2023, 1:33 PM

Perhaps the label should be defined in C, then asm goto should be used for the first inline asm statement

Sure, but we don't need to disable function splitting and bb sections to handle that case. The distance between an "asm goto" and its destination doesn't matter; we can apply branch relaxation like we would for normal conditional branches.

Perhaps the label should be defined in C, then asm goto should be used for the first inline asm statement

Sure, but we don't need to disable function splitting and bb sections to handle that case. The distance between an "asm goto" and its destination doesn't matter; we can apply branch relaxation like we would for normal conditional branches.

Right; so we're in agreement the code in question (outside of LLVM) should be rewritten.

dhoekwater added a comment.EditedJun 6 2023, 8:01 PM

Sure, but we don't need to disable function splitting and bb sections to handle that case. The distance between an "asm goto" and its destination doesn't matter; we can apply branch relaxation like we would for normal conditional branches.

Ah, my mistake. My main understanding of inline assembly was that LLVM treats it as a black box and for that reason we couldn't make any correctness guarantees about cross-section branches. I chatted with @nickdesaulniers; he helped clarify things and pointed me to inline assembly documentation. Here are some better examples: https://godbolt.org/z/9fv1TYs37.

asm_goto_conditional may fail to link if bar is placed in a different section than the entry block because "A linker may use a veneer to implement a relocated branch if the relocation is either R_<CLS>_CALL26, R_<CLS>_JUMP26, or R_<CLS>_PLT32 (...) In all other cases a linker shall diagnose an error if relocation cannot be effected without a veneer." In this context, the limitation means that an inline assembly statement cannot rely on conditional branches to Goto Labels, as the compiler or linker may place the blocks such that the branch target is out of range. This is already the case, as the compiler is free to rearrange instructions, insert padding, etc. (https://godbolt.org/z/WbzY9xsxY demonstrates this without function splitting). However, MFS is much more likely to expose inline assembly that misuses conditional jumps because it's more likely that the Goto Label will be out of B.cond's range if it's in a different text section. (The same limitation makes conditional tail calls to functions in a different section generate very suboptimal assembly, but that's a separate conversation.)

asm_goto_unconditional is similar, but the onus on the inline assembly user is less strict. If an inline asm statement contains an unconditional branch to a Goto Label, it must clobber X16 and X17 or else the branch may be out of range (for the same reason as with conditional branches), and the linker will overwrite the value of X16, which may be needed after the branch. https://godbolt.org/z/srjYqbE4Y demonstrates this without function splitting.

So all this to say: I agree that this patch isn't needed for Basic Block Sections or Machine Function Splitting. Every AArch64 inline assembly statement that branches to a Goto Label should do so unconditionally and should clobber X16 and X17.

dhoekwater abandoned this revision.Jun 22 2023, 10:38 AM

Dropping for the time being seeing as valid inline assembly shouldn't cause any problems for MFS.