This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
This is the clang side. The llvm version is in D53765
Differential D56571
[RFC prototype] Implementation of asm-goto support in clang
Authored by jyu2 on Jan 10 2019, 5:25 PM.
Also of note: glib, a super popular library used across many Linux distros, will benefit from the implementation of asm goto (I will send them a patch to fix their version detection once we land in Clang): https://github.com/GNOME/glib/blob/1ba843b8a0585f20438d5617521f247071c6d94c/glib/gbitlock.c#L181

Commandeering so I can update to match an IRBuilder interface change in the llvm patch.

Pass FunctionType into IRBuilder::CreateCallBr to avoid needing to make an opaque pointer update later.
 
Forgot about protected scopes... This patch is missing code and testcases for JumpScopeChecker. (The behavior should be roughly equivalent to what we do for indirect goto.)

1> Add code for scope checking
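To make the scope-checking requirement concrete, here is a minimal sketch in C. It assumes the check treats the label operands of an asm goto like the possible targets of an indirect goto, as suggested above. The function names are hypothetical, and the asm template is left empty so the snippet is not tied to x86; an empty template never jumps, so only the fallthrough path actually runs.

```c
/* Hypothetical example: the label lives in the same scope as the
 * asm goto, so no protected scope is entered and the jump is legal. */
static int jump_within_scope(int n)
{
        /* Empty template: control always falls through, but the
         * compiler must still treat `out` as a possible target. */
        __asm__ goto("" : : : : out);
        n += 1;
out:
        return n;
}

/*
 * Assumed to be diagnosed once JumpScopeChecker handles asm goto,
 * because the jump would bypass the declaration of a variable length
 * array (kept in a comment so this file still compiles):
 *
 *   static int jump_into_vla_scope(int n)
 *   {
 *           __asm__ goto("" : : : : inside);
 *           int vla[n];
 *   inside:
 *           return (int)sizeof(vla);
 *   }
 */
```

Calling jump_within_scope(1) takes the fallthrough path and returns the incremented value; the commented-out function is the kind of case the new testcases would need to cover.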
 
 
 
 
 
 
1> Emit a better error for use of output constraints.
 
Not sure if this is the fault of the LLVM half or the Clang half, but I'm seeing mis-compilations in the current patches (llvm ca1e713fdd4fab5273b36ba6f292a844fca4cb2d with D53765.185490 and clang 01879634f01bdbfac4636ebe03b68e85b20cd664 with D56571.185489). My earlier builds were okay (llvm b1650507d25d28a03f30626843b7b133796597b4 with D53765.183738 and clang 61738985ebe78eeff6cfae7f97543d3456bac25a with D56571.181973). I narrowed the failure down to the kernel's move_addr_to_user() function:

static int move_addr_to_user(struct sockaddr_storage *kaddr, int klen,
                             void __user *uaddr, int __user *ulen)
{
        int err;
        int len;
        BUG_ON(klen > sizeof(struct sockaddr_storage));
        err = get_user(len, ulen);
        if (err)
                return err;
        if (len > klen)
                len = klen;
        if (len < 0)
                return -EINVAL;
        if (len) {
                if (audit_sockaddr(klen, kaddr))
                        return -ENOMEM;
                if (copy_to_user(uaddr, kaddr, len))
                        return -EFAULT;
        }
        return __put_user(klen, ulen);
}

Working build produces:

ffffffff81c217a0 <move_addr_to_user>:
ffffffff81c217a0: e8 5b fc 3d 00   callq ffffffff82001400 <__fentry__>
ffffffff81c217a5: 81 fe 81 00 00 00   cmp $0x81,%esi
ffffffff81c217ab: 0f 83 c8 00 00 00   jae ffffffff81c21879 <move_addr_to_user+0xd9>
ffffffff81c217b1: 55   push %rbp
ffffffff81c217b2: 48 89 e5   mov %rsp,%rbp
ffffffff81c217b5: 41 57   push %r15
ffffffff81c217b7: 41 56   push %r14
ffffffff81c217b9: 41 55   push %r13
ffffffff81c217bb: 41 54   push %r12
ffffffff81c217bd: 53   push %rbx
ffffffff81c217be: 49 89 ce   mov %rcx,%r14
ffffffff81c217c1: 49 89 d7   mov %rdx,%r15
ffffffff81c217c4: 41 89 f5   mov %esi,%r13d
ffffffff81c217c7: 49 89 fc   mov %rdi,%r12
ffffffff81c217ca: 48 c7 c7 36 35 88 82   mov $0xffffffff82883536,%rdi
ffffffff81c217d1: be da 00 00 00   mov $0xda,%esi
ffffffff81c217d6: e8 25 8b 5c ff   callq ffffffff811ea300 <__might_fault>
ffffffff81c217db: 4c 89 f0   mov %r14,%rax
ffffffff81c217de: e8 2d f6 2f 00   callq ffffffff81f20e10 <__get_user_4>
ffffffff81c217e3: 85 c0   test %eax,%eax
ffffffff81c217e5: 75 68   jne ffffffff81c2184f <move_addr_to_user+0xaf>
ffffffff81c217e7: 48 89 d3   mov %rdx,%rbx
ffffffff81c217ea: 44 39 eb   cmp %r13d,%ebx
ffffffff81c217ed: 41 0f 4f dd   cmovg %r13d,%ebx
ffffffff81c217f1: 85 db   test %ebx,%ebx
ffffffff81c217f3: 78 65   js ffffffff81c2185a <move_addr_to_user+0xba>
ffffffff81c217f5: 74 48   je ffffffff81c2183f <move_addr_to_user+0x9f>
ffffffff81c217f7: 65 48 8b 04 25 00 4e   mov %gs:0x14e00,%rax
ffffffff81c217fe: 01 00
ffffffff81c21800: 48 8b 80 38 07 00 00   mov 0x738(%rax),%rax
ffffffff81c21807: 48 85 c0   test %rax,%rax
ffffffff81c2180a: 74 05   je ffffffff81c21811 <move_addr_to_user+0x71>
ffffffff81c2180c: 83 38 00   cmpl $0x0,(%rax)
ffffffff81c2180f: 74 50   je ffffffff81c21861 <move_addr_to_user+0xc1>
ffffffff81c21811: 48 63 db   movslq %ebx,%rbx
ffffffff81c21814: 4c 89 e7   mov %r12,%rdi
ffffffff81c21817: 48 89 de   mov %rbx,%rsi
ffffffff81c2181a: ba 01 00 00 00   mov $0x1,%edx
ffffffff81c2181f: e8 3c 75 5f ff   callq ffffffff81218d60 <__check_object_size>
ffffffff81c21824: 4c 89 ff   mov %r15,%rdi
ffffffff81c21827: 4c 89 e6   mov %r12,%rsi
ffffffff81c2182a: 48 89 da   mov %rbx,%rdx
ffffffff81c2182d: e8 ae 84 99 ff   callq ffffffff815b9ce0 <_copy_to_user>
ffffffff81c21832: 48 89 c1   mov %rax,%rcx
ffffffff81c21835: b8 f2 ff ff ff   mov $0xfffffff2,%eax
ffffffff81c2183a: 48 85 c9   test %rcx,%rcx
ffffffff81c2183d: 75 10   jne ffffffff81c2184f <move_addr_to_user+0xaf>
ffffffff81c2183f: 90   nop
ffffffff81c21840: 90   nop
ffffffff81c21841: 90   nop
ffffffff81c21842: b8 f2 ff ff ff   mov $0xfffffff2,%eax
ffffffff81c21847: 45 89 2e   mov %r13d,(%r14)
ffffffff81c2184a: 31 c0   xor %eax,%eax
ffffffff81c2184c: 90   nop
ffffffff81c2184d: 90   nop
ffffffff81c2184e: 90   nop
ffffffff81c2184f: 5b   pop %rbx
ffffffff81c21850: 41 5c   pop %r12
ffffffff81c21852: 41 5d   pop %r13
ffffffff81c21854: 41 5e   pop %r14
ffffffff81c21856: 41 5f   pop %r15
ffffffff81c21858: 5d   pop %rbp
ffffffff81c21859: c3   retq
ffffffff81c2185a: b8 ea ff ff ff   mov $0xffffffea,%eax
ffffffff81c2185f: eb ee   jmp ffffffff81c2184f <move_addr_to_user+0xaf>
ffffffff81c21861: 44 89 ef   mov %r13d,%edi
ffffffff81c21864: 4c 89 e6   mov %r12,%rsi
ffffffff81c21867: e8 f4 1c 52 ff   callq ffffffff81143560 <__audit_sockaddr>
ffffffff81c2186c: 89 c1   mov %eax,%ecx
ffffffff81c2186e: b8 f4 ff ff ff   mov $0xfffffff4,%eax
ffffffff81c21873: 85 c9   test %ecx,%ecx
ffffffff81c21875: 75 d8   jne ffffffff81c2184f <move_addr_to_user+0xaf>
ffffffff81c21877: eb 98   jmp ffffffff81c21811 <move_addr_to_user+0x71>
ffffffff81c21879: 0f 0b   ud2
ffffffff81c2187b: 0f 1f 44 00 00   nopl 0x0(%rax,%rax,1)

The bad compilation produces:

ffffffff81c21700 <move_addr_to_user>:
ffffffff81c21700: e8 fb fc 3d 00   callq ffffffff82001400 <__fentry__>
ffffffff81c21705: 81 fe 81 00 00 00   cmp $0x81,%esi
ffffffff81c2170b: 0f 83 cc 00 00 00   jae ffffffff81c217dd <move_addr_to_user+0xdd>
ffffffff81c21711: 55   push %rbp
ffffffff81c21712: 48 89 e5   mov %rsp,%rbp
ffffffff81c21715: 41 57   push %r15
ffffffff81c21717: 41 56   push %r14
ffffffff81c21719: 41 55   push %r13
ffffffff81c2171b: 41 54   push %r12
ffffffff81c2171d: 53   push %rbx
ffffffff81c2171e: 49 89 ce   mov %rcx,%r14
ffffffff81c21721: 49 89 d7   mov %rdx,%r15
ffffffff81c21724: 41 89 f5   mov %esi,%r13d
ffffffff81c21727: 49 89 fc   mov %rdi,%r12
ffffffff81c2172a: 48 c7 c7 36 35 88 82   mov $0xffffffff82883536,%rdi
ffffffff81c21731: be da 00 00 00   mov $0xda,%esi
ffffffff81c21736: e8 a5 8b 5c ff   callq ffffffff811ea2e0 <__might_fault>
ffffffff81c2173b: 4c 89 f0   mov %r14,%rax
ffffffff81c2173e: e8 bd f5 2f 00   callq ffffffff81f20d00 <__get_user_4>
ffffffff81c21743: 85 c0   test %eax,%eax
ffffffff81c21745: 0f 85 87 00 00 00   jne ffffffff81c217d2 <move_addr_to_user+0xd2>
ffffffff81c2174b: 48 89 d3   mov %rdx,%rbx
ffffffff81c2174e: 44 39 eb   cmp %r13d,%ebx
ffffffff81c21751: 41 0f 4f dd   cmovg %r13d,%ebx
ffffffff81c21755: 85 db   test %ebx,%ebx
ffffffff81c21757: 78 57   js ffffffff81c217b0 <move_addr_to_user+0xb0>
ffffffff81c21759: 74 48   je ffffffff81c217a3 <move_addr_to_user+0xa3>
ffffffff81c2175b: 65 48 8b 04 25 00 4e   mov %gs:0x14e00,%rax
ffffffff81c21762: 01 00
ffffffff81c21764: 48 8b 80 38 07 00 00   mov 0x738(%rax),%rax
ffffffff81c2176b: 48 85 c0   test %rax,%rax
ffffffff81c2176e: 74 05   je ffffffff81c21775 <move_addr_to_user+0x75>
ffffffff81c21770: 83 38 00   cmpl $0x0,(%rax)
ffffffff81c21773: 74 42   je ffffffff81c217b7 <move_addr_to_user+0xb7>
ffffffff81c21775: 48 63 db   movslq %ebx,%rbx
ffffffff81c21778: 4c 89 e7   mov %r12,%rdi
ffffffff81c2177b: 48 89 de   mov %rbx,%rsi
ffffffff81c2177e: ba 01 00 00 00   mov $0x1,%edx
ffffffff81c21783: e8 b8 75 5f ff   callq ffffffff81218d40 <__check_object_size>
ffffffff81c21788: 4c 89 ff   mov %r15,%rdi
ffffffff81c2178b: 4c 89 e6   mov %r12,%rsi
ffffffff81c2178e: 48 89 da   mov %rbx,%rdx
ffffffff81c21791: e8 4a 85 99 ff   callq ffffffff815b9ce0 <_copy_to_user>
ffffffff81c21796: 48 89 c1   mov %rax,%rcx
ffffffff81c21799: b8 f2 ff ff ff   mov $0xfffffff2,%eax
ffffffff81c2179e: 48 85 c9   test %rcx,%rcx
ffffffff81c217a1: 75 2f   jne ffffffff81c217d2 <move_addr_to_user+0xd2>
ffffffff81c217a3: 90   nop
ffffffff81c217a4: 90   nop
ffffffff81c217a5: 90   nop
ffffffff81c217a6: b8 f2 ff ff ff   mov $0xfffffff2,%eax
ffffffff81c217ab: 45 89 2e   mov %r13d,(%r14)
ffffffff81c217ae: 31 c0   xor %eax,%eax
ffffffff81c217b0: b8 ea ff ff ff   mov $0xffffffea,%eax
ffffffff81c217b5: eb 1b   jmp ffffffff81c217d2 <move_addr_to_user+0xd2>
ffffffff81c217b7: 44 89 ef   mov %r13d,%edi
ffffffff81c217ba: 4c 89 e6   mov %r12,%rsi
ffffffff81c217bd: e8 9e 1d 52 ff   callq ffffffff81143560 <__audit_sockaddr>
ffffffff81c217c2: 89 c1   mov %eax,%ecx
ffffffff81c217c4: b8 f4 ff ff ff   mov $0xfffffff4,%eax
ffffffff81c217c9: 85 c9   test %ecx,%ecx
ffffffff81c217cb: 75 05   jne ffffffff81c217d2 <move_addr_to_user+0xd2>
ffffffff81c217cd: eb a6   jmp ffffffff81c21775 <move_addr_to_user+0x75>
ffffffff81c217cf: 90   nop
ffffffff81c217d0: 90   nop
ffffffff81c217d1: 90   nop
ffffffff81c217d2: 5b   pop %rbx
ffffffff81c217d3: 41 5c   pop %r12
ffffffff81c217d5: 41 5d   pop %r13
ffffffff81c217d7: 41 5e   pop %r14
ffffffff81c217d9: 41 5f   pop %r15
ffffffff81c217db: 5d   pop %rbp
ffffffff81c217dc: c3   retq
ffffffff81c217dd: 0f 0b   ud2
ffffffff81c217df: 90   nop

The bad sequence starts at the je following the js (at ffffffff81c217f5 in the good build and at ffffffff81c21759 in the bad build). In the good build, the jump leads to:

ffffffff81c2183f: 90   nop
ffffffff81c21840: 90   nop
ffffffff81c21841: 90   nop
ffffffff81c21842: b8 f2 ff ff ff   mov $0xfffffff2,%eax
ffffffff81c21847: 45 89 2e   mov %r13d,(%r14)
ffffffff81c2184a: 31 c0   xor %eax,%eax
ffffffff81c2184c: 90   nop
ffffffff81c2184d: 90   nop
ffffffff81c2184e: 90   nop
ffffffff81c2184f: 5b   pop %rbx
ffffffff81c21850: 41 5c   pop %r12
...

In the bad build, the mov, mov, xor block does not fall through to function exit; instead it overwrites %eax before jumping to function exit:

ffffffff81c217a3: 90   nop
ffffffff81c217a4: 90   nop
ffffffff81c217a5: 90   nop
ffffffff81c217a6: b8 f2 ff ff ff   mov $0xfffffff2,%eax
ffffffff81c217ab: 45 89 2e   mov %r13d,(%r14)
ffffffff81c217ae: 31 c0   xor %eax,%eax
ffffffff81c217b0: b8 ea ff ff ff   mov $0xffffffea,%eax
ffffffff81c217b5: eb 1b   jmp ffffffff81c217d2 <move_addr_to_user+0xd2>
...
ffffffff81c217d2: 5b   pop %rbx
ffffffff81c217d3: 41 5c   pop %r12
...

I've not been able to narrow this down further -- most changes to the C code cause the behavior to vanish, so I assume some optimization pass or something is moving things around to avoid the bug.

I reduced the C code to this:

volatile int mystery = 0;
static int noinline demo(int klen)
{
        int err;
        int len;
        err = mystery * klen;
        if (err)
                return err;
        if (len > klen)
                len = klen;
        if (len < 0)
                return -EINVAL;
        if (len)
                return -ENOMEM;
        return __put_user(klen, (int __user *)NULL);
}

Something in the compilation of the asm-goto in __put_user() broke here.
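For context, the general control-flow shape of an asm goto in a __put_user()-style helper can be sketched as below. This is an illustration only, not the kernel's implementation: put_user_sketch and demo_sketch are hypothetical names, the asm template is empty (so the fault label is never actually taken), and a plain store stands in for the user-space access.

```c
#include <errno.h>

/* Hypothetical stand-in for a __put_user()-style helper. */
static int put_user_sketch(int value, int *dst)
{
        *dst = value;   /* plain store standing in for the user access */

        /* The last operand list names the labels the asm may branch to.
         * With an empty template nothing jumps, so execution always
         * continues on the fallthrough path below. */
        __asm__ goto("" : : : "memory" : fault);
        return 0;       /* fallthrough: success */
fault:
        return -EFAULT; /* only reached if the asm jumps to the label */
}

/* Mirrors the tail of the reduced test case: the early error returns
 * must stay separate from the helper's fallthrough result. */
static int demo_sketch(int klen, int len, int *dst)
{
        if (len > klen)
                len = klen;
        if (len < 0)
                return -EINVAL;
        if (len)
                return -ENOMEM;
        return put_user_sketch(klen, dst);
}
```

With len equal to 0 the call reaches the helper and should return 0 from the fallthrough path; the miscompilation described above corresponds to that path running on into the code that loads -EINVAL.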
 
 
Remove check for %lN
 
 
I found an ambiguous error for asm-goto out of scope. Let me know if you see any problems. Thanks for all the reviews.

I think this is in good shape.

Review comments addressed.
 
 
Should the langref be updated (specifically the section on blockaddress)? It currently says: "This value only has defined behavior when used as an operand to the ‘indirectbr’ instruction, or for comparisons against null." https://reviews.llvm.org/D53765 touched the langref, but I think the blockaddress section can be cleaned up.

@rsmith, @efriedma, @chandlerc, is this in good enough shape that we can commit and move to incremental fixes/improvements? Jennifer has had to rebase this a couple of times, which is making things hard for the folks testing this with the Linux kernel and needing to apply this patch themselves.

I'm fine merging in this state... but please wait for rsmith to respond before you merge.

If we can get inlining of callbr working first before landing this, then all of our CI won't go immediately red (as it would from landing this first, then getting inlining support). Otherwise, we'll have to come up with hacks in the kernel to keep the CI green (I have one for x86. The one for aarch64 quickly became infeasible). I'm anxious for this to land, and the rebases are work indeed. But if we can just wait a little longer and get inlining support working, then land this, it will be much less painful than the other way around. I'm focused on inlining support in https://reviews.llvm.org/D58260 (which doesn't look like much now, but I have changes locally and have been actively working on it this week).

Note that with:
 inline-asm according to getBooleanType") 
 blockaddresses if sole uses are callbrs") 
 asm-goto support in clang")
I can compile a mainline x86 defconfig Linux kernel and boot it in QEMU.

This code:

; ModuleID = 'arch_static_branch.bc'
source_filename = "arch/x86/entry/vsyscall/vsyscall_64.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"
%struct.atomic_t = type { i32 }
%struct.jump_entry = type { i64, i64, i64 }
%struct.tracepoint_func = type { i8*, i8*, i32 }
%struct.static_key = type { %struct.atomic_t, %union.anon }
%union.anon = type { i64 }
@__tracepoint_emulate_vsyscall = external hidden global { i8*, { %struct.atomic_t, { %struct.jump_entry* } }, i32 ()*, void ()*, %struct.tracepoint_func* }, section "__tracepoints", align 8
; Function Attrs: alwaysinline noredzone nounwind sspstrong
define zeroext i1 @arch_static_branch(%struct.static_key* nocapture readnone %key, i1 zeroext %branch) {
entry:
  callbr void asm sideeffect "1:.byte 0x0f,0x1f,0x44,0x00,0\0A\09.pushsection __jump_table,  \22aw\22 \0A\09 .balign 8 \0A\09 .quad 1b, ${2:l}, ${0:c} + ${1:c} \0A\09.popsection \0A\09", "i,i,X,~{dirflag},~{fpsr},~{flags}"(%struct.static_key* bitcast (i32* getelementptr inbounds ({ i8*, { %struct.atomic_t, { %struct.jump_entry* } }, i32 ()*, void ()*, %struct.tracepoint_func* }, { i8*, { %struct.atomic_t, { %struct.jump_entry* } }, i32 ()*, void ()*, %struct.tracepoint_func* }* @__tracepoint_emulate_vsyscall, i64 0, i32 1, i32 0, i32 0) to %struct.static_key*), i1 false, i8* blockaddress(@arch_static_branch, %return))
          to label %asm.fallthrough [label %return]
asm.fallthrough:                                  ; preds = %entry
  call void asm sideeffect "", "~{dirflag},~{fpsr},~{flags}"()
  br label %return
return:                                           ; preds = %asm.fallthrough, %entry
  %retval.0 = phi i1 [ false, %asm.fallthrough ], [ true, %entry ]
  ret i1 %retval.0
}

gives me this with the asm-goto patches:

$ clang -o /dev/null arch_static_branch.ll
warning: overriding the module target triple with x86_64-unknown-linux-gnu [-Woverride-module]
<inline asm>:4:15: error: expected a symbol reference in '.quad' directive
 .quad 1b, "", __tracepoint_emulate_vsyscall+8 + 0
              ^
error: cannot compile inline asm
1 warning and 1 error generated.

I think things don't work right unless you disable the integrated assembler. I'm not sure why though.

Using -no-integrated-as does allow it to compile, but it doesn't link (with ld.lld) if LTO is specified. Is there an equivalent flag / plugin-opt for lld?

There shouldn't be an empty string ("") in the asm output -- that should be a reference to the "l_yes" label, not the empty string. That seems very weird... Even odder: running clang -S on the above without -fno-integrated-as emits a ".quad .Ltmp00" (note the extra "0"!), while with -fno-integrated-as, it properly refers to ".Ltmp0".

The changes in ASTImporter.cpp look good to me!

Hi, First, thanks for working on this. I am the author of the "rseq" system call in the Linux kernel, and the user-space code required to interact with that system call requires asm goto. I therefore look forward to getting asm goto support in clang. I tried this patch on top of current clang on the following Linux kernel rseq selftests, and it fails to build. Hopefully this feedback can help you improve the current implementation.
The kernel rseq selftests can be found at: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/rseq

from the kernel source tree (on x86-64):

make headers_install
cd tools/testing/selftests/rseq
make CC=/path/to/clang

build output:

/home/efficios/git/llvm-project/build/bin/clang -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ -shared -fPIC rseq.c -lpthread -o /home/efficios/git/linux/tools/testing/selftests/rseq/librseq.so
/home/efficios/git/llvm-project/build/bin/clang -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ basic_test.c -lpthread -lrseq -o /home/efficios/git/linux/tools/testing/selftests/rseq/basic_test
/home/efficios/git/llvm-project/build/bin/clang -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ basic_percpu_ops_test.c -lpthread -lrseq -o /home/efficios/git/linux/tools/testing/selftests/rseq/basic_percpu_ops_test
In file included from basic_percpu_ops_test.c:12:
In file included from ./rseq.h:71:
./rseq-x86.h:90:26: error: expected a symbol reference
                "cmpq %[v], %[expect]\n\t"
                                       ^
<inline asm>:13:8: note: instantiated into assembly here
        jnz ""
              ^
In file included from basic_percpu_ops_test.c:12:
In file included from ./rseq.h:71:
./rseq-x86.h:102:3: error: expected a symbol reference
                RSEQ_ASM_DEFINE_ABORT(4, "", abort)
                ^
./rseq-x86.h:67:25: note: expanded from macro 'RSEQ_ASM_DEFINE_ABORT'
                __rseq_str(label) ":\n\t"                               \
                                      ^
<inline asm>:20:8: note: instantiated into assembly here
        jmp ""
              ^
In file included from basic_percpu_ops_test.c:12:
In file included from ./rseq.h:71:
./rseq-x86.h:148:30: error: expected a symbol reference
                "cmpq %%rbx, %[expectnot]\n\t"
                                           ^
<inline asm>:14:7: note: instantiated into assembly here
        je ""
             ^
In file included from basic_percpu_ops_test.c:12:
In file included from ./rseq.h:71:
./rseq-x86.h:164:3: error: expected a symbol reference
                RSEQ_ASM_DEFINE_ABORT(4, "", abort)
                ^
./rseq-x86.h:67:25: note: expanded from macro 'RSEQ_ASM_DEFINE_ABORT'
                __rseq_str(label) ":\n\t"                               \
                                      ^
<inline asm>:24:8: note: instantiated into assembly here
        jmp ""
              ^
4 errors generated.
Makefile:22: recipe for target '/home/efficios/git/linux/tools/testing/selftests/rseq/basic_percpu_ops_test' failed
make: *** [/home/efficios/git/linux/tools/testing/selftests/rseq/basic_percpu_ops_test] Error 1

@compudj email me the preprocessed output of basic_percpu_ops_test.c and I'll take a look. (Should be able to find my email via git log).

@nickdesaulniers, thanks for looking into this. It looks like the error comes from: llvm/lib/MC/MCParser/AsmParser.cpp:1118. Please let me know if that is a clang problem. Thanks.

It's an orthogonal issue with Clang's integrated assembler. @compudj confirmed that setting -no-integrated-as solves the issue (which is what we do throughout the kernel, except in a few places mostly by accident). I'm following up with @compudj off thread. If we can get help w/ code review of: Then I think we'll have everything we need to land this patch, for the Linux kernel's consumption.

Ok, from the Linux kernel's perspective, I believe we have worked out all underlying issues in LLVM wrt callbr that would prevent us from using asm goto successfully. This patch now has my blessing; thanks for all the hard work that went into this large feature. Please wait for a final LGTM from @rsmith.

Looks good with a few largely-mechanical changes.
 
Was this committed accidentally? Today in master I see:
 and yet this Phabricator review is still open. It may be easier to rebase this into a monorepo checkout, then commit with git llvm push. As @kees noted, new test files that contain x86 assembly need a -triple that specifies an x86 target; otherwise the implicit target is based on the host, which may not be x86. We still want to verify that a non-x86 host can cross compile to x86 correctly. Sorry we did not catch this earlier in code review.

jyu2 committed rG954ec09aed4f: clang support gnu asm goto

Thanks Kees and Nick! I am currently working on those tests to see if I can make them more generic. All my tests use x86 instructions, so they do not work for arm or arm64 targets. I will update you once I am done with this.

jyu2 committed rGb8fee677bf8e: Re-check in clang support gun asm goto after fixing tests. (authored by jyu2).

There is a bug.. I took GCC’s example for asm goto and trunk emits: Wrong jb instruction target..

I think Jennifer just forgot the Differential Revision line, so this was never closed. The test failures resulted in her wanting the patch reverted until she could fix it (hence my revert).

Hmm... I don't see the jb instruction target name reflected in the IR, so I suspect @craig.topper should be made aware of the discrepancy (author of https://reviews.llvm.org/D53765).

There's still something weird in the backend, but things seem to generally work if you pass -fno-integrated-as to clang, which the Linux kernel does.

For future travelers, we're tracking 2 related bugs:
Ugh. This cast violates strict aliasing, and even in the absence of strict aliasing won't work unless the Stmt base class is at offset 0 in T. The preceding assert is also wrong, as it's asserting that *I is an Expr, not that it's a T.
After r352925, you can use CastIterator<AddrLabelExpr> instead; that should substantially reduce the churn in this patch.