Inline asm doesn't use labels when compiled as an object file. Therefore, we
shouldn't create one for the (potential) callbr destination. Instead, use the
symbol for the MachineBasicBlock.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
test/CodeGen/X86/callbr-asm-obj-file.ll | ||
---|---|---|
3 ↗ | (On Diff #210433) | ; RUN: |
I'm curious too, what exactly this solves, maybe a test will elucidate.
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
435 ↗ | (On Diff #210430) | range based for? for (const MachineBasicBlock& MBB : MF) ... might even be able to replace MF with the expression above and still fit on 80 lines. |
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
433 ↗ | (On Diff #210433) | getBasicBlock() would be a more explicit method to call, preferable to getOperand(1). |
Fix syntax errors and use better accessor methods.
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
433 ↗ | (On Diff #210433) | Good call! :-) |
435 ↗ | (On Diff #210430) | I tried that, but never was able to get the compiler to believe that MF had a "begin()" and "end()". Eventually just gave up... I'll try again though. |
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
435 ↗ | (On Diff #210430) | maybe *MF? |
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
435 ↗ | (On Diff #210430) | That gives me this error *sigh*: error: use of deleted function ‘llvm::MachineBasicBlock::MachineBasicBlock(const llvm::MachineBasicBlock&)’ |
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
435 ↗ | (On Diff #210430) | That sounds like you forgot to make the iteration variable a reference, so a deleted copy constructor is being invoked. diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp index 7721e996aca5..692877bdeb84 100644 --- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp +++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp @@ -429,6 +429,10 @@ static void EmitGCCInlineAsmStr(const char *AsmStr, const MachineInstr *MI, // PrintAsmOperand? if (Modifier[0] == 'l') { // Labels are target independent. if (MI->getOperand(OpNo).isBlockAddress()) { + const MachineFunction *MF = MI->getParent()->getParent(); + for (const MachineBasicBlock& MBB : *MF) { + errs() << "XXX\n"; + } compiles just fine for me. |
test/CodeGen/X86/callbr-asm-obj-file.ll | ||
---|---|---|
7 ↗ | (On Diff #210453) | Can you add ; CHECK-NOT: cases so I get a better idea of what LLVM used to print before this change? |
test/CodeGen/X86/callbr-asm-obj-file.ll | ||
---|---|---|
7 ↗ | (On Diff #210453) | It used to emit an error with no output. |
test/CodeGen/X86/callbr-asm-obj-file.ll | ||
---|---|---|
1 ↗ | (On Diff #210453) | is -pc-linux actually recognized? I would have expected x86_64-linux-gnu? |
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
436 ↗ | (On Diff #210466) | I'm not sure that either side of this if are being tested/covered by your new test in callbr-asm-obj-file.ll. In particular, the case that worries me is passing the address of a label as an input to asm goto. Consider: asm goto(".quad %l0\n\t.quad %l1" :: "i"(&&baz) :: bar); bar:; baz:; where 2 blockaddresses are parameters to a callbr, one is the address of a label, one is the jump target list. I'd be more comfortable signing off on code review if that case was also tested here (and considered in further changes related to callbr) |
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
436 ↗ | (On Diff #210466) | We need to get clang to accept your code snippet first. :-) I'm not sure I understand what you're saying here. What are you worried will be the outcome? Despite the looping and whatnot, what this change does is fairly straightforward: instead of getting the IR basic block and creating a label for it, it uses the machine basic block's label directly. Similar to the code on line 436-437. |
lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
436 ↗ | (On Diff #210466) | Sorry, maybe a better example: int foo(void) { asm goto(".quad %l0\n\t.quad %l1" :: "i"(&&baz) :: bar); baz: return 1; bar: return -1; } before your patch clang -O2 x.c -c -S produces: foo: # @foo # %bb.0: movl $1, %eax #APP .quad .Ltmp0 .quad .Ltmp1 #NO_APP jmp .LBB0_2 .Ltmp1: # Block address taken .LBB0_1: movl $-1, %eax .Ltmp0: # Block address taken .LBB0_2: retq after: foo: # @foo # %bb.0: movl $1, %eax #APP .quad .LBB0_2 .quad .LBB0_1 #NO_APP jmp .LBB0_2 .Ltmp0: # Block address taken .LBB0_1: movl $-1, %eax .Ltmp1: # Block address taken .LBB0_2: retq So that looks like an improvement (and there's more we can do in a follow up to be better). Before your patch clang -O2 x.c -c fails outright, with it, it succeeds and the disassembly looks good. Ok, sorry for the noise, LGTM. I think it would be good for us further to not emit the .LtmpX labels, since they both point to the same location. |
Thanks! Though your example pointed out a separate problem: not accepting the label as a valid operand. I'll do some work on that.
So two regressions were immediately reported:
- https://github.com/ClangBuiltLinux/linux/issues/614
- https://github.com/ClangBuiltLinux/linux/issues/615
so honestly I'd rather revert it until we understand the exact issue better.
@void has also authored https://reviews.llvm.org/D64982.
For the life of me I swear there's a bug that @craig.topper commented on that explains an issue with symbols not being added to the symbol table at the MC layer, but I cannot find it (and it's killing me). I think that should be pursued (why were we generating a .tmp symbol anyways).
llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp | ||
---|---|---|
434 | So another issue I though of with this approach: It makes sense to me as implemented; we need some mapping from BlockAddress of the BasicBlock to the corresponding MachineBasicBlock (we have the reverse mapping; given a MachineBasicBlock we can get the BasicBlock it comes from, but we don't have the reverse mapping; GetBlockAddressSymbol kind of does this but has other issues). This approach works for the test case provided, and probably all C/C++ that could ever hit this path BUT it is not safe to assume that the MachineBasicBlock corresponding to the BlockAddress is within this MachineFunction. Stated another way, I could have blockaddress(@foo, %bar) used in MachineFunction @baz. The code in this patch would iterate the MachineBasicBlocks of the MachineFunction corresponding to @baz since the MachineInstruction using the blockaddress constant was in @baz. It would never find %bar since it needs to iterate MachineBasicBlocks of @foo, not @baz. I think this would be fixed by initializing MF to BB->getParent(). |
Reverted in https://reviews.llvm.org/rL366600. While GetBlockAddressSymbol has some complicated machinery related to managing symbols of temporary addresses, I get the feeling a complete fix will lie in modifications there.
Sounds like this was reverted on trunk. Please let me know if there are follow-ups that might be candidates for merging.
I have a patch in my local branch that should fix some of the issues. However, I can't seem to support the situation Nick pointed out. If you have something like:
define void @foo() { entry: callbr void asm sideeffect ".word ${0:l}", "X"(i8* blockaddress(@bar, %next)) to label %if.end10 [label %l_yes] if.end10: br label %l_yes l_yes: ret void } define i32 @bar() { entry: br label %next next: ret i32 42 }
This will fail, because @bar hasn't been created yet, so it can't get the address of next. Or, rather, it *can*, but the assembly parser won't realize that it's a label and will create a new label for it. So it goes from this:
foo: .word .Ltmp0 ... bar: .Ltmp0:
to this:
foo: .word .Ltmp00 ; <- Notice the '0' suffix, which is applied to a label to uniquify it. ... bar: .Ltmp0:
when parsing the inline assembly. I *think* we need to convince MCContext that .Ltmp0 is a label that it doesn't need to uniquify—i.e., register it as a symbol in MCContext instead of treating it as a temporary label (there's a difference, and I'm not sure why).
The question is whether this is a case we need to care about at this point in time.
Thoughts?
So another issue I though of with this approach:
It makes sense to me as implemented; we need some mapping from BlockAddress of the BasicBlock to the corresponding MachineBasicBlock (we have the reverse mapping; given a MachineBasicBlock we can get the BasicBlock it comes from, but we don't have the reverse mapping; GetBlockAddressSymbol kind of does this but has other issues). This approach works for the test case provided, and probably all C/C++ that could ever hit this path BUT it is not safe to assume that the MachineBasicBlock corresponding to the BlockAddress is within this MachineFunction.
Stated another way, I could have blockaddress(@foo, %bar) used in MachineFunction @baz. The code in this patch would iterate the MachineBasicBlocks of the MachineFunction corresponding to @baz since the MachineInstruction using the blockaddress constant was in @baz. It would never find %bar since it needs to iterate MachineBasicBlocks of @foo, not @baz.
I think this would be fixed by initializing MF to BB->getParent().