Otherwise we may fail to predicate the INLINEASM_BR later in
the "If Converter" pass, particularly for targets that rely heavily
on it.
pr/42012
Link: https://github.com/ClangBuiltLinux/linux/issues/494
Differential D62555
[TailDuplicator] prevent tail duplication for INLINEASM_BR nickdesaulniers on May 28 2019, 3:59 PM. Authored by
Details
Otherwise we may fail to predicate the INLINEASM_BR later in pr/42012
Diff Detail
Event TimelineComment Actions I can see how this change would prevent the TailDuplication pass from duplicating blocks with callbr instructions, however I'm not sure if that is sufficient, or if it is too restrictive.
If there are many other things that are broken by duplicating callbr then this patch is definitely the right thing to do. If it is just if conversion then it may be best to fix it there? Apologies I'm not an expert in this area, so I'm happy to defer. May be best to try some more reviewers to take a look.
Comment Actions
Excellent point. Thinking more about this last night, and your comment; I think we may have 2 bugs here, and the current patch works but for the wrong reasons. I should have read through IfConverter instead of just focusing on TailDuplicator. Towards your point quoted above (and without having looked at IfConverter), I'd naively say that:
Either way, the test case can be reused for both issues (with different and improved CHECKs. That's the first issue, observed w/ Debug builds. The second issue which was first observed with Release builds is that TailDuplicator was leaving behind references to removed branches for INLINEASM_BR. We can either skip this transformation for MachineBasicBlocks when encountering an INLINEASM_BR (as the patch currently does) with the backwards iteration improvement (you suggested), or I can see if we can update the reference correctly when we duplicate an INLINEASM_BR. I think what might make the latter difficult is that INLINEASM_BR's operands are of type BlockAddress, not MachineBlockAddress (IIRC, might be wrong). Thanks for the code review; it definitely provided food for thought and will result in improved fixes!
Comment Actions
So I have a patch in hand that basically does this. When I then try to build an allyesconfig arm kernel, I get the same "unable to predicate instruction" failure in a different translation unit for a BL instruction. It seems that BL is not considered isUncondBranchOpcode() in llvm/lib/Target/ARM/ARMBaseInstrInfo.h. Looking at its def in llvm/lib/Target/ARM/ARMInstrInfo.td, BL seems defined differently than Bcc (for instance) causing llvm::getMatchingCondBranchOpcode() to choke. I'm not sure if this is worth pursuing at this point.
This is actually IfConverter (which only does transforms based on patterns from TailDuplicator, which is why the current patch as written works; block TailDuplicator then IfConverter can't mess up). I'm going to try now to see if I can find where this is going wrong; I think if we don't predicate MachineInstr`s when isInlineAsm, then don't remove address taken MachineBasicBlocks, that might be the ultimate solution. Comment Actions
We shouldn't try to predicate ARM::BL, probably, given it isn't predicable. (In theory, it should be possible to predicate a call in ARM mode, I think, but it isn't implemented, and it's not clear it's actually worth implementing; it's a little tricky to handle the relocations correctly.) I'm pretty sure the ARM backend correctly reports this to if conversion, so probably a bug in if conversion. Maybe the same issue as https://bugs.llvm.org/show_bug.cgi?id=41121 . Comment Actions
Interesting, and does look related. While I try to make If Converter more robust in the face of INLINEASM_BR, I still think curtailing Tail Duplicator from duplicating INLINEASM_BR (and INLINEASM) is probably also worthwhile, and kind of makes robustness changes to If Converter less important. I'm going to spend the rest of the day trying to improve If Converter, but ultimately this is holding up asm goto from working perfectly (which is more important to me), so if I don't have anything working by EOD, then tomorrow I will make the suggested edits here and focus on this patch.
In particular, I'm not super happy with it, which is why I haven't posted it yet. I think there's yet another/better way to solve the problem in If Converter, but requires more work.
Comment Actions
I don't think this makes sense. We can, and do, clone inline asm at the IR level; it should also be legal at the MIR level. (It's possible the transform isn't profitable if we're optimizing for size, but that's a completely different issue.) Comment Actions
Right, that was more of my point. Anyways, with this patch and D56571, I can successfully build arm32 Linux kernel allyesconfig, and build+boot arm32 Linux kernel defconfig + CONFIG_JUMP_LABEL + CONFIG_STATIC_KEYS_SELFTEST.
Comment Actions I don't have a strong opinion here. I'm happy to disable the duplication for INLINEASM_BR at this point if we can capture the information that we could remove the restriction with a more sophisticated if conversion.
Comment Actions The following testcase still crashes with this patch. Given that, I don't think it makes sense to merge this patch; even if it lets some current version kernel build, someone else is likely to hit the same issue in the near future. target triple = "arm-linux-gnueabi" @a = external dso_local local_unnamed_addr global i32*, align 4 @c = external dso_local local_unnamed_addr global i32, align 4 define dso_local i32 @f() #0 { entry: %0 = load i32, i32* @c, align 4 %tobool = icmp eq i32 %0, 0 %1 = load i32*, i32** @a, align 4 br i1 %tobool, label %if.else, label %if.then if.then: ; preds = %entry tail call void asm sideeffect "str $1, $0", "*Qo,r"(i32* %1, i32 1) #1 callbr void asm sideeffect "", "X"(i8* blockaddress(@f, %g)) #1 to label %asm.fallthrough [label %g] if.else: ; preds = %entry tail call void asm sideeffect "str $1, $0", "*Qo,r"(i32* %1, i32 0) #1 callbr void asm sideeffect "", "X"(i8* blockaddress(@f, %g)) #1 to label %asm.fallthrough [label %g] asm.fallthrough: ; preds = %if.end unreachable g: ; preds = %if.end ret i32 undef } attributes #0 = { "use-soft-float"="false" } attributes #1 = { nounwind } |
You're missing a word in this sentence. "... leave it in its own ..."