This is an archive of the discontinued LLVM Phabricator instance.

[IR] Don't use blockaddresses as callbr arguments
ClosedPublic

Authored by nikic on Jul 7 2022, 6:47 AM.

Details

Summary

Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with ! label constraints that refer directly to callbr indirect destinations:

; Before:
%res = callbr i8* asm "# $0\0A\09# ${2:l}", "=r,0,i,~{dirflag},~{fpsr},~{flags}"(i8* %0, i8* blockaddress(@test8, %foo))
to label %asm.fallthrough [label %foo]
; After:
%res = callbr i8* asm "# $0\0A\09# ${2:l}", "=r,0,!i,~{dirflag},~{fpsr},~{flags}"(i8* %0)
to label %asm.fallthrough [label %foo]

The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations:

This is just the IR representation change though, I will follow up with patches to remove limtations in various transformation passes that are no longer needed.

Diff Detail

Event Timeline

nikic created this revision.Jul 7 2022, 6:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2022, 6:47 AM
nikic requested review of this revision.Jul 7 2022, 6:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2022, 6:47 AM

Can't we remove blocks from the callbr target list in some circumstances? E.g. if we merge two blocks maybe?

nikic added a comment.Jul 7 2022, 7:23 AM

Can't we remove blocks from the callbr target list in some circumstances? E.g. if we merge two blocks maybe?

I believe we currently don't allow doing that, see e.g. the check in https://github.com/llvm/llvm-project/blob/26f369393d4e60c78d872c4cafcbf3fd929b9004/llvm/lib/Transforms/Utils/Local.cpp#L1092. This is because we currently don't allow callbr to have the same successor multiple times (though I think we should allow that, to allow transforms like these.)

Why the poison placeholders? This makes it easy to satisfy the existing inlineasm infrastructure. It would be cleaner to omit the placeholder arguments and just say that trailing inlinasm constraints refer to the indirect target labels. InlineAsm itself doesn't have knowledge about those though, so it would not be possible to verify InlineAsm constraints in isolation.

I'm really uncomfortable with this hack. If we're going to introduce a new kind of input which doesn't come from the arglist, I think it'd be better to introduce a new constraint modifier symbol which can be used with input constraints to indicate that...then, the parsed InlineAsm will have the expected count of those separately, and the the verifier can verify the count matches at the appropriate time (during call/callbr verification), and we can propagate this into SDag, etc...

E.g., we could choose the symbol '!' for that, so you might have e.g.:
callbr void asm sideeffect "", "!i,r,!i,~{dirflag},~{fpsr},~{flags}"(i32 4) to label %1 [label %2, label %3]

Can't we remove blocks from the callbr target list in some circumstances? E.g. if we merge two blocks maybe?

I believe we currently don't allow doing that, see e.g. the check in https://github.com/llvm/llvm-project/blob/26f369393d4e60c78d872c4cafcbf3fd929b9004/llvm/lib/Transforms/Utils/Local.cpp#L1092. This is because we currently don't allow callbr to have the same successor multiple times (though I think we should allow that, to allow transforms like these.)

Huh. I would've expected us to previously have coalesced duplicates, not prohibit rewrites that would create duplicates. Anyways, NOW we should definitely allow duplicates in the list -- it's now actually an ordered list, not a set as it theoretically was before. If you've gone through the code to ensure that nothing ever modifies the order of entries, or adds/removes entries from the list, we'll probably be okay after that change.

Can't we remove blocks from the callbr target list in some circumstances? E.g. if we merge two blocks maybe?

I believe we currently don't allow doing that, see e.g. the check in https://github.com/llvm/llvm-project/blob/26f369393d4e60c78d872c4cafcbf3fd929b9004/llvm/lib/Transforms/Utils/Local.cpp#L1092. This is because we currently don't allow callbr to have the same successor multiple times (though I think we should allow that, to allow transforms like these.)

I agree that we should allow duplicate BBs in the IR operand list for callbr. While the higher level language prevents such cases (asm goto will error if the same label appears more than once), it leads to dumb compilation failures when optimizing, such as: https://github.com/llvm/llvm-project/issues/45248.

The main issue is that if the same operand appears in the operand list twice, we need to be careful not to remove instances; the inline asm string still needs to correspond to the same number of operands (but this isn't being proposed; I'm attacking a strawman).

nickdesaulniers added a comment.EditedJul 7 2022, 11:08 AM

Can't we remove blocks from the callbr target list in some circumstances? E.g. if we merge two blocks maybe?

I believe we currently don't allow doing that, see e.g. the check in https://github.com/llvm/llvm-project/blob/26f369393d4e60c78d872c4cafcbf3fd929b9004/llvm/lib/Transforms/Utils/Local.cpp#L1092. This is because we currently don't allow callbr to have the same successor multiple times (though I think we should allow that, to allow transforms like these.)

The main issue is that if the same operand appears in the operand list twice, we need to be careful not to remove instances; the inline asm string still needs to correspond to the same number of operands (but this isn't being proposed; I'm attacking a strawman).

Oh, I think that _is_ what @jyknight is proposing. We'd need to rewrite the asm string in that case. For example:

callbr void asm sideeffect "${0:l} ${1:l}", "!i,!i"()
 to label %1 [label %4, label %4]

would have to become:

callbr void asm sideeffect "${0:l} ${0:l}", "!i,!i"()
 to label %1 [label %4]

What happens if the inline asm uses identifier names rather than numeric references? (Maybe clang boils these away when lowering; haven't checked, but that question comes immediately to mind).

nikic added a comment.Jul 7 2022, 12:14 PM

@nickdesaulniers Why would we need to rewrite the asm string? I'd expect things to stay in the form

callbr void asm sideeffect "${0:l} ${1:l}", "!i,!i"()
to label %1 [label %4, label %4]

where the two inline asm arguments just happen to refer to the same label.

nickdesaulniers added a comment.EditedJul 7 2022, 1:31 PM

@nickdesaulniers Why would we need to rewrite the asm string? I'd expect things to stay in the form

callbr void asm sideeffect "${0:l} ${1:l}", "!i,!i"()
to label %1 [label %4, label %4]

where the two inline asm arguments just happen to refer to the same label.

The case you describe looks fine to me, but it also seems different than what @jyknight was describing when he said:

Can't we remove blocks from the callbr target list in some circumstances? E.g. if we merge two blocks maybe?

You'll notice in my (poorly formatted, now fixed) example, I noted that if an operand was removed, say because the two operands were equivalent, the asm string would also need to be updated. I think for immediates like labels or integer constant expressions, it doesn't make a difference whether the asm operand list has operands merged or not; I don't think that will provide any optimization for the register allocator for instance.

nickdesaulniers added a comment.EditedJul 7 2022, 1:51 PM

Why the poison placeholders? This makes it easy to satisfy the existing inlineasm infrastructure. It would be cleaner to omit the placeholder arguments and just say that trailing inlinasm constraints refer to the indirect target labels. InlineAsm itself doesn't have knowledge about those though, so it would not be possible to verify InlineAsm constraints in isolation.

I'm really uncomfortable with this hack. If we're going to introduce a new kind of input which doesn't come from the arglist, I think it'd be better to introduce a new constraint modifier symbol which can be used with input constraints to indicate that...then, the parsed InlineAsm will have the expected count of those separately, and the the verifier can verify the count matches at the appropriate time (during call/callbr verification), and we can propagate this into SDag, etc...

E.g., we could choose the symbol '!' for that, so you might have e.g.:
callbr void asm sideeffect "", "!i,r,!i,~{dirflag},~{fpsr},~{flags}"(i32 4) to label %1 [label %2, label %3]

I like this suggestion because to me it is homogenous with outputs. Outputs also do not appear in the operand list, yet are represented in the constraint string. As an example:

; Before:
%res = callbr i8* asm "# $0\0A\09# ${2:l}", "=r,0,i,~{dirflag},~{fpsr},~{flags}"(i8* %0, i8* blockaddress(@test8, %foo))
to label %asm.fallthrough [label %foo]
; After:
%res = callbr i8* asm "# $0\0A\09# ${2:l}", "=r,0,!i,~{dirflag},~{fpsr},~{flags}"(i8* %0)
to label %asm.fallthrough [label %foo]

(where "=r" applies to %res and isn't listed in the asm string, 0 refers to %0 and also isn't listed in the asm string, then !i refers to label %foo and ${2:l} in the asm string, just to be overly specific).

Does the langref mention any llvm-specific constraint string symbols? Regardless, such a change should be documented very explicitly since it doesn't collide with any constraint string AFAIK today, but theoretically could in the future should the higher level languages implementing extended inline asm choose such a character as the constraint representation.

nikic updated this revision to Diff 443189.Jul 8 2022, 3:27 AM
nikic edited the summary of this revision. (Show Details)

Introduce label constraints, remove limitation on duplicate destinations. I'm quite happy with this version.

nikic updated this revision to Diff 443191.Jul 8 2022, 3:33 AM

Update MIR test

While it makes me a bit sad to "lose" theoretical possibilities of the original design (e.g. I always thought it might be nice if someday we could use callbr to represent the potential control-transfer from a function call in a setjmp function back to the setjmp), I do think this is absolutely better for the ACTUAL current use-cases, and we'd've been better-off doing this from the get-go. Oops...my bad...

Does the langref mention any llvm-specific constraint string symbols? Regardless, such a change should be documented very explicitly since it doesn't collide with any constraint string AFAIK today, but theoretically could in the future should the higher level languages implementing extended inline asm choose such a character as the constraint representation.

This isn't an issue. In fact, the syntax is entirely llvm-specific, and is separate from the frontend syntax parsed by Clang. It bears some resemblance to the GCC constraint syntax, but it is not the same.

Other comments:

  • Clang won't emit them, but I'd like to see a test-case that uses different constraints for label args than just !i. E.g. !r, or !0. (Given the code structure, I expect it should just work, already.)
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
5511

I'd put if (isa<InlineAsm>(Callee)) around this whole upgrade section. While we only currently support inlineasm calee, the reader code is otherwise agnostic to the type of the callee, and it seems better to preserve that.

Also, this upgrade code seems potentially problematic -- a user could write the C code:
int foo() { label: asm goto("" : : "r"(&&label) : : label2); return 1; label2: return 2; }
which turns into

callbr void asm sideeffect "", "r,i,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %2), i8* blockaddress(@foo, %4)) #1
        to label %3 [label %4]

and this would error out on it.

(I'm not sure _why_ a user would write that, but we shouldn't fail in the bitcode reader if they do!)

I think it needs to:

  1. Only do anything if the constraint string doesn't contain label constraints (so we don't run this code on new bitcode).
  2. Only validate and remove the last IndirectDests.size() arguments -- not other blockaddress values that may appear earlier in the arglist.

(And of course, a test-case. Are there any test-cases for this code? I didn't see one.)

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8680

Why is this removed?

8690

I'm not sure we actually need to create a BlockAddress in SDAG either. But, fine to leave it for now, to keep the patch size down.

llvm/lib/IR/InlineAsm.cpp
296

Should still check:
if (NumClobbers) return false; // inputs before clobbers.

llvm/lib/IR/Verifier.cpp
2232–2238

Need to verify that there's zero indirect dests when there's no label constraints. Maybe like this?

llvm/test/CodeGen/X86/callbr-asm-errors.ll
0

This is no longer an "error" test, as per filename. Maybe merge into the "callbr-asm-destinations.ll" file?

Also, if we eliminate blockaddress operands from callbr, I wonder if we can revert the bitcode reader+writer changes added recently in:

Perhaps while retaining their tests.

I think this approach will have a nice side effect wrt. inlining:

NVM, that case has blockaddress, but not callbr. So we should strike

Allow inlining of callbr (https://github.com/llvm/llvm-project/issues/38908 / https://github.com/ClangBuiltLinux/linux/issues/263)

from the commit description.

Also, I left a few comments like: `s/!X/!i/. In 79ebc3b0dd130d759bf637c71c2a6aa039f0bd8f I meant to clean all of these up. Maybe a few slipped through the cracks, but while you're touching these lines anyways I would appreciate it if we could eradicate uses of X for the labels in our test cases.

Finally, I'm very happy with this cleanup. It seems like a great simplification that fixes many things that made working with callbr exceptional; now callbr seems like it fits in much better with the IR as we have much less special casing. We will need some time to test our kernel builds with a change of this nature.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8690

I assume we still need blockaddress in the MIR representation? This change is just creating them much later in the compilation pipeline. Or did I misunderstand your point?

llvm/test/Analysis/BasicAA/pr52735.ll
14

s/!X/!i/

21

s/!X/!i/

llvm/test/CodeGen/X86/callbr-asm-outputs.ll
221

s/!X/!i/

llvm/test/CodeGen/X86/speculation-hardening-sls.ll
67

s/!X/!i/

llvm/test/Transforms/CodeExtractor/PartialInlinePGOMultiRegion.ll
104

s/!X/!i/

llvm/test/Transforms/IROutliner/illegal-callbr.ll
19

Nice, so callbr will no longer inhibit outlining?

llvm/test/Transforms/LoopStrengthReduce/callbr-critical-edge-splitting2.ll
18–19

I wonder if this comment is stale from some of your recent changes @nikic ?

llvm/unittests/IR/InstructionsTest.cpp
1501

I think this test doesn't care about having any inputs at all. Consider dropping the arguments, rather than using i8* poison.

nikic added a comment.Jul 8 2022, 1:08 PM

Also, if we eliminate blockaddress operands from callbr, I wonder if we can revert the bitcode reader+writer changes added recently in:

Perhaps while retaining their tests.

I don't think so. These cases are related to &&labels (without callbr).

As a side note, this kind of usage of block labels is not supported by LLVM blockaddress semantics. &&labels are only supported in conjunction with computed goto (indirectbr). If the function with the label doesn't contain any computed goto, then LLVM is free to DCE all these blockaddresses. I believe the only reason why this currently doesn't happen is that we have a bunch of non-principled hasAddressTaken() checks strewn over the codebase that really shouldn't be there. If we drop those, _THIS_IP_ will probably become equal to 1 in most cases.

I think the kernel would be much better served by a new intrinsic llvm.instructionpointer or so (and a corresponding clang builtin). As far as I understand, _THIS_IP_ is only used as debugging information. Importantly, this address will never be used as an actual jump target. This usage really just wants to get the current instruction pointer at the point of the call, and doesn't need any of the blockaddress baggage that both inhibits a lot of optimization and only works by accident in the first place.

But well, this is kind of getting off topic...

nikic updated this revision to Diff 443646.Jul 11 2022, 8:06 AM
nikic marked 12 inline comments as done.
nikic edited the summary of this revision. (Show Details)

Address review comments.

nikic added inline comments.Jul 11 2022, 8:29 AM
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
5511

I've changed the code to only upgrade the trailing arguments. There's test coverage for this in llvm/test/Bitcode/callbr.ll.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8680

This was a leftover from a previous version.

It's rather suspicious that this doesn't break anything -- apparently, most of the code here is redundant. I've prepared a patch to clean it up (https://gist.github.com/nikic/219eec68f61b02fa4b22024307a50f84) and rely on TargetLowering::ParseConstraints() instead. Once I land that the label handling code added here will also move into that function.

llvm/test/Transforms/IROutliner/illegal-callbr.ll
19

Yes, in the sense that the indirect dest blocks are no longer address taken, so can be outlined. The callbr itself can't be outlined yet, because the outliner generally doesn't support outlining terminators.

llvm/test/Transforms/LoopStrengthReduce/callbr-critical-edge-splitting2.ll
18–19

From what I can tell, we don't perform any critical edge splitting here (we do perform it in the function below though).

llvm/unittests/IR/InstructionsTest.cpp
1501

This was a leftover from a previous patch.

We should still strike

Allow inlining of callbr

From the commit description.

This was already done in
commit 639b29b1b59f ("[INLINER] allow inlining of blockaddresses if sole uses are callbrs")
(I wonder if it's now safe to revert 639b29b1b59f, too? Keeping the test case though.)


Patch LGTM; but please don't land until we've had more time to test this. It's the kind of change that has the potential to regress our kernel builds. I'm tied up today, tomorrow, and most of Wednesday, but @nathanchance and I will hammer on this and get you a green light by EOW. Thanks for the work here!

llvm/test/Verifier/callbr.ll
67–76

I think we want to retain this test? The important part is that to label %both [label %both] not be modified.

llvm/test/Verifier/inline-asm-label-before-outputs.ll
3 ↗(On Diff #443646)

Does the assembler stop emitting diagnostics after the first? If not, consider adding this to llvm/test/Verifier/inline-asm-label-after-clobbers.ll and maybe renaming the test file.

llvm/test/tools/llvm-diff/callbr.ll
24

I was never really sure what the intent of this test was...

nikic edited the summary of this revision. (Show Details)Jul 11 2022, 1:09 PM
nikic marked an inline comment as done.Jul 11 2022, 1:18 PM

We should still strike

Allow inlining of callbr

From the commit description.

This was already done in
commit 639b29b1b59f ("[INLINER] allow inlining of blockaddresses if sole uses are callbrs")
(I wonder if it's now safe to revert 639b29b1b59f, too? Keeping the test case though.)

Ah yes, I got confused by this check here: https://github.com/llvm/llvm-project/blob/f0cd5389850589fbf8a3ce77cea2539579bd8028/llvm/lib/Transforms/Utils/InlineFunction.cpp#L1763 But that one guards against inlining a callbr call itself, rather than a call to a function that contains callbr.

llvm/test/Verifier/callbr.ll
67–76

Do you mean as a CHECK-NOT test, to check that duplicate successors are allowed now?

llvm/test/Verifier/inline-asm-label-before-outputs.ll
3 ↗(On Diff #443646)

Yeah, these errors are thrown as part of IR parsing rather than the verifier, and that one stops on first error. (As a side note, we should probably make InlineAsm::Verify return an Error so we can produce a more specific error message for different cases. The current one leaves something to be desired...)

llvm/test/Verifier/callbr.ll
67–76

Ah, nevermind. I see we have test coverage of that in @duplicate_normal_and_indirect_dest in duplicate_normal_and_indirect_dest.

nikic updated this revision to Diff 443898.Jul 12 2022, 3:11 AM
nikic marked an inline comment as done.

Rebase over code cleanup and diagnostic improvements.

I tested diff 443646 and saw no regressions with my Linux kernel build and QEMU boot tests. I can test the latest revision if that would be useful. I can test bare metal later today.

nickdesaulniers accepted this revision.Jul 14 2022, 4:52 PM

Thanks for this cleanup, it's really nice to see.

This revision is now accepted and ready to land.Jul 14 2022, 4:52 PM
This revision was landed with ongoing or failed builds.Jul 15 2022, 1:18 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: cfe-commits. · View Herald Transcript