This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDagBuilder] improve CallBrInst BlockAddress constraint handling
AbandonedPublic

Authored by nickdesaulniers on Dec 1 2021, 11:19 AM.

Details

Summary

D87279 exposed an existing reliance on tied operands for CallBr to
process its constraints for indirect detination input parameters. A case
of using asm goto with a single output with the +r constraint on a
register variable caused an ICE.

The code originally was very careful to only consider arguments to the
asm which were the indirect targets of the callbr. Because we didn't
initially have support for outputs from callbr (ie. asm goto didn't
originally support output constraints), we didn't need to worry about
non-destination inputs occurring after indirect destination operands, as
can occur when outputs are tied (ie. the +r output constraint).

This became brittle when adding support for "asm goto with outputs"
(D69868) and finally broke when D87279 changed tied operands to no
longer use backreference-like tied inputs for register variables.

The LangRef section titled "Inline Asm Constraint String" currently
states:

The constraints must always be given in that order: outputs first,
then inputs, then clobbers. They cannot be intermingled.

There is an ambiguity where the indirect destinations of a callbr
are inputs, but they have no guaranteed order with respect to other
inputs. In fact, clang can easily intermingle these with other
non-indirect destination inputs as seen when using tied operands.

Codify the existing order of input arguments from clang into the LangRef
to be able to disambiguate this case, and don't just bias the range
check based on matching backreference-like operands. By following this
convention, we don't need to keep track of which arguments to the inline
assembler expression are intended to be the indirect destinations; we
can remateralize that information in SelectionDagBuilder when needed.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1512

Diff Detail

Event Timeline

nickdesaulniers requested review of this revision.Dec 1 2021, 11:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 1 2021, 11:19 AM
void added inline comments.Dec 1 2021, 11:26 AM
llvm/test/CodeGen/X86/callbr-asm-outputs.ll
208

nit: I have a preference for using "test#" for test function names. Using the originating function's name isn't very useful.

  • rename test, add comment about test's intent
nickdesaulniers marked an inline comment as done.Dec 1 2021, 11:34 AM

But that doesn't really matter; CallBrInst can have a BlockAddress parameter that's not necessarily in the formal indirect destination list; so just process all BlockAddress parameters to a CallBrInst the same way.

I'm a little skeptical of this. The normal lowering for a random blockaddress constant would produce an ISD::BlockAddress node; this produces an ISD::TargetBlockAddress. I don't think we treat those the same way, in general; I suspect this patch breaks passing a blockaddress as a register input.

I'd be much happier if we actually did the computation properly to figure out whether an operand is actually an input, an output, or a destination. We have to do that computation at some point anyway, right?

nickdesaulniers edited the summary of this revision. (Show Details)
  • rework the approach completely to check the argno against the list of indirect targets precisely
nickdesaulniers added a comment.EditedDec 2 2021, 11:55 AM

Hmm...this still isn't as precise as it could be. Consider the following input:

int x;
void foo (void) {
    asm goto (
    "# %0\n\t"
    "# %1\n\t"
    "# %l2\n\t"
    : "=r"(x):"rm"(&&bar)::bar);
    bar:;
}

and corresponding IR:

%1 = callbr i32 asm "# $0\0A\09# $1\0A\09# ${2:l}\0A\09", "=r,rm,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %3), i8* blockaddress(@foo, %3))
          to label %2 [label %3]

(so, the indirect destination is both in the inputs and in the indirect destinations. Given an argument number, we still can't tell whether the first or second blockaddress was the intended target.

Changing the above =r output constrain to +r produces:

%2 = callbr i32 asm "# $0\0A\09# $1\0A\09# ${2:l}\0A\09", "=r,rm,X,0,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %4), i8* blockaddress(@foo, %4), i32 %1)
          to label %3 [label %4]

ie. we have inputs that occur both before and after the indirect dest (in this case, "X" is meant for the indirect dest).
(before: "rm" corresponds to the first non-target blockaddress. after: 0 is a "backreference" to the 0-th (first) constraint ("=r"; the output)

then changing int x; at the global scope to register int x asm("edi"); at function scope produces:

%1 = callbr i32 asm "# $0\0A\09# $1\0A\09# ${2:l}\0A\09", "={edi},rm,X,{edi},~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %3), i8* blockaddress(@foo, %3), i32 undef)
          to label %2 [label %3]

ie. we again have inputs that occur both before and after the indirect dest.
(before: "rm" corresponds to the first non-target blockaddress. after: {edi} for the input register local).

I don't see how we can try to re-construct or disambiguate which blockaddress argument is the intended target at this point.

Perhaps if we refined the Langref from:

The constraints must always be given in that order: outputs first,
then inputs, then clobbers. They cannot be intermingled.

to:

The constraints must always be given in that order: outputs first,
then inputs, then clobbers, then goto labels. They cannot be intermingled.

example 3 would change to:

%1 = callbr i32 asm "# $0\0A\09# $1\0A\09# ${2:l}\0A\09", "={edi},rm,{edi},~{dirflag},~{fpsr},~{flags},X"(i8* blockaddress(@foo, %3), i32 undef,  i8* blockaddress(@foo, %3))
          to label %2 [label %3]

This would more closely match the existing order of parameters in inline asm statements. Or perhaps even just requiring that goto labels always were the final inputs. ie. example 3 would change to:

%1 = callbr i32 asm "# $0\0A\09# $1\0A\09# ${2:l}\0A\09", "={edi},rm,{edi},X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %3), i32 undef, i8* blockaddress(@foo, %3))
          to label %2 [label %3]

Because if we didn't intersperse inputs with "goto labels", and we know we have 3 inputs (in the final case described above) and we know we have 1 goto label, then it's obvious that the argument numbers that are indirect dests are in the range [arg_size() - CBR->numIndirectDests(), arg_size()).

I would make such a change as a parent patch, then rebase this on top. Or is there something more perhaps more obvious that I'm missing that won't require a Langref change (and test churn)? Thoughts?

nickdesaulniers planned changes to this revision.Dec 3 2021, 11:41 AM

Shower-epiphany: we just need to walk the arg_list in reverse until we find the first blockaddress input; the number of non-blockaddress inputs is the amount we must bias the index calculation to find the right block address.

Unless you can tie a block address. Hopefully "+r"(&&goto_label) isn't a valid output constraint...

void x(void) {
    asm goto ("":"+r"(&&foo):::foo);
    //                ^ error: lvalue required in 'asm' statement
    foo:;
}

oh thank god! ok then I can probably make that work. Nonsensical rambling/notes below:


Or perhaps even just requiring that goto labels always were the final inputs. ie. example 3 would change to:

%1 = callbr i32 asm "# $0\0A\09# $1\0A\09# ${2:l}\0A\09", "={edi},rm,{edi},X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %3), i32 undef, i8* blockaddress(@foo, %3))
          to label %2 [label %3]

While it is trivial to reorder the processing labels vs InOutArgs in CodeGenFunction::EmitAsmStmt, this isn't quite right; the positional format strings in the asm string would need to change (from ${2:l} to ${3:l}), something like:

%1 = callbr i32 asm "# $0\0A\09# $1\0A\09# ${3:l}\0A\09", "={edi},rm,{edi},X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %3), i32 undef, i8* blockaddress(@foo, %3))
           to label %2 [label %3]

I'm not sure how I feel about renumbering the asm string from the user like that. But I get the feeling that the "hidden" in-out args in the parameter list could be referred to in the asm string 0:-).

Finally, a more interesting ambiguity that perhaps should be addressed in the langref, specifically for the code in question here, consider:

%1 = callbr i32 asm "# $0\0A\09# ${1:l}\0A\09", "={edi},X,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %3), i8* blockaddress(@foo, %3))
          to label %2 [label %3]

So basically we have 2 block addresses as input. Which blockaddress in the parameter list is the indirect target? It does matter, specifically because of the asm string, the order matters for what the inline asm is trying to do. ie. I think we should be able to flip the order in IR and have that work (even if that's not the order clang will generate today), like:

%1 = callbr i32 asm "# ${0:l}\0A\09# $1\0A\09", "={edi},X,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %3), i8* blockaddress(@foo, %3))
          to label %2 [label %3]

The LLVM code in question here in SelectionDAGBuilder::visitInlineAsm (even with my proposed changes of walking the arg list backwards) is reliant on the unspecified assumption that the inputs are ordered in form "vanilla inputs, labels, hidden tied-inputs" because that's the order that Clang currently emits them in. As demonstrated with this issue, that is somewhat brittle, but worse is both tightly coupled to the front-end/clang AND unspecified in the langref. Theoretically, another frontend other than clang could try to make use of callbr and violate that assumption (see example IR immediately above). I hope that never happens, but it's not impossible. I still think my idea for computing the bias by walking the arg list backwards is the best approach today, but the reviewers should consider whether this ambiguity I'm highlighting actually exists, and if so, should we provide stronger language in the langref about this particularly in this patch?

nickdesaulniers edited the summary of this revision. (Show Details)
  • rework approach again to walk arg list backwards, update LangRef
void added a comment.Dec 7 2021, 12:01 PM

Shower-epiphany: we just need to walk the arg_list in reverse until we find the first blockaddress input; the number of non-blockaddress inputs is the amount we must bias the index calculation to find the right block address.

Unless you can tie a block address. Hopefully "+r"(&&goto_label) isn't a valid output constraint...

void x(void) {
    asm goto ("":"+r"(&&foo):::foo);
    //                ^ error: lvalue required in 'asm' statement
    foo:;
}

oh thank god! ok then I can probably make that work. Nonsensical rambling/notes below:

I would have had to give up programming and become a basket weaver if block addresses were l-values. :-P

void accepted this revision.Dec 7 2021, 12:09 PM

LGTM with the addition of another test case.

llvm/test/CodeGen/X86/callbr-asm-outputs.ll
207

It might be good to have a test from one of your examples where a block addresses are used as inputs but not indirect destinations. From above:

int x;
void foo (void) {
    asm goto (
    "# %0\n\t"
    "# %1\n\t"
    "# %l2\n\t"
    : "=r"(x):"rm"(&&bar)::bar);
    bar:;
}
This revision is now accepted and ready to land.Dec 7 2021, 12:09 PM
  • add test (the ambiguity described isn't dependent on outputs, so put in different unit test file)
nickdesaulniers marked an inline comment as done.Dec 7 2021, 1:53 PM
  • fix typos in new test's comment. In vim, :set spell, then z= for fixits.

Unless you can tie a block address. Hopefully "+r"(&&goto_label) isn't a valid output constraint...

void x(void) {
    asm goto ("":"+r"(&&foo):::foo);
    //                ^ error: lvalue required in 'asm' statement
    foo:;
}

I think you meant to write something like this:

void *x(void) {
    void *p = &&foo;
    asm goto ("":"+r"(p):::foo);
    foo:;
    return p;
}

Sorry, I think my original comment led you in the wrong direction. Thinking about this a bit more, I think the solution is actually something like the following: treat any blockaddress passed to an "X" constraint the same way, regardless of whether there's a callbr involved, and regardless of the position in the input list. I think that comes closest to matching the original design here.

The "with an 'X' constraint" part avoids messing with users passing a blockaddress as a register input or something like that.

Actually, not sure why clang is emitting "X" for asm goto destinations, instead of "i". "i" is explicitly defined to be an immediate, and it's not clear what "X" is supposed to mean. So maybe better to do something like the following:

  1. Make clang emit "i" instead of "X" for asm goto destinations. LLVM already handles blockaddress passed to "i" correctly.
  2. Drop the special case for blockaddress in SelectionDAGBuilder::visitInlineAsm.
  3. Optionally, add some extra handling for "X" in the backend, for backwards compatibility with IR generated by old clang.
nickdesaulniers planned changes to this revision.Dec 7 2021, 4:23 PM

Unless you can tie a block address. Hopefully "+r"(&&goto_label) isn't a valid output constraint...

void x(void) {
    asm goto ("":"+r"(&&foo):::foo);
    //                ^ error: lvalue required in 'asm' statement
    foo:;
}

I think you meant to write something like this:

void *x(void) {
    void *p = &&foo;
    asm goto ("":"+r"(p):::foo);
    foo:;
    return p;
}

Damn! Then yeah, this approach (Diff 392533) is still broken.

Sorry, I think my original comment led you in the wrong direction. Thinking about this a bit more, I think the solution is actually something like the following: treat any blockaddress passed to an "X" constraint the same way, regardless of whether there's a callbr involved, and regardless of the position in the input list. I think that comes closest to matching the original design here.

As in always convert it to a TargetBlockAddress during selectiondagisel?

The "with an 'X' constraint" part avoids messing with users passing a blockaddress as a register input or something like that.

Actually, not sure why clang is emitting "X" for asm goto destinations, instead of "i". "i" is explicitly defined to be an immediate, and it's not clear what "X" is supposed to mean. So maybe better to do something like the following:

  1. Make clang emit "i" instead of "X" for asm goto destinations. LLVM already handles blockaddress passed to "i" correctly.
  2. Drop the special case for blockaddress in SelectionDAGBuilder::visitInlineAsm.
  3. Optionally, add some extra handling for "X" in the backend, for backwards compatibility with IR generated by old clang.

Thanks for the guidance, and the additional test case to consider. Let me give your suggestions a shot.

Regarding 3, is that something that we do in LLVM often? ie. try to support IR produced by a different version of clang?

void added a comment.Dec 7 2021, 4:32 PM

Unless you can tie a block address. Hopefully "+r"(&&goto_label) isn't a valid output constraint...

void x(void) {
    asm goto ("":"+r"(&&foo):::foo);
    //                ^ error: lvalue required in 'asm' statement
    foo:;
}

I think you meant to write something like this:

void *x(void) {
    void *p = &&foo;
    asm goto ("":"+r"(p):::foo);
    foo:;
    return p;
}

Sorry, I think my original comment led you in the wrong direction. Thinking about this a bit more, I think the solution is actually something like the following: treat any blockaddress passed to an "X" constraint the same way, regardless of whether there's a callbr involved, and regardless of the position in the input list. I think that comes closest to matching the original design here.

The "with an 'X' constraint" part avoids messing with users passing a blockaddress as a register input or something like that.

Actually, not sure why clang is emitting "X" for asm goto destinations, instead of "i". "i" is explicitly defined to be an immediate, and it's not clear what "X" is supposed to mean. So maybe better to do something like the following:

X seems to be the correct constraint code to use:

* i: An integer constant (of target-specific width). Allows either a simple immediate, or a relocatable value.
 ...
* X: Allows an operand of any kind, no constraint whatsoever. Typically useful to pass a label for an asm branch or call.
  1. Make clang emit "i" instead of "X" for asm goto destinations. LLVM already handles blockaddress passed to "i" correctly.
  2. Drop the special case for blockaddress in SelectionDAGBuilder::visitInlineAsm.
  3. Optionally, add some extra handling for "X" in the backend, for backwards compatibility with IR generated by old clang.

We could possibly remove the special handling here. We just need to pass to SelectionDAGBuilder::getValueImpl something indicating that it's a "target".

void added a comment.Dec 7 2021, 4:34 PM

Unless you can tie a block address. Hopefully "+r"(&&goto_label) isn't a valid output constraint...

void x(void) {
    asm goto ("":"+r"(&&foo):::foo);
    //                ^ error: lvalue required in 'asm' statement
    foo:;
}

I think you meant to write something like this:

void *x(void) {
    void *p = &&foo;
    asm goto ("":"+r"(p):::foo);
    foo:;
    return p;
}

Damn! Then yeah, this approach (Diff 392533) is still broken.

I must be missing something. Isn't that just simply reassigning the variable p and not using the label foo for output?

Unless you can tie a block address. Hopefully "+r"(&&goto_label) isn't a valid output constraint...

void x(void) {
    asm goto ("":"+r"(&&foo):::foo);
    //                ^ error: lvalue required in 'asm' statement
    foo:;
}

I think you meant to write something like this:

void *x(void) {
    void *p = &&foo;
    asm goto ("":"+r"(p):::foo);
    foo:;
    return p;
}

Damn! Then yeah, this approach (Diff 392533) is still broken.

I must be missing something. Isn't that just simply reassigning the variable p and not using the label foo for output?

The corresponding IR (trimmed for brevity) for @efriedma 's example above is:

%1 = callbr i8* asm "", "=r,X,0,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@x, %3), i8* blockaddress(@x, %3)) #1
        to label %2 [label %3]

So because p is an InOutArg (see also CodeGenFunction::EmitAsmStmt in clang/lib/CodeGen/CGStmt.cpp), technically the first blockaddress in the parameter list of the callbr is the indirect destination, while the second is the hidden tied-input.

InOutArgBias in my Diff 392533 would then be 0, since it's written to assume that the latest blockaddress in the argument list is the final possible indirect destination. @efriedma 's example shows that assumption is incorrect, and thus that Diff 392533 is still subtly broken.

@efriedma 's example shows that the address of the foo label is output. That doesn't make too much sense as an output in C, but it does mean that the address of foo could be written to by the inline asm string. I hope that's not useful to anyone...but I get a sense I might eat those words one day...plz no.

That doesn't make too much sense as an output in C, but it does mean that the address of foo could be written to by the inline asm string. I hope that's not useful to anyone...but I get a sense I might eat those words one day...plz no.

Why?!

$ cat x.c
#include <stdio.h>

int main () {
  void (*foo)(void) = &&bar;
bar:
  printf("lo");
  foo();
}
$ clang x.c
$ ./a.out
lolololololololololololololololololo...

Every day we stray further from God's light...

void added a comment.Dec 8 2021, 12:31 PM

Why?!

$ cat x.c
#include <stdio.h>

int main () {
  void (*foo)(void) = &&bar;
bar:
  printf("lo");
  foo();
}
$ clang x.c
$ ./a.out
lolololololololololololololololololo...

You monster!

As in always convert it to a TargetBlockAddress during selectiondagisel?

Not sure if we need to convert it to a TargetBlockAddress specifically? We just need to make sure we don't copy the value into a register. I didn't dig too deeply into it, but I think we're doing something weird to specifically make "X" not do the right thing, compared to "i"; see TargetLowering::LowerXConstraint.

Wait a second;

void *x(void) {
    void *p = &&foo;
    asm goto ("# %0\n\t# %l1":"+r"(p):::foo);
    foo:;
    return p;
}

doesn't compile on gcc-11:

<source>:3:5: error: invalid 'asm': '%l' operand isn't a label
    3 |     asm goto ("# %0\n\t# %l1":"+r"(p):::foo);
      |     ^~~

if you change the asm string to use %l2 rather than %l1, it compiles (and emits the expected code). I think this is another incompatibility that is likely to harm us in the future. I'm not sure yet whether this is orthogonal to this issue. (I'm not sure whether this is a bug or a feature in GCC; it means that the %0,%1,%2 references in the asm string are ordered: outputs, non-tied-inputs, inputs-tied-to-outputs, then goto-labels. Clang today emits them as outputs, non-tied-inputs, goto-labels, then inputs-tied-to-outputs.

If we fixed that incompatibility (not sure if it's a bug vs feature in GCC), then the approach in this patch would make more sense. That said, I have made the changes that @efriedma described locally and they do seem like a good simplification. They need more testing, but I'm cleaning them up and will hopefully post by EOD.

Finally, I had a thought last night that I've always felt that callbr had a minor design issue that never quite sat right with me. https://reviews.llvm.org/D74947 demonstrates how methods like User::replaceUsesOfWith can subtly break callbr. Since callbr duplicates it indirect destinations twice in its operand list, once for the arg list to the call asm and again just cause (ie. the labels used as indirect destinations show up twice in the actual IR statement), replacing operands is brittle because the duplicate operands need to stay in sync. If the indirect list at the end used placeholders more akin to std::placeholders, then we would know precisely which argument are intended to be indirect destinations, and avoid brittle duplication that breaks when calling User::replaceUsesOfWith. This would be extensive surgery to callbr, but I'd sleep better at night knowing that callbr was less brittle.

if you change the asm string to use %l2 rather than %l1, it compiles (and emits the expected code). I think this is another incompatibility that is likely to harm us in the future. I'm not sure yet whether this is orthogonal to this issue. (I'm not sure whether this is a bug or a feature in GCC; it means that the %0,%1,%2 references in the asm string are ordered: outputs, non-tied-inputs, inputs-tied-to-outputs, then goto-labels. Clang today emits them as outputs, non-tied-inputs, goto-labels, then inputs-tied-to-outputs.

Ouch. We probably need to fix that...

Finally, I had a thought last night that I've always felt that callbr had a minor design issue that never quite sat right with me. https://reviews.llvm.org/D74947 demonstrates how methods like User::replaceUsesOfWith can subtly break callbr. Since callbr duplicates it indirect destinations twice in its operand list, once for the arg list to the call asm and again just cause (ie. the labels used as indirect destinations show up twice in the actual IR statement), replacing operands is brittle because the duplicate operands need to stay in sync. If the indirect list at the end used placeholders more akin to std::placeholders, then we would know precisely which argument are intended to be indirect destinations, and avoid brittle duplication that breaks when calling User::replaceUsesOfWith. This would be extensive surgery to callbr, but I'd sleep better at night knowing that callbr was less brittle.

The current design allows you to use blockaddress constants in code outside the asm goto to compute the destination of an asm goto, I think? At least, the IR definition looks like it allows that; not sure how well it works in practice.

If we're willing to abandon this, we could use an alternate design where we don't use blockaddress at all. Instead, asm callbr has implicit arguments: we just allow the asm to refer to the destinations labels directly. So for a callbr with no explicit arguments, "$0" refers to the first destination. The primary benefit here is that we can remove a bunch of special cases for callbr control flow at the IR level. (I expect that in/after SelectionDAG, we'd keep the current representation? At least, we can consider that separately.)

The current design allows you to use blockaddress constants in code outside the asm goto to compute the destination of an asm goto, I think? At least, the IR definition looks like it allows that; not sure how well it works in practice.

That was the original intent of the design for the IR, yes -- you could e.g. store an array of blockaddresses in a global constant, and have the asm look up where to jump in that. You still need to specify all the possibilities in the "to label..." list on the callbr, but there's no reason for them to be direct input values, in such a case.

Unless you can tie a block address. Hopefully "+r"(&&goto_label) isn't a valid output constraint...

void x(void) {
    asm goto ("":"+r"(&&foo):::foo);
    //                ^ error: lvalue required in 'asm' statement
    foo:;
}

I think you meant to write something like this:

void *x(void) {
    void *p = &&foo;
    asm goto ("":"+r"(p):::foo);
    foo:;
    return p;
}

Sorry, I think my original comment led you in the wrong direction. Thinking about this a bit more, I think the solution is actually something like the following: treat any blockaddress passed to an "X" constraint the same way, regardless of whether there's a callbr involved, and regardless of the position in the input list. I think that comes closest to matching the original design here.

The "with an 'X' constraint" part avoids messing with users passing a blockaddress as a register input or something like that.

Actually, not sure why clang is emitting "X" for asm goto destinations, instead of "i". "i" is explicitly defined to be an immediate, and it's not clear what "X" is supposed to mean. So maybe better to do something like the following:

  1. Make clang emit "i" instead of "X" for asm goto destinations. LLVM already handles blockaddress passed to "i" correctly.
  2. Drop the special case for blockaddress in SelectionDAGBuilder::visitInlineAsm.
  3. Optionally, add some extra handling for "X" in the backend, for backwards compatibility with IR generated by old clang.

1a. https://reviews.llvm.org/D115410
1b. https://reviews.llvm.org/D115311

  1. https://reviews.llvm.org/D115409
  2. <not done/is this necessary?>

if you change the asm string to use %l2 rather than %l1, it compiles (and emits the expected code). I think this is another incompatibility that is likely to harm us in the future. I'm not sure yet whether this is orthogonal to this issue. (I'm not sure whether this is a bug or a feature in GCC; it means that the %0,%1,%2 references in the asm string are ordered: outputs, non-tied-inputs, inputs-tied-to-outputs, then goto-labels. Clang today emits them as outputs, non-tied-inputs, goto-labels, then inputs-tied-to-outputs.

Ouch. We probably need to fix that...

  1. https://reviews.llvm.org/D115471

Finally, I had a thought last night that I've always felt that callbr had a minor design issue that never quite sat right with me. https://reviews.llvm.org/D74947 demonstrates how methods like User::replaceUsesOfWith can subtly break callbr. Since callbr duplicates it indirect destinations twice in its operand list, once for the arg list to the call asm and again just cause (ie. the labels used as indirect destinations show up twice in the actual IR statement), replacing operands is brittle because the duplicate operands need to stay in sync. If the indirect list at the end used placeholders more akin to std::placeholders, then we would know precisely which argument are intended to be indirect destinations, and avoid brittle duplication that breaks when calling User::replaceUsesOfWith. This would be extensive surgery to callbr, but I'd sleep better at night knowing that callbr was less brittle.

The current design allows you to use blockaddress constants in code outside the asm goto to compute the destination of an asm goto, I think? At least, the IR definition looks like it allows that; not sure how well it works in practice.

If we're willing to abandon this, we could use an alternate design where we don't use blockaddress at all. Instead, asm callbr has implicit arguments: we just allow the asm to refer to the destinations labels directly. So for a callbr with no explicit arguments, "$0" refers to the first destination. The primary benefit here is that we can remove a bunch of special cases for callbr control flow at the IR level. (I expect that in/after SelectionDAG, we'd keep the current representation? At least, we can consider that separately.)

So the idea I had (I should probably post this to llvm-dev, rather than re-use this likely to be abandoned code review) was to change:

callbr void asm sideeffect "", "r,X"(i64 42, i8* blockaddress(@test5, %2))
          to label %1 [label %2]

to

callbr void asm sideeffect "", "r,X"(i64 42, i8* blockaddress(@test5, %2))
          to label %1 [i64 1]

(Note: the only difference is in the indirect destination list at the end within the []).

In the first case, the label %2 shows up twice within the operand list of the callbr. Once near the end as an indirect destination, but again in the blockaddress argument list to the asm. Those two need to stay in sync, and D74947 demonstrates how easily User::replaceUsesOfWith can force those repeated operands out of sync.

In the second case, I just use a literal i64 1 which is an index into the argument list (i64 0 would refer to i64 42; the 0th arg to the asm). This is akin/reminiscent of std::placeholders that one might use with std::bind. I don't think we ever reorder operands to an asm call, so we can use User::replaceUsesOfWith on the blockaddress operands/arguments all we want without the indices getting messed up.

This would also have the side effect of fixing the ambiguity described in this exact phab review. For an input like:

; see test5 from this phab review above
callbr void asm sideeffect "# $0\0A\09# ${1:l}\0A\09", "X,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@test5, %2), i8* blockaddress(@test5, %2))
          to label %1 [i64 1]

It's no longer ambiguous which of the 2 matching arguments is the indirect destination; the latter is! So disambiguation and making callbr less brittle to User::replaceUsesOfWith are two benefits IMO of doing such a change.