This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAGBuilder] Add restrictions to EmitFuncArgumentDbgValue
ClosedPublic

Authored by bjope on Feb 4 2019, 11:42 AM.

Details

Summary

This patch fixes PR40587.

When a dbg.value instrinsic is emitted to the DAG
by using EmitFuncArgumentDbgValue the resulting
DBG_VALUE is hoisted to the beginning of the entry
block. I think the idea is to be able to locate
a formal argument already from the start of the
function.
However, EmitFuncArgumentDbgValue only checked that
the value that was used to describe a variable was
originating from a function parameter, not that the
variable itself actually was an argument to the
function. So when for example assigning a local
variable "local" the value from an argument "a",
the DBG_VALUE instruction describing that "local"
had the value of "a" would be hoisted to the
beginning of the function, even if the scope for
"local" started somewhere else (or if "local"
had other values earlier in the function).

This patch adds some logic to EmitFuncArgumentDbgValue
to check that the variable being described actually
is an argument to the function. And that the dbg.value
being lowered already is in the entry block. Otherwise
we bail out, and the dbg.value will be handled as an
ordinary dbg.value (not as a "FuncArgumentDbgValue").

A tricky situation is when both the variable and
the value is related to function arguments, but not
the same argument. This is kind of hard to check.
Instead we verify that we do not describe the same
argument more than once as a "FuncArgumentDbgValue".
This solution works as long as opt has injected a
"first" dbg.value that corresponds to the formal
argument.

Event Timeline

bjope created this revision.Feb 4 2019, 11:42 AM
bjope planned changes to this revision.Feb 4 2019, 11:46 AM

There is some problem with this making the test/DebugInfo/AArch64/inlined-argument.ll fail. The inlined "resources" parameter will appear optimized out. I haven't figured out why yet.

aprantl added inline comments.Feb 4 2019, 12:28 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
4989

This condition looks very much like the condition in https://reviews.llvm.org/D57584. Could it be factored out into a helper function with a reasonable name and documented behavior?

bjope updated this revision to Diff 185558.Feb 6 2019, 7:55 AM

Added some code to make test/DebugInfo/AArch64/inlined-argument.ll pass.
I now allow emitting using ArgDbgValue if the dbg.value intrinsic is
in the beginning of the function entry (before any non-dbg instruction).
This is a little bit hacky, but the best solution I could find.

bjope added inline comments.Feb 6 2019, 8:08 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
1159

I'll investigate this a little bit more. I have a reproducer where this results would result in incorrect hosting (but that is on trunk). With this patch EmitFuncArgumentDbgValue returns false for that reproducer.
But adding an assert in the else clause (after applying this patch), and then running some test suites might give us some new reproducers (or indicate that the EmitFuncArgumentDbgValue()" call is redundant here).

4989

Yes, I'm kind of monitoring the progress of D57584 to see where we end up with these checks (I'm now using the getInlinedAt kind of check here as well). Either we can wait and refactor after these patches has landed, or we can perhaps create a common solution (in one of the patches) based on which patch that lands first.
I got a feeling that somehow they might overlap (both patches are reducing amount of hoisted DBG_VALUE nodes to the entry).

bjope updated this revision to Diff 185571.Feb 6 2019, 9:01 AM

Revert the no longer needed modification in test/DebugInfo/Sparc/subreg.ll

bjope edited the summary of this revision. (Show Details)Feb 6 2019, 9:02 AM
jmorse added a comment.Feb 7 2019, 4:29 AM

Looks great,

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
4997

Typo dgg -> dbg

5025–5032

Will this work with Arguments that are referred to by multiple fragments? Perhaps it doesn't need to, however the two-i32s-packed-in-i64-argument in PR40188 sprung to mind as that produces two DBG_VALUEs based on one Argument.

bjope marked an inline comment as done.Feb 7 2019, 5:19 AM
bjope added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5025–5032

At one point in time I tried using ArgNo = Variable->getArg(), looking at the C-level argument numbers, but then I had a test case that failed due to receiving that single struct variable in multiple registers (multiple fragments).
Now I use the argument numbers from the function on IR level. So at least that test case passed, since we get one fragment per input register. I guess it will be the same if the value is divided in multiple stack slots.

So I actually had problem with fragments in my first attempts to fix this, but I have no (real, non handwritten IR) test cases failing with the current solution.
Maybe there can be scenarios when we have multiple fragments mapping to one argument on IR level? (probably easy to do when doing a handwritten test case, but does it happen in reality)

So we might need to "complicate" this code even more in the future, or even find a totally different solution (the special case handling for function arguments seems a little bit messy IMHO). Maybe the frontend shouldn't use dbg.declare/dbg.value to describe input arguments, maybe we need dbg.argument or some other way to model this (if we really care about debug info for unused arguments, and if we really care about debug info inside the prologue)?

For now, as far as I've seen (mainly when looking at C-programs), this patch seems to improve things. Avoiding faulty placement of DBG_VALUE nodes due to incorrectly treating them as describing function arguments. There is a small risk that it is too defensive in some situations. But instead of getting weird values (like with the faulty hoisting), being defensive in this case means that we would get <optimized out> in the debugger.

bjope marked 2 inline comments as done.Feb 7 2019, 5:22 AM
bjope added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
1159

I've run a bunch of tests (including random programs generated by csmith) and with the changes in EmitFuncArgumentDbgValue I haven't seen any situation when EmitFuncArgumentDbgValue would return true here any longer (and in the past that seems to have been incorrect). So I really think we shold remove the call to EmitFuncArgumentDbgValue here. But I rather do it in a separate patch after this one.

4997

Thanks, I'll fix that!

jmorse added a comment.Feb 8 2019, 4:47 AM

LGTM although I'll wait for someone else to approve, plus it'd be good to hear Adrians view on the fragment scenario.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5025–5032

[...] a test case that failed due to receiving that single struct variable in multiple registers (multiple fragments)

Ah, my use of "fragments" was too vague, it was intended to mean:

Maybe there can be scenarios when we have multiple fragments mapping to one argument on IR level?

Which I tried to produce an example for, but I hit some extra bugs along the way: [0]. In the example code there [0] the struct argument comes in as one i64, and SROA splits the dbg.declare into two dbg.values that refer to different variable fragments in their DIExpressions, but both ultimately refer to the same input Argument. If it weren't for [0] I believe (95%) they'd be salvaged back to two dbg.values operating on the input Argument.

For now, as far as I've seen (mainly when looking at C-programs), this patch seems to improve things.

I agree, the argument-handling-weirdness has bugged me for a while, and this patch reduces the amount false debug information we produce. If multiple fragments of one Argument can be an issue we'll definitely hit it when solving [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=40645

plus it'd be good to hear Adrians view on the fragment scenario.

Do you have any specific questions in mind?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5011

I don't understand why this is necessary?

5023

That is not generally possible. For example:

$ cat /tmp/s.c 
struct s { long long int i, j; };

int f(struct s s) {
  return s.i+s.j;
}

$ clang -g -S -emit-llvm /tmp/s.c -o -
; ModuleID = '/tmp/s.c'
source_filename = "/tmp/s.c"
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.14.0"

%struct.s = type { i64, i64 }

; Function Attrs: noinline nounwind optnone ssp uwtable
define i32 @f(i64 %s.coerce0, i64 %s.coerce1) #0 !dbg !8 {
entry:

Here you'd have two dbg.values referring to the same DILocalVariable, but with different DIExpressions (with different DW_OP_LLVM_fragment). Since this case is so easy to reproduce, it would be good to at least handle it. The opposite (two DILocalVariables referring to the same function parameter) seems less interesting to me.

5025–5032

Maybe there can be scenarios when we have multiple fragments mapping to one argument on IR level? (probably easy to do when doing a handwritten test case, but does it happen in reality)

The Swift compiler splats the elements of structs into individual function arguments and has a function signature optimization pass. If the optimizer were better at preserving debug info this combination could trigger what you are describing.

bjope marked 2 inline comments as done.Feb 9 2019, 4:23 PM
bjope added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5011

One missing piece in the example might be that the function on C level looks like this

void foo(long a, long b) {
  ...
  b = 123;
  ...
  b = a;
}

so "b" is an argument on C level, and %a and` %b` are function argument values.

The assignment b = a results in the last dbg.value where we say that "b" has the value of %a. So both the DIVariable and the value is associated with function arguments, but it is not correct to hoist this dbg.value to the function entry (for example hoisting above the dbg.value mapping "b" to 123), because the assignment is not done at function entry.

So the above maps to the @foo_other_param test.

I also made an assumption that we could not get something like this:

define @foo(i32 %a, i32 %b)  {
      entry:
        call void @llvm.dbg.value(metadata i32 %b, "b",
        ...
        call void @llvm.dbg.value(metadata i32 123, "b"
        ...
        call void @llvm.dbg.value(metadata i32 %b, "b"

where "b" first gets the value %b (at the function entry), then is reassigned another value, and then gets the value %b again. Because then the value of "b" must saved somethere so the value in the last assignment of "b" will not be %b. Now I realize that the above pattern isn't that hard to trigger, just use an input like this:

extern void fie(long);

void foo(long a, long b) {
  long c = b;
  b = 123;
  fie(b);
  b = c;
  fie(b);
}

So we can't really consider a dbg.value as describing a function argument just because "the parameter value V actually maps to the argument described by Variable". I will remove that code comment.

The only solution I can imagine right now is the one that I have implemented (assuming that only the first dbg.value that describes an argument (on IR level) is a "function argument dbg.value").

bjope updated this revision to Diff 186138.Feb 9 2019, 4:56 PM

Updated code comments.

Added a test case involving fragments (a struct split between two input
registers). The test case also involves some alloca statements that appear
before the dbg.value statements that describe the input arguments. So the new
test case should demonstrate why it isn't enough to have the "IsInPrologue"
check and the "VariableIsFunctionInputArg" check. We also need the
FuncInfo.DescribedArgs solution if we want to get the DBG_VALUE instructions
that descibe the functions arguments at the function entry.

bjope marked an inline comment as done.Feb 9 2019, 5:13 PM
bjope added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5023

The code handles the situation with fragments from your example. I've now added test/DebugInfo/X86/dbg-value-funcarg2.ll which is a little bit inspired by your example.

bjope updated this revision to Diff 186140.Feb 9 2019, 5:32 PM

Minor adjustments to the test case (added some missing CHECK-NOT).

plus it'd be good to hear Adrians view on the fragment scenario.

Do you have any specific questions in mind?

I think I meant to write something like "Perhaps Adrian knows a situation where multiple fragments map to one function Argument", but had an attack of vagueness, sorry. (FWIW I think your comment on line 5032 covered that).

aprantl accepted this revision.Feb 11 2019, 9:29 AM

Thanks for the explanation!

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5005

we usually write this as !DL->getInlinedAt()

5041

I assume the in-between elements are zeroed out by default?

This revision is now accepted and ready to land.Feb 11 2019, 9:29 AM
bjope marked 2 inline comments as done.Feb 11 2019, 10:27 AM
bjope added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5005

Yes, I'll fix that before commit.

5041

Yes, but I can add an explicit argument to resize to indicate that we want to initialize new bits to "false" to avoid relying on default arguments.

This revision was automatically updated to reflect the committed changes.