# [DwarfDebug] Dump call site debug info into DWARFClosedPublicActions

Authored by djtodoro on Apr 15 2019, 8:57 AM.

# Details

Summary

Dump DWARF information about call sites and call site parameters into debug info sections.

The patch also provides an interface for interpretation of instructions that could load values of call site parameters in order to generate DWARF info about call site parameters.

Authors: @asowda, @NikolaPrica, @djtodoro, @ivanbaev

# Diff Detail

### Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

djtodoro updated this revision to Diff 201522.May 27 2019, 6:46 AM

djtodoro updated this revision to Diff 202420.May 31 2019, 5:09 AM

-Rebase tests

Herald added a project: Restricted Project. Jun 20 2019, 6:37 AM
llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
473 ↗(On Diff #205799)

Is the fallthrough intentional here, or a merge artifact?

djtodoro marked an inline comment as done.Jul 4 2019, 7:57 AM
llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
473 ↗(On Diff #205799)

We forgot to update this on to Phabricator. Thanks!

djtodoro updated this revision to Diff 208045.Jul 4 2019, 7:59 AM

-Fixing unintentional code during the rebase process

This revision was automatically updated to reflect the committed changes.

Thanks a lot for your work with this patch series! It seems very useful.

Sorry for coming with this post-merge question, but I encountered it first now when trying to adapt this to our downstream target; what limitations are imposed on the MachineOperand pointer that the describeLoadedValue returns? For example, in our downstream target we materialize 0 with a sub $reg,$reg, so we'd have to return a pointer to a MachineOperand that is not part of the instruction. Is that allowed? I guess x86 has the same issue as it materializes 0 with xor.

Perhaps it would be possible to make the hook return a MachineOperand object instead? It could simplify the code a bit in such cases. Or would that be too expensive?

@dstenb Thanks a lot!

That seems like a good catch and we should support such scenario.

I think that it will be expensive to return a MachineOperand object. Could you please provide us with more info about the case? Can we describe such situation through the DIExpression instead?

This commit broke codegen on SystemZ, we're now failing both the test-suite and the multistage bootstrap. Unfortunately this wasn't detected earlier since the SystemZ build bot was down.

The regressions go away when I revert the change to TargetRegisterInfo::isCallerPreservedPhysReg, i.e. change it back to always returning false. Note that this change is the only modification in this patch that has any effect on actual code generation (as opposed to debug info generation) -- was this really intentional?

I have not yet debugged the failure in detail, but I believe the isCallerPreservedPhysReg change is in fact incorrect -- this function was not intended to just mirror getCallPreservedMask (which simply describes the function call ABI), but rather to handle special cases like the PowerPC TOC register, which is not just simply preserved across calls, but rather is invariant throughout the whole function. This is why this check is used to eliminate invariant stores in MachineLICM. The change in this patch breaks that semantics.

This really needs to be fixed before the LLVM 9 release.

djtodoro reopened this revision.Jul 12 2019, 4:31 AM

@uweigand Thanks for that! The commit is reverted with the rL365886.

We didn't want to change the generated code, so we just made a quick fix for this. This should solve the problem.

This revision is now accepted and ready to land.Jul 12 2019, 4:31 AM
djtodoro updated this revision to Diff 209459.Jul 12 2019, 4:35 AM

-Make a wrapper around the getCallPreservedMask() and use it in order to avoid changes in generated code

@dstenb Thanks a lot!

That seems like a good catch and we should support such scenario.

I think that it will be expensive to return a MachineOperand object.

The size of MachineOperand is 32 bytes on x86-64, and the class only contains POD fields, so I guess it should be fairly cheap to copy such objects, or initialize them with compile-time known values? Though, perhaps that would still cause a noticeable increase in compilation time.

I think it would be interesting to benchmark that at least.

Here is an x86-64 reproducer.

extern void callee(int);
void caller() { callee(0); }

When compiled using -O2 -g -Xclang -femit-debug-entry-values it gives the following instruction to try to describe the parameter with:

$edi = XOR32rr undef$edi(tied-def 0), undef $edi, implicit-def dead$eflags

Can we describe such situation through the DIExpression instead?

Oh, maybe! Do you have some suggestions for how that could be done?

@dstenb Thanks a lot, that seems useful!

The size of MachineOperand is 32 bytes on x86-64, and the class only contains POD fields, so I guess it should be fairly cheap to copy such objects, or initialize them with compile-time known values? Though, perhaps that would still cause a noticeable increase in compilation time.

I think it would be interesting to benchmark that at least

Yes, I agree that we should definitely experiment with that.

Here is an x86-64 reproducer.

extern void callee(int);
void caller() { callee(0); }

When compiled using -O2 -g -Xclang -femit-debug-entry-values it gives the following instruction to try to describe the parameter with:

$edi = XOR32rr undef$edi(tied-def 0), undef $edi, implicit-def dead$eflags

Thanks for the test case, I also came up with some "real world" cases with situation like this.

Can we describe such situation through the DIExpression instead?

Oh, maybe! Do you have some suggestions for how that could be done?

I think we could make a DIExpression with the DW_OP_const* , 0 (within ParamLoadedValue) and then handle it properly in collectCallSiteParameters(). WDYT?
In addition, do you think that we could treat that situation like an improvement and create a separate thread for that (so we can re-land this with the latest fix)?

Thanks for the quick revert! The new patch looks reasonable to me.

@uweigand Thanks for the note and comment!

vsk added a comment.Jul 25 2019, 6:52 PM

This has been sitting on my review queue for too long, I'm really sorry for the delay here. I've still only gotten 2/3 of the way through this patch but will make time to finish the rest soon. Thanks for working on this!

I have a correctness concern in 'shouldInterpret'. My other concerns are about documentation/naming.

include/llvm/CodeGen/TargetInstrInfo.h
1684

Could you clarify what RHS mean, here?

lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
945

const auto& Param?

lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
228

Please document these methods and, if it's not too much trouble, mark them const.

237

As the descriptions of the parameters here is fairly complex, could you split the discussion of each parameter into a separate paragraph?

Also, is the parenthetical "(in the case of non-gdb tuning)" in the PCOffset description redundant?

237

Does CallReg=0 for direct calls?

241

Please document constructCallSiteParmEntryDIEs. Specifically, it would help to have a \ref to the function that creates the param array, and to have a note about whether or not the order of the parameters matters.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
539

Can we assert that this insertion into ArgsRegsForProcess occurs? I'm assuming that it's a bug if more than one argument is forward in the same register.

Super nit-pick, but, I don't think having 'ForProcess' in the name is very clear. Something like 'ForwardedRegWorklist' would be more in keeping with the naming in the rest of the compiler.

544

Could you add a comment explaining what RegsForEntryValues is before explaining ShouldTryEmitEntryVals?

548

The comment for 'shouldInterpret' should be updated to reflect the fact that it returns a register, not a bool. Also, consider using a more specific name, like 'getForwardingRegDefinedByMI'.

556

Is the first call to MO.getReg() just checking for undef? I wonder why TRI->isPhysicalRegister() does not make it redundant.

558

Can more than one Def be attached to an MI? 'MachineInstr::getNumDefs()' suggests it is possible. If so, then it seems like there is a bug here when more than one forwarding reg is clobbered by MI?

566

Why are DbgLocVal and Reg passed in by reference?

569

Consider using RegsForEntryValues.find() to avoid doing two lookups.

576

Consider using a for loop here. It will guarantee iterator increment and make it easier to continue early.

586

To simplify this, I suggest continuing early when Reg is undef.

589

These declarations for Op and Expr don't seem necessary.

603

Please use /*IsIndirect=*/IsSPorFP when calling the MachineLocation constructor.

606

I wonder whether two passes through the block are needed, as ArgsRegsForProcess may not be complete when it is referenced in shouldInterpret.

619

Not sure what '1' is for in '{dwarf::DW_OP_entry_value, 1}', perhaps a comment would help?

621

RegsForEntryValues.find to avoid two lookups?

lib/CodeGen/AsmPrinter/DwarfDebug.h
269

unsigned Register; ///< Parameter register at the callee entry point.
DbgValueLoc; ///< Corresponding location for the parameter value at the call site.
272

Assert that Reg isn't undef here?

275

Please mark the DbgCallSiteParam methods const.

test/DebugInfo/MIR/X86/dbgcall-site-interpretation.mir
30

Consider using 'FileCheck -implicit-check-not=DW_TAG_GNU_call_site_parameter %s' to catch all of them.

djtodoro marked 36 inline comments as done.Jul 29 2019, 5:57 AM

@vsk Thanks a lot for such nice review!

include/llvm/CodeGen/TargetInstrInfo.h
1684

Sure.

lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
237

As the descriptions of the parameters here is fairly complex, could you split the discussion of each parameter into a separate paragraph?

Sure.

Also, is the parenthetical "(in the case of non-gdb tuning)" in the PCOffset description redundant?

The comment is updated in order to avoid the ambiguity. GDB uses PC info even for tail calls.

Does CallReg=0 for direct calls?

Yes.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
539

This only adds all the regs into the working list by iterating through the CallSiteInfo.

Super nit-pick, but, I don't think having 'ForProcess' in the name is very clear. Something like 'ForwardedRegWorklist' would be more in keeping with the naming in the rest of the compiler.

Thanks!!

544

Sure.

548

Thanks for this!

556

Yes..Thanks!

558

Hmm...Yes, we have not consider such situation. Thanks!

566

We can pass them as values, and let the compiler to optimize it for us.

606

The working list is complete at the moment. But, we are inserting new registers in order to try generating call site parameters values as entry values (after the loop ends). We are doing that only for the entry MBB (the simplest case), but since the GCC generates the call site values expression as entry values even for the MBBs other then the entry one, we have left the TODO comment for that.

619

Sure.

lib/CodeGen/AsmPrinter/DwarfDebug.h
269

Thanks!

test/DebugInfo/MIR/X86/dbgcall-site-interpretation.mir
30

We just wanted to emphasizes that we did not describe a call site parameter. Since all of this is controlled by an option, the implicit-check-not will require not to use the option on the same test case. But we put an error alerting that the CallSiteInfo (the MF attribute callSites) is provided but not used, so it will be difficult to use the implicit-check-not here.

djtodoro updated this revision to Diff 212158.Jul 29 2019, 5:58 AM
djtodoro marked 12 inline comments as done.

lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
236

nit, tuning

240

nit, matter, and its -> the

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
539

I see. I had in mind:

bool InsertedReg = ForwardedRegWorklist.insert(ArgReg.Reg).second;
assert(InsertedReg && "Single register used to forward two arguments?");

If this is already checked elsewhere, there's no need for a duplicate assert.

568

After FwdReg from MO is pushed into Defs, shouldn't this loop should break? IIUC no other registers should be equal-to/alias FwdReg.

590

Should the basic block scan stop at branches in addition to calls to reduce compile time (MI::isCall() || MI::isBranch())?

591

If the scan reaches a call or a branch, shouldn't it break instead of returning? That way, the ShouldTryEmitEntryVals case can be finished. Perhaps there should be a test?

642

nit, extra indentation here?

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
483

I don't understand the lowering of DW_OP_regx within addMachineRegExpression. Why should the lowering be different within/without parameter values? It seems like the DIExpression itself should be formed with DW_OP_bregx, if/when needed.

lib/CodeGen/TargetInstrInfo.cpp
1143

I do not think that casting the offset to unsigned is safe due to loss of precision. Also, note that the expression generated for the Offset=0 case is inefficient. Is there some way to reuse the code in DIExpression::appendOffset for this?

854

@aprantl would you mind re-stating your concern? The dwarf standard says that DW_OP_reg<n> describes an object in a register. That seems like it would be valid outside of a DW_OP_entry_value block. Is it not?

lib/Target/X86/X86InstrInfo.cpp
7175

We should be able to factor out the logic in DIExpression::{append,prepend}Offset to simplify this.

djtodoro marked 12 inline comments as done.Jul 31 2019, 4:26 AM

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
539

I see.. That makes sense.

568

Yes. Thanks!

590

Actually, that is not necessary, since this will never encounter a branch.

591

If it encounters a call that means we can not consider the 'DW_OP_entry_value' as valid expression with in a call site parameter value, since the call may clobber that, and we stop the analysis as is.

There is a test for that part within the test/DebugInfo/MIR/X86/dbgcall-site-interpretation.mir.

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
483

Yes, thanks! That is better way for sure.

lib/CodeGen/TargetInstrInfo.cpp
1143

I agree, we can use the DIExpression::appendOffset.

lib/Target/X86/X86InstrInfo.cpp
7175

We can use that.

djtodoro updated this revision to Diff 212551.Jul 31 2019, 4:28 AM
djtodoro marked 2 inline comments as done.