Page MenuHomePhabricator

[DwarfDebug] Dump call site debug info into DWARF
ClosedPublic

Authored by djtodoro on Apr 15 2019, 8:57 AM.

Details

Summary

Dump DWARF information about call sites and call site parameters into debug info sections.

The patch also provides an interface for interpretation of instructions that could load values of call site parameters in order to generate DWARF info about call site parameters.

Authors: @asowda, @NikolaPrica, @djtodoro, @ivanbaev

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

@dstenb Thanks a lot for your comments!

djtodoro updated this revision to Diff 201522.May 27 2019, 6:46 AM

-Addressing comments

djtodoro updated this revision to Diff 202420.May 31 2019, 5:09 AM

-Rebase tests

Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2019, 6:37 AM
dstenb added inline comments.Jul 4 2019, 7:09 AM
llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
473 ↗(On Diff #205799)

Is the fallthrough intentional here, or a merge artifact?

djtodoro marked an inline comment as done.Jul 4 2019, 7:57 AM
djtodoro added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
473 ↗(On Diff #205799)

We forgot to update this on to Phabricator. Thanks!

djtodoro updated this revision to Diff 208045.Jul 4 2019, 7:59 AM

-Fixing unintentional code during the rebase process

This revision was automatically updated to reflect the committed changes.

Thanks a lot for your work with this patch series! It seems very useful.

Sorry for coming with this post-merge question, but I encountered it first now when trying to adapt this to our downstream target; what limitations are imposed on the MachineOperand pointer that the describeLoadedValue returns? For example, in our downstream target we materialize 0 with a sub $reg, $reg, so we'd have to return a pointer to a MachineOperand that is not part of the instruction. Is that allowed? I guess x86 has the same issue as it materializes 0 with xor.

Perhaps it would be possible to make the hook return a MachineOperand object instead? It could simplify the code a bit in such cases. Or would that be too expensive?

@dstenb Thanks a lot!

That seems like a good catch and we should support such scenario.

I think that it will be expensive to return a MachineOperand object. Could you please provide us with more info about the case? Can we describe such situation through the DIExpression instead?

This commit broke codegen on SystemZ, we're now failing both the test-suite and the multistage bootstrap. Unfortunately this wasn't detected earlier since the SystemZ build bot was down.

The regressions go away when I revert the change to TargetRegisterInfo::isCallerPreservedPhysReg, i.e. change it back to always returning false. Note that this change is the only modification in this patch that has any effect on actual code generation (as opposed to debug info generation) -- was this really intentional?

I have not yet debugged the failure in detail, but I believe the isCallerPreservedPhysReg change is in fact incorrect -- this function was not intended to just mirror getCallPreservedMask (which simply describes the function call ABI), but rather to handle special cases like the PowerPC TOC register, which is not just simply preserved across calls, but rather is invariant throughout the whole function. This is why this check is used to eliminate invariant stores in MachineLICM. The change in this patch breaks that semantics.

This really needs to be fixed before the LLVM 9 release.

djtodoro reopened this revision.Jul 12 2019, 4:31 AM

@uweigand Thanks for that! The commit is reverted with the rL365886.

We didn't want to change the generated code, so we just made a quick fix for this. This should solve the problem.

This revision is now accepted and ready to land.Jul 12 2019, 4:31 AM
djtodoro updated this revision to Diff 209459.Jul 12 2019, 4:35 AM

-Make a wrapper around the getCallPreservedMask() and use it in order to avoid changes in generated code

@dstenb Thanks a lot!

That seems like a good catch and we should support such scenario.

I think that it will be expensive to return a MachineOperand object.

The size of MachineOperand is 32 bytes on x86-64, and the class only contains POD fields, so I guess it should be fairly cheap to copy such objects, or initialize them with compile-time known values? Though, perhaps that would still cause a noticeable increase in compilation time.

I think it would be interesting to benchmark that at least.

Could you please provide us with more info about the case?

Here is an x86-64 reproducer.

extern void callee(int);
void caller() { callee(0); }

When compiled using -O2 -g -Xclang -femit-debug-entry-values it gives the following instruction to try to describe the parameter with:

$edi = XOR32rr undef $edi(tied-def 0), undef $edi, implicit-def dead $eflags

Can we describe such situation through the DIExpression instead?

Oh, maybe! Do you have some suggestions for how that could be done?

@dstenb Thanks a lot, that seems useful!

The size of MachineOperand is 32 bytes on x86-64, and the class only contains POD fields, so I guess it should be fairly cheap to copy such objects, or initialize them with compile-time known values? Though, perhaps that would still cause a noticeable increase in compilation time.

I think it would be interesting to benchmark that at least

Yes, I agree that we should definitely experiment with that.

Here is an x86-64 reproducer.

extern void callee(int);
void caller() { callee(0); }

When compiled using -O2 -g -Xclang -femit-debug-entry-values it gives the following instruction to try to describe the parameter with:

$edi = XOR32rr undef $edi(tied-def 0), undef $edi, implicit-def dead $eflags

Thanks for the test case, I also came up with some "real world" cases with situation like this.

Can we describe such situation through the DIExpression instead?

Oh, maybe! Do you have some suggestions for how that could be done?

I think we could make a DIExpression with the DW_OP_const* , 0 (within ParamLoadedValue) and then handle it properly in collectCallSiteParameters(). WDYT?
In addition, do you think that we could treat that situation like an improvement and create a separate thread for that (so we can re-land this with the latest fix)?

Thanks for the quick revert! The new patch looks reasonable to me.

@uweigand Thanks for the note and comment!

vsk added a comment.Thu, Jul 25, 6:52 PM

This has been sitting on my review queue for too long, I'm really sorry for the delay here. I've still only gotten 2/3 of the way through this patch but will make time to finish the rest soon. Thanks for working on this!

I have a correctness concern in 'shouldInterpret'. My other concerns are about documentation/naming.

include/llvm/CodeGen/TargetInstrInfo.h
1696

Could you clarify what RHS mean, here?

lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
970

const auto& Param?

lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
230

Please document these methods and, if it's not too much trouble, mark them const.

240

As the descriptions of the parameters here is fairly complex, could you split the discussion of each parameter into a separate paragraph?

Also, is the parenthetical "(in the case of non-gdb tuning)" in the PCOffset description redundant?

240

Does CallReg=0 for direct calls?

244

Please document constructCallSiteParmEntryDIEs. Specifically, it would help to have a \ref to the function that creates the param array, and to have a note about whether or not the order of the parameters matters.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
580

Can we assert that this insertion into ArgsRegsForProcess occurs? I'm assuming that it's a bug if more than one argument is forward in the same register.

Super nit-pick, but, I don't think having 'ForProcess' in the name is very clear. Something like 'ForwardedRegWorklist' would be more in keeping with the naming in the rest of the compiler.

585

Could you add a comment explaining what RegsForEntryValues is before explaining ShouldTryEmitEntryVals?

589

The comment for 'shouldInterpret' should be updated to reflect the fact that it returns a register, not a bool. Also, consider using a more specific name, like 'getForwardingRegDefinedByMI'.

597

Is the first call to MO.getReg() just checking for undef? I wonder why TRI->isPhysicalRegister() does not make it redundant.

599

Can more than one Def be attached to an MI? 'MachineInstr::getNumDefs()' suggests it is possible. If so, then it seems like there is a bug here when more than one forwarding reg is clobbered by MI?

607

Why are DbgLocVal and Reg passed in by reference?

610

Consider using RegsForEntryValues.find() to avoid doing two lookups.

617

Consider using a for loop here. It will guarantee iterator increment and make it easier to continue early.

627

To simplify this, I suggest continuing early when Reg is undef.

630

These declarations for Op and Expr don't seem necessary.

644

Please use /*IsIndirect=*/IsSPorFP when calling the MachineLocation constructor.

647

I wonder whether two passes through the block are needed, as ArgsRegsForProcess may not be complete when it is referenced in shouldInterpret.

660

Not sure what '1' is for in '{dwarf::DW_OP_entry_value, 1}', perhaps a comment would help?

662

RegsForEntryValues.find to avoid two lookups?

lib/CodeGen/AsmPrinter/DwarfDebug.h
261

Consider adding some detail, like:

unsigned Register; ///< Parameter register at the callee entry point.
DbgValueLoc; ///< Corresponding location for the parameter value at the call site.
264

Assert that Reg isn't undef here?

267

Please mark the DbgCallSiteParam methods const.

test/DebugInfo/MIR/X86/dbgcall-site-interpretation.mir
29

Consider using 'FileCheck -implicit-check-not=DW_TAG_GNU_call_site_parameter %s' to catch all of them.

djtodoro marked 36 inline comments as done.Mon, Jul 29, 5:57 AM

@vsk Thanks a lot for such nice review!

include/llvm/CodeGen/TargetInstrInfo.h
1696

Sure.

lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
240

As the descriptions of the parameters here is fairly complex, could you split the discussion of each parameter into a separate paragraph?

Sure.

Also, is the parenthetical "(in the case of non-gdb tuning)" in the PCOffset description redundant?

The comment is updated in order to avoid the ambiguity. GDB uses PC info even for tail calls.

Does CallReg=0 for direct calls?

Yes.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
580

This only adds all the regs into the working list by iterating through the CallSiteInfo.

Super nit-pick, but, I don't think having 'ForProcess' in the name is very clear. Something like 'ForwardedRegWorklist' would be more in keeping with the naming in the rest of the compiler.

Thanks!!

585

Sure.

589

Thanks for this!

597

Yes..Thanks!

599

Hmm...Yes, we have not consider such situation. Thanks!

607

We can pass them as values, and let the compiler to optimize it for us.

647

The working list is complete at the moment. But, we are inserting new registers in order to try generating call site parameters values as entry values (after the loop ends). We are doing that only for the entry MBB (the simplest case), but since the GCC generates the call site values expression as entry values even for the MBBs other then the entry one, we have left the TODO comment for that.

660

Sure.

lib/CodeGen/AsmPrinter/DwarfDebug.h
261

Thanks!

test/DebugInfo/MIR/X86/dbgcall-site-interpretation.mir
29

We just wanted to emphasizes that we did not describe a call site parameter. Since all of this is controlled by an option, the implicit-check-not will require not to use the option on the same test case. But we put an error alerting that the CallSiteInfo (the MF attribute callSites) is provided but not used, so it will be difficult to use the implicit-check-not here.

djtodoro updated this revision to Diff 212158.Mon, Jul 29, 5:58 AM
djtodoro marked 12 inline comments as done.

-Addressing comments

vsk added inline comments.Mon, Jul 29, 1:56 PM
lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
240

nit, tuning

243

nit, matter, and its -> the

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
580

I see. I had in mind:

bool InsertedReg = ForwardedRegWorklist.insert(ArgReg.Reg).second;
assert(InsertedReg && "Single register used to forward two arguments?");

If this is already checked elsewhere, there's no need for a duplicate assert.

609

After FwdReg from MO is pushed into Defs, shouldn't this loop should break? IIUC no other registers should be equal-to/alias FwdReg.

631

Should the basic block scan stop at branches in addition to calls to reduce compile time (MI::isCall() || MI::isBranch())?

632

If the scan reaches a call or a branch, shouldn't it break instead of returning? That way, the ShouldTryEmitEntryVals case can be finished. Perhaps there should be a test?

683

nit, extra indentation here?

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
479

I don't understand the lowering of DW_OP_regx within addMachineRegExpression. Why should the lowering be different within/without parameter values? It seems like the DIExpression itself should be formed with DW_OP_bregx, if/when needed.

lib/CodeGen/TargetInstrInfo.cpp
1146

I do not think that casting the offset to unsigned is safe due to loss of precision. Also, note that the expression generated for the Offset=0 case is inefficient. Is there some way to reuse the code in DIExpression::appendOffset for this?

lib/IR/DebugInfoMetadata.cpp
856

@aprantl would you mind re-stating your concern? The dwarf standard says that DW_OP_reg<n> describes an object in a register. That seems like it would be valid outside of a DW_OP_entry_value block. Is it not?

lib/Target/X86/X86InstrInfo.cpp
7408

We should be able to factor out the logic in DIExpression::{append,prepend}Offset to simplify this.

djtodoro marked 12 inline comments as done.Wed, Jul 31, 4:26 AM

@vsk Thanks for the comments!

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
580

I see.. That makes sense.

609

Yes. Thanks!

631

Actually, that is not necessary, since this will never encounter a branch.

632

If it encounters a call that means we can not consider the 'DW_OP_entry_value' as valid expression with in a call site parameter value, since the call may clobber that, and we stop the analysis as is.

There is a test for that part within the test/DebugInfo/MIR/X86/dbgcall-site-interpretation.mir.

lib/CodeGen/AsmPrinter/DwarfExpression.cpp
479

Yes, thanks! That is better way for sure.

lib/CodeGen/TargetInstrInfo.cpp
1146

I agree, we can use the DIExpression::appendOffset.

lib/Target/X86/X86InstrInfo.cpp
7408

We can use that.

djtodoro updated this revision to Diff 212551.Wed, Jul 31, 4:28 AM
djtodoro marked 2 inline comments as done.

-Addressing comments

vsk accepted this revision.Wed, Jul 31, 8:20 AM

Thanks, lgtm.

This revision was automatically updated to reflect the committed changes.