This is an archive of the discontinued LLVM Phabricator instance.

[CSInfo][clang][ISEL][RISCV] Support for recovering optimized debug function parameters
Needs ReviewPublic

Authored by mmatic05 on Dec 27 2021, 5:00 AM.

Details

Summary

This patch enables functionality in clang frontend and adds call site info
generation support for RISCV targets (riscv64, riscv32).

Debug entry values functionality provides debug information about
call sites and function parameters values at the call entry spot.
Condition for generating this type of information is
compiling with -g option and optimization level higher than zero(-O0).

In ISEL phase, while lowering call instructions, collect info about registers
that forward arguments into following function frame.
We store such info into MachineFunction of the caller function.
This is used very late, when dumping DWARF info about call site parameters.

The call site info is visible at MIR level, as callSites attribute of MachineFunction.
To deal with callSites attribute, we should pass -emit-call-site-info option to llc.

When parameter's value is loaded into a register by RISCV instructions
(which are processed by the solution) , it could be described correctly and emitted
as DW_TAG_call_site_parameter.

We can see increased numbers of parameters (DW_TAG_call_site_parameter) as a consequence of the implemented solution:

  • clang binary file (compile with "-g -O3")
$ llvm-dwarfdump --statistics clang
...
"call site parameter DIEs": 558
...
  • opt binary file (compile with "-g -O3")
$ llvm-dwarfdump --statistics opt
...
"call site parameter DIEs": 1184
...
  • llc binary file (compile with "-g -O3")
$ llvm-dwarfdump --statistics llc
...
"call site parameter DIEs": 375
...

Diff Detail

Unit TestsFailed

Event Timeline

mmatic05 created this revision.Dec 27 2021, 5:00 AM
mmatic05 requested review of this revision.Dec 27 2021, 5:00 AM
mmatic05 edited the summary of this revision. (Show Details)Dec 27 2021, 5:04 AM

Thanks for this. Some initial comments inline.

Please apply the suggestions to all tests.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9141

This is an invariant, could we hoist it outside the loop?

llvm/test/CodeGen/RISCV/call-site-info-output_riscv32.ll
39–42

not needed

55

I think the lifetime intrinsics are not needed for this.

86–90

we don't need most of these

96
97
104
137–141

all of these could use just one DILocation

Can we put all the dbg-call-site-param-* tests in one MIR file?

mmatic05 updated this revision to Diff 396541.Dec 29 2021, 8:46 AM
  • addressed comments

Thanks for the suggestions!

ormris removed a subscriber: ormris.Jan 18 2022, 10:21 AM

This whole scheme seems like a pretty bad idea, is that really how the debug infrastructure works, scavenging generated code to figure out what arguments were passed rather than just remembering from the input? This all seems pretty fragile. I mean, if that's what we need to do to make it work then I guess we have to, but I really do not like this code...

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1910

Such as?....

1935

Does this need to be a member function? It doesn't seem to access any state from what I can tell, only its inputs.

Also, the name is misleading. isAddImmediate suggests it's checking for RISCV::ADDI (which would be a pointless function), but it in fact does many more things than that.

1943

This naming convention isn't used anywhere else in the backend that I know of

1963

Does this case really matter for this? Why would we ever generate ANDI X0, 0 as the input to a function? Same goes for a whole bunch of these cases, that would be terrible codegen if we generate them.

Hi @jrtc27:)

This whole scheme seems like a pretty bad idea, is that really how the debug infrastructure works, scavenging generated code to figure out what arguments were passed rather than just remembering from the input? This all seems pretty fragile. I mean, if that's what we need to do to make it work then I guess we have to, but I really do not like this code...

I am not sure If I understood this, can you please elaborate what exactly is bad idea?

This is all part of the Debug Entry Values feature that improves debugging user experience of optimized code. The feature was divided into two parts: callee and caller. The callee part implements production of DW_OP_entry_value DWARF expressions, which indicates that debugger should jump into parent's frame and look for a value of the parameter that was available at the call-site. The info at call site we call caller side. The caller side generates info about parameter's value. We need an info about what register was used for passing arguments since we need to express the value that was passed in, in terms of DWARF (expression), very late in the pipeline (when doing the AsmPrinter/DwarfDebug). During the ISel phase, the compiler remembers the info into the MF attribute, so it can be used later.

An alternative was to introduce some additional metadata, that would be generated at fronted, but keeping such info (valid) throughout pipeline is very expensive.

djtodoro added inline comments.Feb 1 2022, 2:13 AM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1915–1916

This comment sounds confusing -- please fix it.

1935

I guess a more appropriate name would be something like isAddImmLike()

1943

+1 to this, please refactor it

llvm/test/DebugInfo/MIR/RISCV/dbg-call-site-param.mir
485–516 ↗(On Diff #396541)

I believe we don't need all of these attrs.

djtodoro added inline comments.Feb 1 2022, 6:01 AM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1935

In addition to this -- please take a look into the X86 implementation of the describeLoadedValue(). I think that some of these instructions here are moveImmLike (since the offset ends up being 0) rather than addImmLike instructions -- so those could/should be moved out of this predicate for sure.

mmatic05 updated this revision to Diff 412674.Mar 3 2022, 4:32 AM
  • addressed comments
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 4:32 AM