Page MenuHomePhabricator

[WebAssembly] Add DBG_VALUE with local operands location in WebAssemblyExplicitLocals pass
Needs ReviewPublic

Authored by yurydelendik on Sep 27 2018, 2:48 PM.

Details

Reviewers
aprantl
dschuff
Summary

Extends DWARF expression larguage to express locals/globals locations. (via
target-index operands atm) (possible variants are: non-virtual registers
or address spaces)

The WebAssemblyExplicitLocals can replace virtual registers to targer-index operand type at the time when WebAssembly backend introduces {get,set,tee}_local instead of
corresponding virtual registers.

Diff Detail

Event Timeline

yurydelendik created this revision.Sep 27 2018, 2:48 PM

There was a question about how easy it will be to reuse or modify LLDB to use with such extension (DW_OP_WASM_location). I looked at DWARFExpression::Evaluate and, I think, it will not be a problem to add support for that. It might be even easier than try to emulate registers mapping to some WASM local or stack operand.

I added DWARF data visualization is the wasm file to the https://wasdk.github.io/wasmcodeexplorer/. Here is a comparison of before and after the patch:

Can you explain how this new operation is meant to work?

include/llvm/BinaryFormat/Dwarf.def
626

0xf6
Did you check against https://sourceware.org/elfutils/DwarfExtensions to see if this is is available?

include/llvm/BinaryFormat/Dwarf.h
62

Is this actually WASM-specific or is LLVM the only implementation and you could use LLVM instead?

Can you explain how this new operation is meant to work?

This particular patch is trying to "save" debug values during the WebAssemblyExplicitLocals pass. Currently they are lost because virtual registers are not replaced. The patch replaces virtual registers with target-index(), that are correspond to the introduced WebAssembly locals, and then with WebAssembly DWARF extensions operators during serialization. (This patch is a first step and it does not fix all WebAssembly passes yet.)

include/llvm/BinaryFormat/Dwarf.h
62

Stack operands, locals and globals are specific to WebAssembly target. (These can be mapped to known primitives such as registers or specific address space, but will require more effort on producer and consumer sides IMHO)

The plan is to make it non-LLVM specific.

Change DW_OP_WASM_location op to 0xED.

0xF6 is taken per https://sourceware.org/elfutils/DwarfExtensions

yurydelendik marked 2 inline comments as done.Nov 9 2018, 1:23 PM
yurydelendik added inline comments.
include/llvm/BinaryFormat/Dwarf.def
626

0xf6 is taken, using 0xed instead

yurydelendik marked an inline comment as done.Dec 18 2018, 6:37 AM

Can you explain how this new operation is meant to work?

@aprantl, here is some description of what we are trying to achieve: https://github.com/WebAssembly/debugging/issues/1#issuecomment-448237641

In the nutshell, the WebAssembly extensions for the DWARF expression will allow us to express WebAssembly specific locations such as for locals, globals and operands stack. The consumers of this information will be limited to LLVM tools, web browser debuggers and AOT/JIT WebAssembly compilers.

yurydelendik retitled this revision from [WIP] [WebAssembly] Add DBG_VALUE with local operands location in WebAssemblyExplicitLocals pass to [WebAssembly] Add DBG_VALUE with local operands location in WebAssemblyExplicitLocals pass.Dec 18 2018, 6:38 AM

Can you explain how this new operation is meant to work?

@aprantl, here is some description of what we are trying to achieve: https://github.com/WebAssembly/debugging/issues/1#issuecomment-448237641

In the nutshell, the WebAssembly extensions for the DWARF expression will allow us to express WebAssembly specific locations such as for locals, globals and operands stack. The consumers of this information will be limited to LLVM tools, web browser debuggers and AOT/JIT WebAssembly compilers.

Does WebAssembly have a concept of registers? You could just define three additional register numbers for WebAssembly and address a stack slot as DW_OP_breg [wasm_stack_register] + 3 etc.. This would need less support in all the involved tools. The link says that it was considered, but rejected, but there is no mention as to why.

Can you explain how this new operation is meant to work?

@aprantl, here is some description of what we are trying to achieve: https://github.com/WebAssembly/debugging/issues/1#issuecomment-448237641

In the nutshell, the WebAssembly extensions for the DWARF expression will allow us to express WebAssembly specific locations such as for locals, globals and operands stack. The consumers of this information will be limited to LLVM tools, web browser debuggers and AOT/JIT WebAssembly compilers.

Does WebAssembly have a concept of registers? You could just define three additional register numbers for WebAssembly and address a stack slot as DW_OP_breg [wasm_stack_register] + 3 etc.. This would need less support in all the involved tools. The link says that it was considered, but rejected, but there is no mention as to why.

There is no registers in wasm. There are (somewhat) unlimited numbers of locals and stack depth per function, globals per module. Maybe there will be more types of elements, e.g.table items? Yes, I entertained idea of complex encoding scheme as <register number> = <type>*3 + <index>. IHMO it is to brittle and does not carry any value, except to be decodable by tools that do not know how to handle it, and at the same time adds a burden to the tools that will know how to deal with and decode that. That's just my opinion. I also found that the tools (e.g. LLVM or LLDB) are eager to know about number of registers (which we cannot determine) to allocate internal structures.

aprantl added a comment.EditedDec 18 2018, 9:11 AM
I also found that the tools (e.g. LLVM or LLDB) are eager to know about number of registers (which we cannot determine) to allocate internal structures.

What I was thinking of was to define exactly three WASM DWARF register numbers: wasm_reg_local, wasm_reg_global, wasm_reg_stack and use DW_OP_breg regnum offset like you would use the new special operation introduced by this patch.

I also found that the tools (e.g. LLVM or LLDB) are eager to know about number of registers (which we cannot determine) to allocate internal structures.

What I was thinking of was to define exactly three WASM DWARF register numbers: wasm_reg_local, wasm_reg_global, wasm_reg_stack and use DW_OP_breg regnum offset like you would use the new special operation introduced by this patch.

Okay, let me try that. It will be sufficient to try it out, since signatures are identical, just spec defined meaning is different (it is not a memory location and offset is not bytes) from what we are trying to achieve, and that's my main concern. I'll make the changes.

Replace DW_OP_wasm_location with DW_OP_breg

@aprantl Is the advantage of your suggested approach just that we don't have to define a new expression type? Obviously the interpretation is not the same as DW_OP_breg on other targets so as you say, either way there would have to be special logic in all the tools that consume it. Is this kind of repurposing of builtin primitives common?

lib/CodeGen/AsmPrinter/DebugLocEntry.h
24

This is very generic and doesn't really say what it's for, or what any of the fields are for. I get that this is trying to be target-independent but maybe it would be easier to understand if it wasn't. It looks like the index corresponds now to the dwarf register (e.g. to indicate that this is for a local or global), and the offset to indicate which one, but that seems backwards because we use the term "index" to refer to e.g. the index of the global or local.
Really what we have might in LLVM terms is a "kind" (i.e. a kind of location, local or stack or global) and an index (which local). "Index" is still a pretty generic term but it does match the usage of the term "index space" in the spec (and I would expect that our numbering should be the same, e.g. sharing the local index space for arguments and local variables).

@aprantl Is the advantage of your suggested approach just that we don't have to define a new expression type? Obviously the interpretation is not the same as DW_OP_breg on other targets so as you say, either way there would have to be special logic in all the tools that consume it. Is this kind of repurposing of builtin primitives common?

DWARF leaves a lot of details to the ABI of a platform so I would say it is not uncommon. But since it was mentioned that the indices aren't really offsets I see how using custom operators would be cleaner. I'm not against this approach, I was just trying to point out that an alternative less invasive encoding could be used that might be less confusing to existing consumers of debug information.

@aprantl Is the advantage of your suggested approach just that we don't have to define a new expression type? Obviously the interpretation is not the same as DW_OP_breg on other targets so as you say, either way there would have to be special logic in all the tools that consume it. Is this kind of repurposing of builtin primitives common?

I'm not against this approach, I was just trying to point out that an alternative less invasive encoding could be used that might be less confusing to existing consumers of debug information.

There will no be existing consumers of WebAssembly debug information, except probably llvm-dwarfdump. Investing time into extension of LLDB/GDB is a risk and may not pay off. The new consumers will be web browser debugger and AOT compilers, which will be familiar with the extensions.

I think there will eventually be more consumers, including traditional-style debuggers (even if just for non-web use cases).
I had initially hoped that we could get away with just using DWARF registers more-or-less directly for locals, but as Yury pointed out, there are a lot of places that assume a fixed number of architectural registers. And of course wasm globals and the wasm value stack don't really have analogs on most architectures. I still hope we'll be able to use DWARF's CFA and other stack stuff for the stack in linear memory. (Although in our case the stack pointer will be a global rather than a register. Hopefully we can make that work.)

We could experiment with the register-based approach for now since it wouldn't require coordinating any global numbers or anything.

yurydelendik marked an inline comment as done.Dec 19 2018, 10:29 AM
yurydelendik added inline comments.
lib/CodeGen/AsmPrinter/DebugLocEntry.h
24

Here I was trying to us existing MO_TargetIndex "Target-dependent index+offset operand" as it is defined in LLVM (similar to lib/Target/AMDGPU/AMDGPU.h).

I also pondering the idea of using the Offset as "memory" offset by introducing the imaginary array/layout for e.g. locals with indices 0, 1, 2,.. will have offsets +0, +16, +32, ...

The alternative is to define MachineOperandType (e.g. MO_WasmLocation), but it will be more invasive

Come to think of it, might we actually want the possibility of constant offset in addition to the "kind" and index field, e.g. for cases where the value is actually a pointer?

Come to think of it, might we actually want the possibility of constant offset in addition to the "kind" and index field, e.g. for cases where the value is actually a pointer?

The idea was target-index() will define the (virtual?) location for the local/global/operand, so it is a reference, and it can further participate in DWARF expression calculations using the plus and deref operations if needed. Having target-index() as location/ref may pay-off when structure is stored in a local (e.g. flags) (not sure if it is the case in clang/llvm).

  • revert breg use due to incompatibility with DWARF definition
  • rebase to git llvm-project

Add simple test

Minor changes to comment; move #include

nikic added a subscriber: nikic.Jul 12 2019, 7:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2019, 7:39 AM

Let's work toward getting this landed!
@yurydelendik Your note mentioned that you reverted the breg version based on some kind of incompatibility, can you say more about that?

yurydelendik added a comment.EditedJul 13 2019, 6:59 AM

Let's work toward getting this landed!
@yurydelendik Your note mentioned that you reverted the breg version based on some kind of incompatibility, can you say more about that?

breg<i> n literally means to read register i content and add n offset (and that will be used as location address of value). What the above "breg" encoding does is it assigns reg 0 as an "virtual array" location of locals, reg 1 as a "virtual array" location of globals, etc., which makes it looks like a workaround since locals, globals or operand stack are not always stored in continues memory segments.

Incompatibilities or conflicts with the DWARF spec definitions:

  1. unclear what offset n means - is it an index or byte offset (e.g. 8 can mean 8th local or low byte of 1th local)?
  2. whole encoding scheme expresses a location address rather than a location (we can append extra DW_OP_deref (?) operation thus increasing size of encoding and risking that tools might break/optimize it)

So far there no benefits to us to choose breg encoding: it is not extensible and we cannot further extend it (e.g. beyond the single n offset parameter). Also generic DWARF (optimization?) tools has to track that they are in a WebAssembly context to choose different logic for breg.