This is an archive of the discontinued LLVM Phabricator instance.

[entry values] X86: Describe effects of MOV{8,16}ri (PR45053)
AbandonedPublic

Authored by vsk on Feb 27 2020, 6:29 PM.

Details

Reviewers
dstenb
djtodoro
Summary

We should be able to describe MOV{8,16}ri like MOV32ri and others.

https://bugs.llvm.org/show_bug.cgi?id=45053
rdar://59822099

Diff Detail

Event Timeline

vsk created this revision.Feb 27 2020, 6:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 27 2020, 6:29 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
djtodoro accepted this revision.Feb 27 2020, 11:39 PM

Nice, thanks!

llvm/test/DebugInfo/MIR/X86/call-site-info-mov16ri.mir
48

Unused, so it can be deleted.

115

We do not need all these MF attributes (I think only 'name' and 'callSites').

This revision is now accepted and ready to land.Feb 27 2020, 11:39 PM
dstenb added inline comments.Feb 28 2020, 2:05 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
7811

At least when I tried this out on a 8th generation i7, it seems that MOV8ri and MOV16ri do not modify the upper bits in super-registers.

movl    $287454020, %edi        # imm = 0x11223344                                                                                                                                                                                                                                                                     
mov     $21862, %di             # imm = 0x5566
movb    $119, %dil              # imm = 0x77
callq   call

That produces the value 0x11225577.

In this test case we try to describe:

renamable $rsi = LEA64r $rip, 1, $noreg, @.str, $noreg, debug-location !51
renamable $si = MOV16ri 2112

As Reg is a 32-bit register, I think we want to return something like this for the MOV16ri instruction:

ParamLoadedValue(Reg, (DW_OP_constu, 0xFFFF0000, DW_OP_and, DW_OP_constu, MI.getOperand(1).getImm(), DW_OP_or))
djtodoro added inline comments.Feb 28 2020, 2:28 AM
llvm/lib/Target/X86/X86InstrInfo.cpp
7811

At least when I tried this out on a 8th generation i7, it seems that MOV8ri and MOV16ri do not modify the upper bits in super-registers.

I have not tried debugging it, but I guess this would result a wrong value printed within debugger, and we need this:

As Reg is a 32-bit register, I think we want to return something like this for the MOV16ri instruction:
ParamLoadedValue(Reg, (DW_OP_constu, 0xFFFF0000, DW_OP_and, DW_OP_constu, MI.getOperand(1).getImm(), DW_OP_or))

If that is the case, I agree with that. So the X86::MOV8ri and X86::MOV16rishould be handled as a special case here.

vsk updated this revision to Diff 247260.Feb 28 2020, 7:11 AM

Thank you @dstenb and @djtodoro very much for your feedback.

I think I've addressed the main issue, we now produce DW_AT_call_value (DW_OP_constu 0x840, DW_OP_stack_value, DW_OP_piece 0x2). PTAL.

In D75326#1898071, @vsk wrote:

Thank you @dstenb and @djtodoro very much for your feedback.

I think I've addressed the main issue, we now produce DW_AT_call_value (DW_OP_constu 0x840, DW_OP_stack_value, DW_OP_piece 0x2). PTAL.

I might be incorrect here, but I don't think we are allowed to use DW_OP_piece operations in DW_AT_call_value attributes, as they should be DWARF expressions? As far as I have understood it, the DW_OP_piece operation is only allowed in composite location descriptions, and you can in general not go from location descriptions to DWARF expressions, with the only (?) exception being DW_OP_entry_value operations holding register location descriptions.

(This seems a bit related to D75270, in which I stop emitting entry values around composite location descriptions, since the DWARF standard specifies that those are only allowed to hold DWARF expressions and register location descriptions.)

aprantl added inline comments.
llvm/test/DebugInfo/MIR/X86/call-site-info-mov16ri.mir
86

I'm pretty sure there is a bunch of unnecessary debug info in this test. Do we need more than one local variable?

96

Do we need the TBAA info?

vsk added a comment.EditedFeb 28 2020, 10:09 AM
In D75326#1898071, @vsk wrote:

Thank you @dstenb and @djtodoro very much for your feedback.

I think I've addressed the main issue, we now produce DW_AT_call_value (DW_OP_constu 0x840, DW_OP_stack_value, DW_OP_piece 0x2). PTAL.

I might be incorrect here, but I don't think we are allowed to use DW_OP_piece operations in DW_AT_call_value attributes, as they should be DWARF expressions? As far as I have understood it, the DW_OP_piece operation is only allowed in composite location descriptions, and you can in general not go from location descriptions to DWARF expressions, with the only (?) exception being DW_OP_entry_value operations holding register location descriptions.

(This seems a bit related to D75270, in which I stop emitting entry values around composite location descriptions, since the DWARF standard specifies that those are only allowed to hold DWARF expressions and register location descriptions.)

If I understand your reading correctly, it's that composite location descriptions are not (in general) a kind of DWARF expression.

FWIW, this doesn't line up with my understanding. DWARF5 says "a single location description" may be "a composite location description", which consists of "one or more simple location descriptions", each of which describes "one piece of the object" (2.6.1). And it says that a DWARF expression describes "how to compute a value or specify a location" (2.5). That certainly /sounds/ like a composite location can be a DWARF expression.

Even if I've read the standard incorrectly, I'm not sure what the point of this distinction would be. Having implemented the OP_entry_value support in lldb, istm that there is no difficulty for the debugger in handling a composite location in an entry value. Ah, but reading your comment in D75270, it sounds like GDB doesn't support this. Could that simply be a GDB bug?

If OP_piece in an entry value really /is/ an issue, please confirm, and I'll update the x86 describeLoadedValue to just return 'None' for mov{8,16}ri (it doesn't seem worth it to me to do anything more complicated).

3.4.2. DW_AT_call_value attribute which is a DWARF expression which when evaluated yields the value of the parameter at the time of the call.

Curiously, DW_OP_piece is only described in chapter 2.6 Location descriptions and not in 2.5. DWARF Expressions. I *think* this may just be a bug in the spec. We should raise this question on dwarf-discuss for clarification, so we either learn why it is this way or can fix the spec to be less misleading.

Even if this is against the spec, I currently don't see a reason to artificially restrict LLVM from emitting composite descriptions.

vsk updated this revision to Diff 247392.Feb 28 2020, 5:28 PM
  • Reduce the test a bit more.
dstenb added a comment.Mar 2 2020, 2:06 AM
In D75326#1898465, @vsk wrote:
In D75326#1898071, @vsk wrote:

Thank you @dstenb and @djtodoro very much for your feedback.

I think I've addressed the main issue, we now produce DW_AT_call_value (DW_OP_constu 0x840, DW_OP_stack_value, DW_OP_piece 0x2). PTAL.

I might be incorrect here, but I don't think we are allowed to use DW_OP_piece operations in DW_AT_call_value attributes, as they should be DWARF expressions? As far as I have understood it, the DW_OP_piece operation is only allowed in composite location descriptions, and you can in general not go from location descriptions to DWARF expressions, with the only (?) exception being DW_OP_entry_value operations holding register location descriptions.

(This seems a bit related to D75270, in which I stop emitting entry values around composite location descriptions, since the DWARF standard specifies that those are only allowed to hold DWARF expressions and register location descriptions.)

If I understand your reading correctly, it's that composite location descriptions are not (in general) a kind of DWARF expression.

FWIW, this doesn't line up with my understanding. DWARF5 says "a single location description" may be "a composite location description", which consists of "one or more simple location descriptions", each of which describes "one piece of the object" (2.6.1). And it says that a DWARF expression describes "how to compute a value or specify a location" (2.5). That certainly /sounds/ like a composite location can be a DWARF expression.

Section 2.5 also specifies the following for DWARF expressions: "They are expressed in terms of DWARF operations that operate on a stack of values.". Section 2.6.1.2 specifies "A composite location description describes an object or value which may be contained in part of a register or stored in more than one location. Each piece is described by a composition operation, which does not compute a value nor store any result on the DWARF stack.". I think that those two things speak against composite location description being DWARF expressions.

Section 2.6 specifies the following for single location descriptions: "are a language independent representation of addressing rules of arbitrary complexity built from DWARF expressions (See Section 2.5 on page 26) and/or other DWARF operations specific to describing locations.". There they make a distinction between DWARF expressions, and the "DWARF opertions specific to describing locations", which I assume are the operations listed in 2.6, which include DW_OP_piece.

It would be good to get a clarification on dwarf-discuss, as @aprantl suggests. Does anyone want to put together a mail? Otherwise I can volunteer.

dstenb added a comment.EditedMar 2 2020, 3:14 AM
In D75326#1898465, @vsk wrote:

If OP_piece in an entry value really /is/ an issue, please confirm, and I'll update the x86 describeLoadedValue to just return 'None' for mov{8,16}ri (it doesn't seem worth it to me to do anything more complicated).

I applied this patch on top of 5900d3f2e94f710d73a89931953ce0a3d928c70d. I was unable to get the IR reproducer listed here up and running, but I tried it out with the following example:

caller.c:

extern int call(int);
int main() {
  call(0x11223344);
  return 0;
}

callee.c:

int global;
int call(int param) {
  global = param; // side effect to keep param.
  asm __volatile("movl $0xdeadbeef, %%edi" : : : "rdi"); // clobber reg.
  return 0; // print param here.
}

in which I changed the parameter setup to:

$edi = MOV32ri 287454020, debug-location !15 ; 0x11223344
$di = MOV16ri 21862, debug-location !15 ; 0x5566                                                                                                                                                                                                                                                                           
CALL64pcrel32 @call, csr_64, implicit $rsp, implicit $ssp, implicit killed $edi, implicit-def $rsp, implicit-def $ssp, implicit-def dead $eax, debug-location !15

producing the following call site parameter entry:

0x00000050:       DW_TAG_GNU_call_site_parameter
                    DW_AT_location      (DW_OP_reg5 RDI)
                    DW_AT_GNU_call_site_value   (DW_OP_constu 0x5566, DW_OP_stack_value, DW_OP_piece 0x2)

and the following location list for the parameter in the callee:

DW_AT_location        (0x00000000: 
   [0x0000000000201110, 0x000000000020111b): DW_OP_reg5 RDI
   [0x000000000020111b, 0x000000000020111e): DW_OP_GNU_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)

When I print that in GDB 8.2.1, 0x5566 is printed instead of 0x11225566 which is the actual parameter value:

(gdb) print /x param
$1 = 0x5566
(gdb) disas
Dump of assembler code for function call:
   0x0000000000201110 <+0>:	mov    %edi,0x2eea(%rip)        # 0x204000 <global>
   0x0000000000201116 <+6>:	mov    $0xdeadbeef,%edi
=> 0x000000000020111b <+11>:	xor    %eax,%eax
   0x000000000020111d <+13>:	retq   
End of assembler dump.

EDIT: Sorry, disregard the LLDB part of this I had previously written. It was a PEBKAC due to me providing a mix of -glldb and -ggdb for the caller and callee.

dstenb added a comment.Mar 3 2020, 2:59 AM

I dug around some in the Dwarf-discuss archive, and it seems that when working on DWARFv3 the piece operations were moved out from the "Dwarf Expressions" section to "Location Descriptions" to clarify that they don't push anything to the stack.

Mail: https://lists.linuxfoundation.org/pipermail/dwarf-discuss/2005-June/000206.html
Issue: http://dwarfstd.org/ShowIssue.php?issue=050701.1

Even if this is against the spec, I currently don't see a reason to artificially restrict LLVM from emitting composite descriptions.

The DW_OP_entry_value operation at the callee expects a value to be pushed to the DWARF stack when evaluating the DW_AT_call_value at the caller. How should/would that work with composite descriptions?

vsk added a comment.EditedMar 3 2020, 2:25 PM

Sorry for the delay here.

@dstenb wrote:

When I print that in GDB 8.2.1, 0x5566 is printed instead of 0x11225566 which is the actual parameter value:

Yeah, this sounds like a debugger bug to me. The debugger should print something that indicates that the upper bits of "param" are unknown. Although, to be fair, lldb has the same bug (or it might be worse, and print out "param" as 0x00005566 -- it doesn't track the bits of a variable that are uncovered by pieces).

The DW_OP_entry_value operation at the callee expects a value to be pushed to the DWARF stack when evaluating the DW_AT_call_value at the caller. How should/would that work with composite descriptions?

Well, in lldb at least, I think the idea is that while OP_piece doesn't push anything to the stack, it changes the debugger's notion of which bits in the evaluated result are "valid". So one way this could work is: the result of evaluating the DW_AT_call_value is pushed onto the stack when evaluating DW_OP_entry_value (and if it's e.g. a 2-byte piece, so be it).

All of which sounds a little iffy to me for what appears to be a pretty uncommon case (fwiw I was able to build all of clang & LNT for x86 without hitting this).

dstenb added a comment.Mar 9 2020, 9:04 AM
In D75326#1904157, @vsk wrote:

Sorry for the delay here.

@dstenb wrote:

When I print that in GDB 8.2.1, 0x5566 is printed instead of 0x11225566 which is the actual parameter value:

Yeah, this sounds like a debugger bug to me. The debugger should print something that indicates that the upper bits of "param" are unknown. Although, to be fair, lldb has the same bug (or it might be worse, and print out "param" as 0x00005566 -- it doesn't track the bits of a variable that are uncovered by pieces).

The DW_OP_entry_value operation at the callee expects a value to be pushed to the DWARF stack when evaluating the DW_AT_call_value at the caller. How should/would that work with composite descriptions?

Well, in lldb at least, I think the idea is that while OP_piece doesn't push anything to the stack, it changes the debugger's notion of which bits in the evaluated result are "valid". So one way this could work is: the result of evaluating the DW_AT_call_value is pushed onto the stack when evaluating DW_OP_entry_value (and if it's e.g. a 2-byte piece, so be it).

All of which sounds a little iffy to me for what appears to be a pretty uncommon case (fwiw I was able to build all of clang & LNT for x86 without hitting this).

Sorry for the late response!

Would that only work when you a have a single DW_OP_piece? Or would it be possible to have multiple composition operations in a DW_AT_call_value? What would happen in the latter case?

I'm sorry if I'm a bit persistent here. Please don't let me stand in the way if this can be made to work with LLDB, and the GDB developers can be convinced that them ignoring DW_OP_piece is a bug. I'm personally just not entirely convinced that that is valid DWARF.

Hi all, we have a build failure reported on the D73534, regarding this part. I think we should either land this or land a workaround like this one: D75974.

vsk added a comment.Mar 16 2020, 4:10 PM

@djtodoro thanks for landing a workaround!

In D75326#1904157, @vsk wrote:

Sorry for the delay here.

@dstenb wrote:

When I print that in GDB 8.2.1, 0x5566 is printed instead of 0x11225566 which is the actual parameter value:

Yeah, this sounds like a debugger bug to me. The debugger should print something that indicates that the upper bits of "param" are unknown. Although, to be fair, lldb has the same bug (or it might be worse, and print out "param" as 0x00005566 -- it doesn't track the bits of a variable that are uncovered by pieces).

The DW_OP_entry_value operation at the callee expects a value to be pushed to the DWARF stack when evaluating the DW_AT_call_value at the caller. How should/would that work with composite descriptions?

Well, in lldb at least, I think the idea is that while OP_piece doesn't push anything to the stack, it changes the debugger's notion of which bits in the evaluated result are "valid". So one way this could work is: the result of evaluating the DW_AT_call_value is pushed onto the stack when evaluating DW_OP_entry_value (and if it's e.g. a 2-byte piece, so be it).

All of which sounds a little iffy to me for what appears to be a pretty uncommon case (fwiw I was able to build all of clang & LNT for x86 without hitting this).

Sorry for the late response!

Would that only work when you a have a single DW_OP_piece? Or would it be possible to have multiple composition operations in a DW_AT_call_value? What would happen in the latter case?

That's an interesting question. On one hand, it doesn't seem like there should be anything special about having multiple composition operations. OTOH, if the debugger's representation of an evaluated expression ("Value" in lldb parlance) doesn't track which bits are known, the debugger might evaluate an entry value incorrectly if it encounters a DW_AT_call_value like "[...; DW_OP_piece 4; ..; DW_OP_piece 2]".

I'm sorry if I'm a bit persistent here. Please don't let me stand in the way if this can be made to work with LLDB, and the GDB developers can be convinced that them ignoring DW_OP_piece is a bug. I'm personally just not entirely convinced that that is valid DWARF.

No worries, I think this is great feedback. I think we've just touched on a tricky part of the standard. Do you still have the time to start a discussion about this on the dwarf mailing list?

In D75326#1925517, @vsk wrote:

No worries, I think this is great feedback. I think we've just touched on a tricky part of the standard. Do you still have the time to start a discussion about this on the dwarf mailing list?

Yeah, sure! I can put together a mail for the mailing list today or tomorrow.

Yeah, sure! I can put together a mail for the mailing list today or tomorrow.

I sent a mail: http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2020-March/004610.html

vsk abandoned this revision.Mar 31 2020, 4:20 PM

Yeah, sure! I can put together a mail for the mailing list today or tomorrow.

I sent a mail: http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2020-March/004610.html

Thanks. Reading through the discussion, I do see some amount of consensus about the first sentence of section 2.5 being misleading (or at least confusing) -- this is the bit I cited in argument for allowing composite locations wherever a value is expected. I think we'd need a strong reason to disregard that consensus. So I think we should set this patch aside for now.