This is an archive of the discontinued LLVM Phabricator instance.

Differential D38680

[libunwind] Fix handling of DW_CFA_GNU_args_size
ClosedPublic

Authored by joerg on Oct 9 2017, 2:15 AM.

Details

Reviewers

compnerd
rnk
mstorsjo

Commits

rGd0166a0a9377: The semantics of DW_CFA_GNU_args_size have changed subtile over the years.
rUNW337312: The semantics of DW_CFA_GNU_args_size have changed subtile over the
rL337312: The semantics of DW_CFA_GNU_args_size have changed subtile over the

Summary

This effectively reverts SVN r204290 and r204292 (back when this code was part of libcxxabi).

According to SVN r204290, the primary architecture using the DW_CFA_GNU_args_size opcode is VAX.

However, clang also emits it on x86 when the X86CallFrameOptimization pass has run, which happens much more often when the stack is aligned to 4 bytes (the default when targeting Windows).

This issue can be reproduced by building code for x86 Linux (at least 32-bit) with clang and -mstack-alignment=4.

I'm not sure whether this code should be handled differently for VAX with some ifdef, or what the correct interpretation of this opcode is. I ran into this issue while trying to use libunwind for 32-bit Windows (where clang uses a 4-byte stack alignment by default), found this commit, and noticed that reverting it got things working. I later noticed that I could reproduce the same issue on 32-bit x86 Linux as well, by forcing -mstack-alignment=4.

Event Timeline

mstorsjo created this revision. Oct 9 2017, 2:15 AM

mstorsjo mentioned this in D38679: [libunwind] Support dwarf unwinding on i386 windows. Oct 9 2017, 1:19 PM

I think that the problem is that we are using the generic register name, but we need to use the target specific register name. On x86, EIP/ESP are swapped. We should also have a test case for this. I had reduced this down to a simpler test case of:

void f(int,int,int,int,int,int,int,int,int);

int main() {
  try {
    f(0,1,2,3,4,5,6,7,8);
  } catch (int) {
    return 0;
  }
  return 1;
}
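
The reduced case above only declares f, so it cannot be linked on its own. A minimal self-contained sketch follows, assuming f simply throws an int; that definition and the name run_reduced_case are illustrative and not part of the original comment. In a real reproduction f would more plausibly live in a separate translation unit so the call is not inlined away.

```cpp
// Hypothetical definition: the original comment only declares f.
void f(int, int, int, int, int, int, int, int, int) { throw 42; }

// Same control flow as main() in the reduced case, wrapped in a function.
int run_reduced_case() {
  try {
    // Nine int arguments, as in the reduced case.
    f(0, 1, 2, 3, 4, 5, 6, 7, 8);
  } catch (int) {
    return 0; // the exception reached the handler, so unwinding worked
  }
  return 1;
}
```

A return value of 0 corresponds to the success path of the original main().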

This revision now requires changes to proceed. Oct 9 2017, 2:19 PM

In D38680#892487, @compnerd wrote:

I think that the problem is that we are using the generic register name, but we need to use the target specific register name. On x86, EIP/ESP are swapped.

You mean EBP/ESP? I think the code here does the right thing, with UNW_REG_SP always mapping to the actual ESP.

We should also have a test case for this. I had reduced this down to a simpler test case of:
void f(int,int,int,int,int,int,int,int,int);

int main() {
  try {
    f(0,1,2,3,4,5,6,7,8);
  } catch (int) {
    return 0;
  }
  return 1;
}

Oh, great if you can reproduce the issue with that! Feel free to try to dig into the issue and figure out a better fix then.

rnk added inline comments. Oct 11 2017, 2:53 PM

src/libunwind.cpp
194	I think it makes sense to have this here: the contract is that if the personality sets the IP when the cursor pointed to a PC with a non-zero arg size, we should adjust SP for the personality. However, it's not clear to me that we don't need the same adjustment when stepping across frames that use arg size without a frame pointer.

mstorsjo added inline comments. Oct 11 2017, 9:45 PM

src/libunwind.cpp
194	When stepping across frames, the GNU args size is already included as part of the CFA offset, so with the current code it steps too far.

lgtm

src/libunwind.cpp
194	OK, yes, now I see the code that does that. :)

dtzWill added a subscriber: dtzWill. Oct 19 2017, 6:21 PM

Wow, I've finally debugged some unwind failures to this on x86_64 Linux ...

Call-frame optimization introduces it
This fixes the cases I was investigating!!

Thank you!

In D38680#901892, @dtzWill wrote:

Wow, I've finally debugged some unwind failures to this on x86_64 Linux ...

Call-frame optimization introduces it

This fixes the cases I was investigating!!

That's good to hear! FWIW, I've talked to @joerg off-list and he said he'll try to look into it soon.

I've looked at this in some detail now. I'm not exactly sure yet why it is broken. The patch seems quite wrong to me. DW_CFA_GNU_args_size should be applied only when unwinding a call instruction, and in that regard the commit message of the original change is quite correct. What I am still trying to understand is how the precise unwind frame disagrees with the unwinder.

In D38680#903203, @joerg wrote:

I've looked at this in some detail now. I'm not exactly sure yet why it is broken. The patch seems quite wrong to me. DW_CFA_GNU_args_size should be applied only when unwinding a call instruction, and in that regard the commit message of the original change is quite correct. What I am still trying to understand is how the precise unwind frame disagrees with the unwinder.

@joerg Any update on this matter, how should we move forward with unbreaking the unwinding on i386?

In D38680#903203, @joerg wrote:

I've looked at this in some detail now. I'm not exactly sure yet why it is broken. The patch seems quite wrong to me. DW_CFA_GNU_args_size should be applied only when unwinding a call instruction, and in that regard the commit message of the original change is quite correct. What I am still trying to understand is how the precise unwind frame disagrees with the unwinder.

If you look at Clang's output here https://godbolt.org/g/jFcSxz, you can see that we emit precise CFA adjustments for each push. We don't need to adjust the CFA by gnu arg size in UnwindCursor::step, which unwinds through frames. We only apply it when setting up the register context before transitioning to the landingpad. That's why unw_set_reg UNW_REG_IP is at least approximately the right place to do this SP adjustment, IMO.

Basically, a general purpose unwinder that collects return addresses can ignore gnu args size because we already have CFA adjustments, but an unwinder that implements landingpad transitions must adjust SP by gnu arg size as part of that context switch.

At least, that's how I understand this change. Maybe older (VAX?) compilers didn't implement gnu arg size this way, but at least with this patch, we handshake with ourselves correctly.
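
The distinction drawn above, between stepping through a frame and entering a landing pad in it, can be sketched with a toy model. This is not libunwind code: the CallSite struct, its field names, and the three functions are hypothetical, and the model assumes a 32-bit frame without a frame pointer whose CFA offset already counts every push.

```cpp
#include <cstdint>

// Toy model of one call site in a frame without a frame pointer.
struct CallSite {
  std::uint32_t sp;         // SP in the caller at the call instruction
  std::uint32_t cfa_offset; // CFA = SP + cfa_offset; already counts each push
  std::uint32_t args_size;  // operand of DW_CFA_GNU_args_size at this site
};

// Stepping to the parent frame: the per-push CFA adjustments are sufficient.
std::uint32_t cfa_when_stepping(const CallSite &cs) {
  return cs.sp + cs.cfa_offset;
}

// Entering a landing pad in this frame: drop the temporarily pushed arguments.
std::uint32_t sp_for_landing_pad(const CallSite &cs) {
  return cs.sp + cs.args_size;
}

// Adding args_size while stepping as well counts the pushes twice.
std::uint32_t cfa_when_stepping_with_args_size(const CallSite &cs) {
  return cs.sp + cs.args_size + cs.cfa_offset;
}
```

With the made-up values sp = 0xFF0, cfa_offset = 16 and args_size = 12, stepping yields a CFA of 0x1000 and the landing pad gets SP = 0xFFC, while applying args_size during the step overshoots to 0x100C, which matches the "steps too far" symptom described in the inline comments.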

Ping @joerg

Ping @joerg and @rnk - what's a sensible path forward with respect to this? @rnk thinks this change is correct, while @joerg disagrees, but hasn't had time to look into it yet. Is it acceptable to apply this now and then revert later once @joerg has analyzed the issue?

Herald added a subscriber: chrib. Apr 24 2018, 5:30 AM

I'm back to the point where I can't reproduce the problem :( Can we start providing an actual failing test case? It's annoying to debug a problem when you can't reproduce it.

In D38680#1077262, @joerg wrote:

I'm back to the point where I can't reproduce the problem :( Can we start providing an actual failing test case? It's annoying to debug a problem when you can't reproduce it.

My testcase that triggers the issue is available at https://martin.st/temp/hello-exception.cpp. With this input file, I can produce the DW_CFA_GNU_args_size opcodes with this command: clang -target i386-netbsd -O3 -fomit-frame-pointer -S -o - hello-exception.cpp (tested with both clang 4.0 and current trunk).

joerg commandeered this revision. Jun 5 2018, 3:03 PM

joerg edited reviewers, added: mstorsjo; removed: joerg.

After a careful review of newer GCC / libgcc and the assembler annotations from LLVM, I have come to the following conclusions:

(1) The semantics have been somewhat changed by GCC in recent years. There is no actual specification, so we have to go by what behavior actually makes sense.
(2) The primary motivation is still that the DW_CFA_GNU_args_size is a call-site specific annotation. It is expected to be applied when the IP is moved by the personality routine to compensate for the call site specific (temporary) adjustment.
(3) It is not clear whether plain unw_set_ip outside the scope of the Itanium EH handling should have this behavior, so it might need to be split into an internal routine.
(4) LLVM does not produce correct CFA annotation for stdcall and similar cases where the callee removes additional stack space.

The patch covers the first two items.
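
Item (2) can be illustrated with a small stand-in for a cursor. This is a sketch under assumptions, not the libunwind implementation: ToyCursor, args_size_at, refresh_info and set_ip are hypothetical names, and args_size plays the role that info.gp plays in the diff. The point it tries to show is the ordering: the args size belonging to the call site is read before the unwind info is re-derived for the new IP.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for an unwind cursor; all names are illustrative.
struct ToyCursor {
  std::uint32_t ip = 0;
  std::uint32_t sp = 0;
  std::uint32_t args_size = 0; // plays the role of info.gp in the diff
  std::map<std::uint32_t, std::uint32_t> args_size_at; // IP -> args size

  // Re-derive the per-IP info, defaulting to an args size of 0.
  void refresh_info() {
    auto it = args_size_at.find(ip);
    args_size = (it == args_size_at.end()) ? 0 : it->second;
  }

  // Models the personality routine redirecting IP to a landing pad.
  void set_ip(std::uint32_t landing_pad) {
    std::uint32_t call_site_args = args_size; // read before info is replaced
    ip = landing_pad;
    refresh_info();
    sp += call_site_args; // undo the call-site-specific argument pushes
  }
};
```

Reading args_size after refresh_info() would pick up the landing pad's value instead of the call site's, so the order of those two steps matters in this model.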

Sounds sensible to me; I've tested the patch in the setup that showed the issue this original patch tried to fix, and it seems to work fine there.

This makes much more sense. Thanks @joerg

This revision is now accepted and ready to land. Jun 11 2018, 2:09 PM

In D38680#1123018, @joerg wrote:

After a careful review of newer GCC / libgcc and the assembler annotations from LLVM, I have come to the following conclusions:

(1) The semantics have been somewhat changed by GCC in recent years. There is no actual specification, so we have to go by what behavior actually makes sense.
(2) The primary motivation is still that the DW_CFA_GNU_args_size is a call-site specific annotation. It is expected to be applied when the IP is moved by the personality routine to compensate for the call site specific (temporary) adjustment.

Right.

(3) It is not clear whether plain unw_set_ip outside the scope of the Itanium EH handling should have this behavior, so it might need to be split into an internal routine.

I don't know enough about this code to really respond to this.

(4) LLVM does not produce correct CFA annotation for stdcall and similar cases where the callee removes additional stack space.

Here's what we generate for that case today: https://godbolt.org/g/33cNJy
The important part is:

.cfi_escape 0x2e, 0x0c
pushl   $3
.cfi_adjust_cfa_offset 4
pushl   $2
.cfi_adjust_cfa_offset 4
pushl   $1
.cfi_adjust_cfa_offset 4
calll   __Z13may_throw_stdiii@12
.cfi_adjust_cfa_offset -12
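
For readers unfamiliar with the escape bytes: 0x2e is the DW_CFA_GNU_args_size opcode and its operand is a ULEB128 value, so `.cfi_escape 0x2e, 0x0c` encodes an args size of 12, matching the three 4-byte pushes. A minimal decoding sketch follows; the function names are made up for illustration and this is not how libunwind's CFI parser is structured.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint8_t kDW_CFA_GNU_args_size = 0x2e;

// Decode an unsigned LEB128 value starting at p[pos]; advances pos.
std::uint64_t read_uleb128(const std::uint8_t *p, std::size_t len,
                           std::size_t &pos) {
  std::uint64_t result = 0;
  unsigned shift = 0;
  while (pos < len) {
    std::uint8_t byte = p[pos++];
    if (shift < 64)
      result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0) // high bit clear marks the last byte
      break;
  }
  return result;
}

// Returns the args size if the bytes start with DW_CFA_GNU_args_size, else -1.
long long decode_args_size(const std::uint8_t *p, std::size_t len) {
  if (len < 2 || p[0] != kDW_CFA_GNU_args_size)
    return -1;
  std::size_t pos = 1;
  return static_cast<long long>(read_uleb128(p, len, pos));
}
```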

Are you saying that the runtime will calculate the wrong CFA because it will include the .cfi_adjust_cfa_offset -12? As in, adding a nop after the call would fix the glitch? If so, I think the right thing to do would be to fix libunwind to use return_address - 1 when unwinding.
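
The effect being asked about can be shown with a toy CFA table. The addresses and the CfaTable type are hypothetical: the sketch assumes three 2-byte pushes starting at 0x100, a 5-byte call at 0x106, and the `.cfi_adjust_cfa_offset -12` taking effect at the return address 0x10b.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Toy CFA table: key is the first PC at which the offset applies.
using CfaTable = std::map<std::uint32_t, std::int32_t>;

// Offset in effect at pc: the last row whose start address is <= pc.
// Returns 0 if pc precedes every row (not expected in this sketch).
std::int32_t cfa_offset_at(const CfaTable &rows, std::uint32_t pc) {
  auto it = rows.upper_bound(pc);
  if (it == rows.begin())
    return 0;
  return std::prev(it)->second;
}

// Hypothetical layout for the stdcall call sequence quoted above.
CfaTable example_table() {
  return CfaTable{
      {0x100u, 4},  // function entry: only the return address is on the stack
      {0x102u, 8},  // after pushl $3
      {0x104u, 12}, // after pushl $2
      {0x106u, 16}, // after pushl $1; the call instruction starts here
      {0x10bu, 4},  // return address: .cfi_adjust_cfa_offset -12 applies
  };
}
```

In this model a lookup at the return address 0x10b yields an offset of 4, which describes the state after the callee has popped its arguments, whereas a lookup at 0x10a (return address minus 1) yields 16, the state while the call is in progress.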

Ping @joerg

Closed by commit rL337312: The semantics of DW_CFA_GNU_args_size have changed subtile over the (authored by joerg). Jul 17 2018, 12:05 PM

This revision was automatically updated to reflect the committed changes.

Herald added subscribers: llvm-commits, christof. Jul 17 2018, 12:06 PM

Revision Contents

Path                   Size
src/UnwindCursor.hpp   2 lines
src/libunwind.cpp      7 lines

Diff 150049

src/UnwindCursor.hpp

 #error Need _LIBUNWIND_SUPPORT_COMPACT_UNWIND or \
             _LIBUNWIND_ARM_EHABI
 #endif

   // update info based on new PC
   if (result == UNW_STEP_SUCCESS) {
     this->setInfoBasedOnIPRegister(true);
     if (_unwindInfoMissing)
       return UNW_STEP_END;
-    if (_info.gp)
-      setReg(UNW_REG_SP, getReg(UNW_REG_SP) + _info.gp);
   }

   return result;
 }

 template <typename A, typename R>
 void UnwindCursor<A, R>::getInfo(unw_proc_info_t *info) {
   *info = _info;

src/libunwind.cpp

 _LIBUNWIND_EXPORT int unw_set_reg(unw_cursor_t *cursor, unw_regnum_t regNum,
   _LIBUNWIND_TRACE_API("unw_set_reg(cursor=%p, regNum=%d, value=0x%" PRIxPTR ")",
                        static_cast<void *>(cursor), regNum, value);
   typedef LocalAddressSpace::pint_t pint_t;
   AbstractUnwindCursor *co = (AbstractUnwindCursor *)cursor;
   if (co->validReg(regNum)) {
     co->setReg(regNum, (pint_t)value);
     // specical case altering IP to re-find info (being called by personality
     // function)
-    if (regNum == UNW_REG_IP)
+    if (regNum == UNW_REG_IP) {
+      unw_proc_info_t info;
+      co->getInfo(&info);
       co->setInfoBasedOnIPRegister(false);
+      if (info.gp)
+        co->setReg(UNW_REG_SP, co->getReg(UNW_REG_SP) + info.gp);
+    }
     return UNW_ESUCCESS;
   }
   return UNW_EBADREG;
 }


 /// Get value of specified float register at cursor position in stack frame.
 _LIBUNWIND_EXPORT int unw_get_fpreg(unw_cursor_t *cursor, unw_regnum_t regNum,