This is an archive of the discontinued LLVM Phabricator instance.


Differential D56275

x86 interrupt calling convention: Fix argument offsets
Abandoned · Public

Authored by phil-opp on Jan 3 2019, 8:46 AM.


Details

Reviewers

craig.topper
rnk

Summary

The x86 interrupt calling convention uses custom stack offsets for the parameters because they are pushed directly by the CPU. Thus the calling convention requires MFI.setObjectOffset(FI, Offset) calls before all returns in LowerMemArgument in order to set the correct offset. Commit 4c3428b604324ed1528a78dbc31c8c8805d3c3e6 introduced new code paths without MFI.setObjectOffset(FI, Offset), which broke the argument passing for the x86 interrupt calling convention in some cases.

For example, it resulted in incorrect error code arguments in Rust binaries, but only when compiled without optimizations: https://github.com/rust-lang/rust/issues/57270

This patch fixes this bug by adding the setObjectOffset calls to the two new code paths.
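The failure mode described in the summary can be illustrated with a toy model. This is only a sketch: `ToyFrameInfo`, `Path`, and `lowerMemArg` are hypothetical stand-ins and not the LLVM `MachineFrameInfo` API, and the offsets used with it are illustrative values. It shows how a return path that creates a fixed object but skips the interrupt-specific adjustment leaves the object at the generic location offset.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a frame-info table (not the LLVM API).
struct ToyFrameInfo {
  std::vector<int> offsets;

  // Creates a fixed object at the given offset and returns its index.
  int createFixedObject(int offset) {
    offsets.push_back(offset);
    return static_cast<int>(offsets.size()) - 1;
  }
  void setObjectOffset(int fi, int offset) {
    assert(fi >= 0 && fi < static_cast<int>(offsets.size()));
    offsets[fi] = offset;
  }
  int getObjectOffset(int fi) const {
    assert(fi >= 0 && fi < static_cast<int>(offsets.size()));
    return offsets[fi];
  }
};

enum class Path { Plain, CopyElision };

// Models a lowering routine with two return paths. `patchElisionPath`
// selects whether the copy-elision path also applies the interrupt
// adjustment (false models the behavior the summary reports as broken).
int lowerMemArg(ToyFrameInfo &mfi, Path path, bool isInterrupt,
                int locOffset, int intrOffset, bool patchElisionPath) {
  if (path == Path::CopyElision) {
    int fi = mfi.createFixedObject(locOffset);
    if (isInterrupt && patchElisionPath)
      mfi.setObjectOffset(fi, intrOffset);
    return fi;
  }
  int fi = mfi.createFixedObject(locOffset);
  if (isInterrupt)
    mfi.setObjectOffset(fi, intrOffset);
  return fi;
}
```

Calling `lowerMemArg` with `Path::CopyElision` and `patchElisionPath == false` leaves the object at `locOffset`, while the plain path or the patched path yields `intrOffset`. In this model every new return path has to repeat the adjustment.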

Event Timeline

phil-opp created this revision. Jan 3 2019, 8:46 AM

Herald added a subscriber: JDevlieghere. Jan 3 2019, 8:46 AM

Test?

nikic added a subscriber: nikic. Jan 3 2019, 9:35 AM

Test?

I tried to add additional checks to the x86-64-intrcc.ll test, but the patched version of llc generates the exact same assembly as the unpatched version for this test (with and without -O0). I think this is because the test doesn't use the relevant code paths. I'm not sure whether it's practical to construct new tests for every code path and how to construct such tests.

In D56275#1345277, @phil-opp wrote:

Test?

I tried to add additional checks to the x86-64-intrcc.ll test, but the patched version of llc generates the exact same assembly as the unpatched version for this test (with and without -O0). I think this is because the test doesn't use the relevant code paths. I'm not sure whether it's practical to construct new tests for every code path and how to construct such tests.

Either something is 'broken', and this patch fixes things, or the patch is not needed.

How do you know that this patch helps/does the right thing?
There is probably some code that now is not broken?
Can you write an automated check for that?
If yes, you could try to creduce/bugpoint that code to the minimal snippet that still shows the miscompile(?) fix.

rnk added a subscriber: aaboud. Jan 3 2019, 10:09 AM

rnk added inline comments.

lib/Target/X86/X86ISelLowering.cpp
2956–2970	We should really be fixing this so that the CCValAssign VA object has the right offset as soon as AnalyzeArguments returns. Then all the generic code would work fine without updating the stack offset after we create the fixed stack object. It looks like @aaboud added it here.
2992	The existing test doesn't exercise this condition here. I think all you have to do to exercise it is to set up a function with this convention that copies this argument to an alloca, takes its address, and forwards it to another function.

In D56275#1345293, @lebedev.ri wrote:

In D56275#1345277, @phil-opp wrote:

Test?

I tried to add additional checks to the x86-64-intrcc.ll test, but the patched version of llc generates the exact same assembly as the unpatched version for this test (with and without -O0). I think this is because the test doesn't use the relevant code paths. I'm not sure whether it's practical to construct new tests for every code path and how to construct such tests.

Either something is 'broken', and this patch fixes things, or the patch is not needed.

How do you know that this patch helps/does the right thing?
There is probably some code that now is not broken?
Can you write an automated check for that?
If yes, you could try to creduce/bugpoint that code to the minimal snippet that still shows the miscompile(?) fix.

Yes, there is some broken Rust code that multiple people are able to reproduce: https://github.com/phil-opp/blog_os/issues/513. With this patch applied, the issue no longer occurs.

I spent my entire day trying to create a test case from the LLVM IR that the Rust code is translated to, but I could not reproduce the failing behavior in a test. I'm not sure why, maybe because of the different properties of the compilation target (e.g. disabled SSE). I'm sure that it would be relatively simple for someone with more LLVM experience, but I would need to invest much more time to familiarize myself with LLVM first, and I don't have the time to do so currently. If someone wants to take this from me or fix it in a different way (e.g. how @rnk suggested), that's fine with me, I just want this issue to be fixed.

That being said, this patch doesn't break any tests and only affects the x86 interrupt calling convention. Also, it does not introduce any new behavior, but just reverts the behavior change that was done in 4c3428b604324ed1528a78dbc31c8c8805d3c3e6. This behavior change was most likely accidental, because it was not mentioned in the commit and no new tests for the x86 interrupt calling convention were added either. So even though I understand the policy of requiring a test, it makes more sense to me to merge this without a test than to keep the wrong behavior until someone invests the needed time to construct a failing test.

@phil-opp, sorry for the breakage, I understand your frustration. From my perspective, the x86 interrupt calling convention was hacked into the x86 backend in a way that doesn't play nice with orthogonal features, like the copy elision I added. Your fix proliferates the wrong direction of the existing design of the interrupt convention. Yes, it fixes the problem, but if I don't push back, require a test, contact Intel, the original authors of this feature, and pressure them to do it in a more proper way, LLVM will continue to become more unmaintainable.

I would promise to take responsibility and fix this properly myself, but I am going on vacation next week, and I don't plan to bring a laptop. I hesitate to make promises for freely given open source work with a time horizon beyond a week, but if you ping this in a week and a half, I can try to make a test and proper fix then.

In any case, thanks for the report, I'm aware of the problem now.

@rnk Thanks for your thoughtful reply! Sorry for voicing my frustration, I just feared that the issue remains unfixed because no one wants to invest the time to fix it properly. But I understand your reasoning and I see how the current code is somewhat hacky.

Thanks a lot for offering your help! It would be great if this could be fixed in a clean way, so that it won't break on each new code path in the future.

Ping @rnk: Do you have some time to fix this?

In D56275#1359939, @phil-opp wrote:

Ping @rnk: Do you have some time to fix this?

Yep, I plan to pick it up this week.

Yep, I plan to pick it up this week.

Awesome, thanks a lot!

wxiao3 added a subscriber: wxiao3. Jan 17 2019, 6:11 PM

Hi,

I have narrowed the issue to a small test case as below and hope it can help you fix it:

%struct.interrupt_frame = type { i64, i64, i64, i64, i64 }

declare void @foo(%struct.interrupt_frame*, i64*)
define x86_intrcc void @test_fn_ecode(%struct.interrupt_frame* %frame, i64 %ecode) {
  %x.addr = alloca i64, align 4
  store i64 %ecode, i64* %x.addr, align 4
  call void @foo(%struct.interrupt_frame* %frame, i64* %x.addr)
  ret void
}

Wei

I ended up putting together this small cleanup around calling conventions first:
https://reviews.llvm.org/D56883

rnk mentioned this in D56944: [X86] Fix bug in x86_intrcc with arg copy elision. Jan 18 2019, 4:13 PM

Replaced by D56944

rnk mentioned this in rL354837: [X86] Fix bug in x86_intrcc with arg copy elision. Feb 25 2019, 6:11 PM

rnk mentioned this in rG2f055f026ad7: [X86] Fix bug in x86_intrcc with arg copy elision.

Revision Contents

Path	Size
lib/Target/X86/X86ISelLowering.cpp	8 lines

Diff 180064

lib/Target/X86/X86ISelLowering.cpp


   bool ExtendedInMem =
       VA.isExtInLoc() && VA.getValVT().getScalarType() == MVT::i1 &&
       VA.getValVT().getSizeInBits() != VA.getLocVT().getSizeInBits();

   if (VA.getLocInfo() == CCValAssign::Indirect || ExtendedInMem)
     ValVT = VA.getLocVT();
   else
     ValVT = VA.getValVT();

   // Calculate SP offset of interrupt parameter, re-arrange the slot normally
   // taken by a return address.
   int Offset = 0;
   if (CallConv == CallingConv::X86_INTR) {
     // X86 interrupts may take one or two arguments.
     // On the stack there will be no return address as in regular call.
     // Offset of last argument need to be set to -4/-8 bytes.
     // Where offset of the first argument out of two, should be set to 0 bytes.
     Offset = (Subtarget.is64Bit() ? 8 : 4) * ((i + 1) % Ins.size() - 1);
     if (Subtarget.is64Bit() && Ins.size() == 2) {
       // The stack pointer needs to be realigned for 64 bit handlers with error
       // code, so the argument offset changes by 8 bytes.
       Offset += 8;
     }
   }

   // FIXME: For now, all byval parameter objects are marked mutable. This can be
   // changed with more analysis.
   // In case of tail call optimization mark all arguments mutable. Since they
   // could be overwritten by lowering of arguments in case of a tail call.
   if (Flags.isByVal()) {
     unsigned Bytes = Flags.getByValSize();
     if (Bytes == 0) Bytes = 1; // Don't create zero-sized stack objects.

     // FIXME: For now, all byval parameter objects are marked as aliasing. This
     // can be improved with deeper analysis.
     int FI = MFI.CreateFixedObject(Bytes, VA.getLocMemOffset(), isImmutable,
                                    /*isAliased=*/true);
     // Adjust SP offset of interrupt parameter.
     if (CallConv == CallingConv::X86_INTR) {
       MFI.setObjectOffset(FI, Offset);
     }
     return DAG.getFrameIndex(FI, PtrVT);
   }

   // This is an argument in memory. We might be able to perform copy elision.
   if (Flags.isCopyElisionCandidate()) {
     EVT ArgVT = Ins[i].ArgVT;
     SDValue PartAddr;
     if (Ins[i].PartOffset == 0) {
       // If this is a one-part value or the first part of a multi-part value,
       // create a stack object for the entire argument value type and return a
       // load from our portion of it. This assumes that if the first part of an
       // argument is in memory, the rest will also be in memory.
       int FI = MFI.CreateFixedObject(ArgVT.getStoreSize(), VA.getLocMemOffset(),
                                      /*Immutable=*/false);
+      // Adjust SP offset of interrupt parameter.
+      if (CallConv == CallingConv::X86_INTR) {
+        MFI.setObjectOffset(FI, Offset);
+      }
       PartAddr = DAG.getFrameIndex(FI, PtrVT);
       return DAG.getLoad(
           ValVT, dl, Chain, PartAddr,
           MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI));
     } else {
       // This is not the first piece of an argument in memory. See if there is
       // already a fixed stack object including this offset. If so, assume it
       // was created by the PartOffset == 0 branch above and create a load from
       // the appropriate offset into it.
       int64_t PartBegin = VA.getLocMemOffset();
       int64_t PartEnd = PartBegin + ValVT.getSizeInBits() / 8;
       int FI = MFI.getObjectIndexBegin();
       for (; MFI.isFixedObjectIndex(FI); ++FI) {
         int64_t ObjBegin = MFI.getObjectOffset(FI);
         int64_t ObjEnd = ObjBegin + MFI.getObjectSize(FI);
         if (ObjBegin <= PartBegin && PartEnd <= ObjEnd)
           break;
       }
       if (MFI.isFixedObjectIndex(FI)) {
         SDValue Addr =
             DAG.getNode(ISD::ADD, dl, PtrVT, DAG.getFrameIndex(FI, PtrVT),
                         DAG.getIntPtrConstant(Ins[i].PartOffset, dl));
+        // Adjust SP offset of interrupt parameter.
+        if (CallConv == CallingConv::X86_INTR) {
+          MFI.setObjectOffset(FI, Offset);
+        }
         return DAG.getLoad(
             ValVT, dl, Chain, Addr,
             MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI,
                                               Ins[i].PartOffset));
       }
     }
   }

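As a worked example of the `Offset` arithmetic in the diff, the computation can be pulled out into a standalone function. This is a sketch under assumptions: `interruptArgOffset` is a hypothetical helper, not part of LLVM, and it uses signed `int` operands throughout so that the `- 1` term can go negative without relying on unsigned wraparound.

```cpp
#include <cassert>

// Hypothetical helper mirroring the Offset computation for x86 interrupt
// handler arguments. argIndex is the zero-based parameter index and
// numArgs is the total parameter count (one or two).
int interruptArgOffset(bool is64Bit, int argIndex, int numArgs) {
  assert(numArgs == 1 || numArgs == 2);
  assert(argIndex >= 0 && argIndex < numArgs);
  const int slotSize = is64Bit ? 8 : 4;
  // The last argument gets -slotSize; the first of two arguments gets 0.
  int offset = slotSize * ((argIndex + 1) % numArgs - 1);
  if (is64Bit && numArgs == 2) {
    // 64-bit handlers with an error code: the offset shifts by 8 bytes.
    offset += 8;
  }
  return offset;
}
```

Under this model, a 64-bit handler with only a frame argument gets offset -8. With an error code, the frame argument is at 8 and the error code at 0. On 32-bit targets the corresponding values are -4 for a single argument, and 0 and -4 for two arguments.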