Details

Reviewers

Commits

rG198df5f68204: Weaken MFI Max Call Frame Size Assertion

Summary

A year ago, when I was not yet invested in compilers at all, I hit an assertion failure when building an AArch64 debug build with LTO + CFI, among other combinations.

It was posted as a GitHub issue here: https://github.com/llvm/llvm-project/issues/54088

I took it upon myself to revisit the issue now that I have spent some more time working on LLVM.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

oskarwirga created this revision. May 23 2023, 6:27 PM

Herald added a project: Restricted Project. May 23 2023, 6:27 PM

Herald added subscribers: hiraditya, kristof.beyls.

oskarwirga requested review of this revision. May 23 2023, 6:27 PM

Herald added a project: Restricted Project. May 23 2023, 6:27 PM

Herald added a subscriber: llvm-commits.

barannikov88 added a subscriber: barannikov88. May 23 2023, 6:48 PM

barannikov88 added inline comments.

llvm/test/CodeGen/AArch64/compute-call-frame-size-unreachable-pass.ll
2	The test without -mtriple will run for the default target (LLVM_DEFAULT_TARGET_TRIPLE). Does it trigger on any backend? If so, can it be moved to llvm/test/CodeGen/Generic?

barannikov88 added inline comments. May 23 2023, 6:58 PM

llvm/test/CodeGen/AArch64/compute-call-frame-size-unreachable-pass.ll
2	Ah, never mind, it contains the triple string. It is a bit unusual though; llc tests usually have neither a triple nor a datalayout.

Harbormaster completed remote builds in B234047: Diff 524966. May 23 2023, 7:08 PM

oskarwirga added inline comments. May 23 2023, 7:37 PM

llvm/test/CodeGen/AArch64/compute-call-frame-size-unreachable-pass.ll
2	I don't know what is usual when it comes to this stuff, happy to change things to make it proper :)

I'm not sure this is the correct fix. The assertion specifically checks that MaxCallFrameSize and AdjustsStack computed here are the same as computed by MachineFrameInfo::computeMaxCallFrameSize.
Have you been able to figure out why the results of the calculations are different?

llvm/test/CodeGen/AArch64/compute-call-frame-size-unreachable-pass.ll
2	llc tests usually pass an -mtriple argument instead of specifying 'target triple' in the source, and usually don't have 'target datalayout' (it is inferred from the triple). This lets you write multiple RUN lines with different -mtriple values, which makes tests more flexible.

In D151276#4366712, @barannikov88 wrote:

I'm not sure this is the correct fix. The assertion specifically checks that MaxCallFrameSize and AdjustsStack computed here are the same as computed by MachineFrameInfo::computeMaxCallFrameSize.
Have you been able to figure out why the results of the calculations are different?

Yes, bcl5980 commented on the github issue:

AArch64TargetLowering::finalizeLowering calls MFI.computeMaxCallFrameSize(MF) to get MaxCallFrameSize.
MaxCallFrameSize is 8 because of an ADJCALLSTACKDOWN/ADJCALLSTACKUP pair in bb.2.
Then the unreachable-mbb-elimination pass removes bb.2,
and the assert is triggered in the prologepilog pass.
Not sure how to fix the assert

I was able to fix this assertion by appending the following to UnreachableMachineBlockElim::runOnMachineFunction:

MachineFrameInfo &MFI = F.getFrameInfo();
MFI.computeMaxCallFrameSize(F);

However, @MatzeB advised weakening the assertion instead, reasoning that it was mostly meant to catch cases where the value grew later, rather than shrank.
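
To make the sequence described in this comment concrete, here is a minimal self-contained sketch in plain C++. It does not use any LLVM headers: `ToyBlock`, `toyMaxCallFrameSize`, `toyRemoveUnreachable`, `strictCheck`, `weakenedCheck`, and `demoUnreachableBlockShrinksMax` are hypothetical names invented for illustration. The toy only models the idea that a cached maximum becomes an overestimate once the block that produced it is deleted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a machine basic block (not an LLVM type): whether it is
// reachable, plus the sizes of the call frames set up inside it.
struct ToyBlock {
  bool Reachable;
  std::vector<uint64_t> CallFrameSizes;
};

// Models the idea behind computeMaxCallFrameSize: the largest call frame
// found in any block that is still part of the function.
uint64_t toyMaxCallFrameSize(const std::vector<ToyBlock> &Blocks) {
  uint64_t Max = 0;
  for (const ToyBlock &BB : Blocks)
    for (uint64_t Size : BB.CallFrameSizes)
      Max = std::max(Max, Size);
  return Max;
}

// Models the effect of unreachable block elimination on the toy function.
void toyRemoveUnreachable(std::vector<ToyBlock> &Blocks) {
  Blocks.erase(std::remove_if(Blocks.begin(), Blocks.end(),
                              [](const ToyBlock &BB) { return !BB.Reachable; }),
               Blocks.end());
}

// Original form of the check: the cached value must match exactly.
bool strictCheck(uint64_t Cached, uint64_t Recomputed) {
  return Cached == Recomputed;
}

// Weakened form: the cached value may overestimate, but must not be too small.
bool weakenedCheck(uint64_t Cached, uint64_t Recomputed) {
  return Cached >= Recomputed;
}

// Replays the scenario: the only call frame (8 bytes) lives in a block that
// is later removed as unreachable.
void demoUnreachableBlockShrinksMax() {
  std::vector<ToyBlock> Blocks = {{true, {}}, {true, {}}, {false, {8}}};
  uint64_t Cached = toyMaxCallFrameSize(Blocks);     // computed early
  toyRemoveUnreachable(Blocks);                      // block disappears
  uint64_t Recomputed = toyMaxCallFrameSize(Blocks); // computed late
  assert(!strictCheck(Cached, Recomputed));  // equality no longer holds
  assert(weakenedCheck(Cached, Recomputed)); // overestimate is tolerated
}
```

In this toy, the cached value is 8 and the recomputed value is 0, so the `==` form fails while the `>=` form holds. Recomputing the cached value after the block is removed, as in the two-line fix above, would also make the `==` form pass again.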

In this case, AdjustsStack on line 352 should be initialized to false, and the MFI.adjustsStack() == AdjustsStack check should also be removed, which makes this assertion kind of useless.
If @MatzeB is happy with this, I am too.

Fix the test to be more normal

oskarwirga marked 3 inline comments as done. May 24 2023, 6:37 PM

Harbormaster completed remote builds in B234366: Diff 525395. May 24 2023, 7:53 PM

Maybe we can handle this in a similar manner. AdjustsStack==true should be the more conservative answer and we should be fine going from an initial true to false after optimizations. So we can weaken this check as well to only assert that we are not going from false to true...
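
The one-directional check suggested here can be written as a small predicate. The following is a sketch in plain C++ with no LLVM dependency; `strictAdjustsStackCheck`, `weakenedAdjustsStackCheck`, and `countRejectedByWeakenedCheck` are hypothetical helper names, with `Cached` standing in for the value of `MFI.adjustsStack()` before recomputation and `Recomputed` standing in for the local `AdjustsStack`.

```cpp
#include <cassert>
#include <initializer_list>

// Original form: the cached flag and the recomputed flag must agree.
bool strictAdjustsStackCheck(bool Cached, bool Recomputed) {
  return Cached == Recomputed;
}

// Weakened form: only the transition false -> true is rejected. Going from
// true to false is accepted because true is the conservative answer.
bool weakenedAdjustsStackCheck(bool Cached, bool Recomputed) {
  return !(Recomputed && !Cached);
}

// Walks all four combinations and counts how many the weakened check rejects.
int countRejectedByWeakenedCheck() {
  int Rejected = 0;
  for (bool Cached : {false, true})
    for (bool Recomputed : {false, true})
      if (!weakenedAdjustsStackCheck(Cached, Recomputed))
        ++Rejected;
  return Rejected;
}
```

Of the four combinations, the weakened predicate rejects only `Cached == false, Recomputed == true`, whereas the strict form also rejects the true-to-false case.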

MatzeB added inline comments. May 31 2023, 11:24 AM

llvm/test/CodeGen/AArch64/compute-call-frame-size-unreachable-pass.ll
2	A simple `-mtriple aarch64--` may work too.
11–27	Can you simplify this test further (in the sense of fewer lines in the .ll file) by dropping the `!type` annotations and the data for them, the `#0` function attributes, `internal` and `fastcc`, and the `!nosanitize` metadata?

Make sure we don't go from false to true.

barannikov88 added inline comments. May 31 2023, 12:04 PM

llvm/lib/CodeGen/PrologEpilogInserter.cpp
352	This should be initialized to false. There is no difference in practice, but it will be clearer that the flag is recalculated.

oskarwirga updated this revision to Diff 527157. May 31 2023, 12:18 PM

oskarwirga marked an inline comment as done.

Make the target generic aarch64

oskarwirga marked an inline comment as done. May 31 2023, 12:48 PM

Harbormaster completed remote builds in B235651: Diff 527171. May 31 2023, 2:22 PM

oskarwirga marked an inline comment as not done. May 31 2023, 8:24 PM

oskarwirga added inline comments.

llvm/lib/CodeGen/PrologEpilogInserter.cpp
352	Interestingly setting this to false causes `CodeGen/X86/asan-check-memaccess-add.ll` to crash :D

oskarwirga marked an inline comment as not done. May 31 2023, 8:25 PM

Is there any reason why this would emit fewer instructions for just asan-check-memaccess-add.ll?

Herald added a subscriber: pengfei. May 31 2023, 8:34 PM

Harbormaster completed remote builds in B235729: Diff 527269. May 31 2023, 9:49 PM

barannikov88 added inline comments. Jun 1 2023, 12:44 AM

llvm/lib/CodeGen/PrologEpilogInserter.cpp
352	Probably it changed from true to false, which could never happen before. I didn't expect a change in behavior, so feel free to revert this part
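
The guess in this comment can be illustrated with a toy model of the recomputation loop. This is only a sketch of the hypothesis, in plain C++ with no LLVM dependency; it does not confirm what actually caused the crash in `CodeGen/X86/asan-check-memaccess-add.ll`. `ToyInstr`, `recomputeAdjustsStack`, and `demoInitialValueMatters` are hypothetical names. The point is that the loop only ever sets the flag to true, so the initial value decides whether a true-to-false change is possible.

```cpp
#include <cassert>
#include <vector>

// Toy instruction kinds (not LLVM types): frame setup/destroy pseudos and
// stack-aligning inline asm are the two cases that set the flag.
enum class ToyInstr { Other, FrameSetup, AlignStackAsm };

// Models the recomputation loop: the flag starts at `Initial` and can only
// be raised to true, never lowered.
bool recomputeAdjustsStack(bool Initial, const std::vector<ToyInstr> &Instrs) {
  bool AdjustsStack = Initial;
  for (ToyInstr I : Instrs)
    if (I == ToyInstr::FrameSetup || I == ToyInstr::AlignStackAsm)
      AdjustsStack = true;
  return AdjustsStack;
}

// A function whose cached flag is true but which no longer contains any
// instruction that would set it.
void demoInitialValueMatters() {
  const bool Cached = true;
  std::vector<ToyInstr> Instrs = {ToyInstr::Other, ToyInstr::Other};
  // Seeding from the cached value keeps the flag sticky.
  assert(recomputeAdjustsStack(Cached, Instrs) == true);
  // Seeding from false lets the flag drop from true to false.
  assert(recomputeAdjustsStack(false, Instrs) == false);
}
```

Under this model, seeding the local from the cached value means the result is "cached OR found", while seeding it from false means the result is just "found", which is the only way the stored flag could change from true to false.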

Let's go back to what makes the fewest changes

Harbormaster completed remote builds in B235883: Diff 527468. Jun 1 2023, 11:05 AM

Let's see how this goes. LGTM

This revision is now accepted and ready to land. Jul 5 2023, 1:30 PM

Closed by commit rG198df5f68204: Weaken MFI Max Call Frame Size Assertion (authored by oskarwirga, committed by MatzeB). Jul 5 2023, 2:03 PM

This revision was automatically updated to reflect the committed changes.

MatzeB added a commit: rG198df5f68204: Weaken MFI Max Call Frame Size Assertion.

Diff 537487

llvm/lib/CodeGen/PrologEpilogInserter.cpp

(343 lines not shown)
 /// variables for the function's frame information and eliminate call frame
 /// pseudo instructions.
 void PEI::calculateCallFrameInfo(MachineFunction &MF) {
   const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
   const TargetFrameLowering *TFI = MF.getSubtarget().getFrameLowering();
   MachineFrameInfo &MFI = MF.getFrameInfo();

   unsigned MaxCallFrameSize = 0;
   bool AdjustsStack = MFI.adjustsStack();

   // Get the function call frame set-up and tear-down instruction opcode
   unsigned FrameSetupOpcode = TII.getCallFrameSetupOpcode();
   unsigned FrameDestroyOpcode = TII.getCallFrameDestroyOpcode();

   // Early exit for targets which have no call frame setup/destroy pseudo
   // instructions.
   if (FrameSetupOpcode == ~0u && FrameDestroyOpcode == ~0u)
(10 lines not shown; enclosing context: for (MachineBasicBlock::iterator I = BB.begin(); I != BB.end(); ++I))
       } else if (I->isInlineAsm()) {
         // Some inline asm's need a stack frame, as indicated by operand 1.
         unsigned ExtraInfo = I->getOperand(InlineAsm::MIOp_ExtraInfo).getImm();
         if (ExtraInfo & InlineAsm::Extra_IsAlignStack)
           AdjustsStack = true;
       }

   assert(!MFI.isMaxCallFrameSizeComputed() ||
-         (MFI.getMaxCallFrameSize() == MaxCallFrameSize &&
-          MFI.adjustsStack() == AdjustsStack));
+         (MFI.getMaxCallFrameSize() >= MaxCallFrameSize &&
+          !(AdjustsStack && !MFI.adjustsStack())));
   MFI.setAdjustsStack(AdjustsStack);
   MFI.setMaxCallFrameSize(MaxCallFrameSize);

   for (MachineBasicBlock::iterator I : FrameSDOps) {
     // If call frames are not being included as part of the stack frame, and
     // the target doesn't indicate otherwise, remove the call frame pseudos
     // here. The sub/add sp instruction pairs are still inserted, but we don't
     // need to track the SP adjustment for frame index elimination.
(1,211 lines not shown)

llvm/test/CodeGen/AArch64/compute-call-frame-size-unreachable-pass.ll

This file was added.

				; RUN: llc < %s -mtriple aarch64--

				; This tests that the MFI assert in unreachableblockelim pass
				; does not trigger

				%struct.ngtcp2_crypto_aead = type { i8*, i64 }
				%struct.ngtcp2_crypto_aead_ctx = type { i8* }

				; Function Attrs: noinline optnone
				define internal fastcc void @decrypt_pkt() unnamed_addr #0 !type !0 !type !1 {
				entry:
				br i1 false, label %cont, label %trap, !nosanitize !2

				trap: ; preds = %entry
				unreachable, !nosanitize !2

				cont: ; preds = %entry
				%call = call i32 undef(i8* undef, %struct.ngtcp2_crypto_aead* undef, %struct.ngtcp2_crypto_aead_ctx* undef, i8* undef, i64 undef, i8* undef, i64 undef, i8* undef, i64 undef)
				ret void
				}

				attributes #0 = { noinline optnone }

				!0 = !{i64 0, !"_ZTSFlPhPK18ngtcp2_crypto_aeadPKhmS4_mlP16ngtcp2_crypto_kmPFiS_S2_PK22ngtcp2_crypto_aead_ctxS4_mS4_mS4_mEE"}
				!1 = !{i64 0, !"_ZTSFlPvPKvS1_mS1_mlS_S_E.generalized"}
				!2 = !{}

This is an archive of the discontinued LLVM Phabricator instance.

Weaken MFI Max Call Frame Size Assertion
ClosedPublic
