Fixes PR23804: assertion failure in emitPrologue in the case of a function with an empty frame and a dynamic alloca that needs stack realignment.
This is a typical case for AddressSanitizer.
Details
- Reviewers
kristof.beyls eugenis
Diff Detail
Event Timeline
I think there's more going on here that needs fixing too. For example if I add a simple "%tmp = alloca i8, i32 4" to your entry block I get the same assertion failure David mentioned in the PR (even with your patch).
To me it looks like NeedsRealignment is misdefined, though that might not be the end of it. Elsewhere we used RegInfo->needsStackRealignment(MF), which looks like it would be an improvement, but doesn't actually fix the issue.
Another possible problem in this area is that if NumBytes != 0, we end up realigning the stack twice: once in the prologue and once at the alloca. That's arguably a separate performance problem though.
Cheers.
Tim.
Good point, thanks. This particular case is easy to fix by reusing the base register for scratchSPReg. This matches register reservation code in processFunctionBeforeCalleeSavedScan.
RegInfo->needsStackRealignment is also true in this case, and it has roughly the same meaning as getMaxAlignment, i.e. it does not take alloca scopes into account. This problem (extra re-alignment) is also present on i686 and x86_64 targets.
I think the logic of this patch is roughly right, I only have a few minor comments.
After addressing those comments, this LGTM.
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
352 | This should become const bool NeedsRealignment = RegInfo->needsStackRealignment(MF); That's in line with what other backends do and respects the DRY principle more. | |
356–358 | I think slightly nicer/better logic is if (RegInfo->hasBasePointer(MF)) scratchSPReg = RegInfo->getBaseRegister(); else scratchSPReg = AArch64::X9; assert(MF.getRegInfo().isPhysRegUsed(scratchSPReg) && "No scratch register to align SP!"); With the above code, a few small tweaks to the tests in test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll are needed (to use the base pointer X19 instead of X9 for a few tests). When both X9 and X19 are available as scratch registers it's an arbitrary decision which one to use. | |
test/CodeGen/AArch64/aarch64-dynamic-stack-layout-pr23804.ll | ||
1 ↗ | (On Diff #28822) | I don't think there is a need to create a separate test file just for pr23804, these test cases can be added to test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
356–358 | Can't we just nick any temporary register we please (I think the original comment miscategorises X9) and make sure it's used? scratchSPReg = AArch64::X9; MF.getRegInfo().setPhysRegUsed(scratchSPReg); // Take that! |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
356–358 | The caller in PrologEpilogInserter.cpp does TFI->processFunctionBeforeCalleeSavedScan(Fn, RS); // Scan the function for modified callee saved registers and insert spill code // for any callee saved registers that are modified. calculateCalleeSavedRegisters(Fn); ... if (!F->hasFnAttribute(Attribute::Naked)) insertPrologEpilogCode(Fn); So it looks like it is too late to add used regisgters at this point (in emitPrologue). |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
354–358 | But X9 isn't callee-saved (AAPCS says it's temporary), so I don't think we need to worry about stack layouts: it's basically dead at this point. Unless there's some assertion that *everything* is fixed by now? |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
354–358 | Well, all tests pass. |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
354–358 | It seems like the simplest variant so far. I'd say so, unless Kristof objects. |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
354–358 | I can't think of a reason not to do the simpler variant (always using X9 as the scratch register) - so no objections from my side. |
I've only got two very minor suggestions for improvements.
With those improvements, LGTM!
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
354–358 | I'd keep the comment // Use the first callee-saved register as a scratch register otherwise looks ok. | |
test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll | ||
525–526 | This comment isn't correct anymore in the newest version of the patch. I would just remove the comment. |
This should become
That's in line with what other backends do and respects the DRY principle more.