This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present.
Details
Diff Detail
Event Timeline
lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
2750 | This could use a brief comment to explain why stack realignment matters. |
Why isn't SP used in this case? I'd expect that the frame pointer addresses parameters, followed by the alignment gap, followed by the locals area, which is addressable with fixed offsets from SP.
If there's a call to localaddress in a function without funclets or VLAs, we should use sp, yes. That should be rare in practice, but I guess the testcase is an example.
So I guess the logic should be something like this:
if (!hasVarSizedObjects && !hasFunclets) --> use SP else if (needsStackRealignment) --> use BP else --> use FP
Is that not what the existing code does? RegInfo->getFrameRegister(MF) does this to choose between SP and FP:
unsigned AArch64RegisterInfo::getFrameRegister(const MachineFunction &MF) const { const AArch64FrameLowering *TFI = getFrameLowering(MF); return TFI->hasFP(MF) ? AArch64::FP : AArch64::SP; }
And hasBasePointer checks the same conditions you've listed here.
...
I see, hasFP returns true when there are calls and the stack frame is large, and also in this case:
// Win64 SEH requires frame pointer if funclets are present. if (MF.hasLocalEscape()) return true;
Are you sure we still want that there? It sounds like we are trying to allow addressing variables with SP when localescape is present.
hasBasePointer checks the same conditions
We sometimes emit a base pointer or a frame pointer when it isn't strictly necessary. We don't want to change the result of llvm.localaddress() in that case.
The MF.hasLocalEscape() check in hasFP() probably isn't necessary.
In the existing code, there are several conditions on when FP/BP/SP should be used. I have tried to summarize them here:
use BP: if ((hasVarSizedObjects || hasEHFunclets)) && (needsStackRealignment || LocalFrameSize >= 256)) use FP: if (hasEHFunclets || hasVarSizedObjects || needsStackRealignment || hasLocalEscape || hasCalls || isFrameAddressTaken || hasStackMap || hasPatchPoint || !MaxCallFrameSizeComputed || MaxCallFrameSize > DefaultSafeSPDisplacement) else use SP
Here's my understanding of how the locals should be accessed in various scenarios:
struct S { int x; }; // Use FP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = false, needsStackRealignment = false) void simple() { struct S o; __try { o.x; } __finally { o.x; } } // Use BP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = false, needsStackRealignment = true) void stack_realignment() { struct S __declspec(align(32)) o; __try { o.x; } __finally { o.x; } } // Use BP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = true, needsStackRealignment = false) void vla_present(int n) { int vla[n]; __try { vla[0]; } __finally { vla[0]; } } // Use BP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = true, needsStackRealignment = true) void all(int n) { struct S __declspec(align(32)) o; int vla[n]; __try { o.x; vla[0]; } __finally { o.x; vla[0]; } } // Use SP to access locals: (hasFunclets = false, hasVarSizedObjects = false, needsStackRealignment = false) void non_seh() { // call to llvm.localaddress(); }
@rnk @efriedma Could you please comment on if all the scenarios have been captured here and if the behavior is what is expected?
use FP:
if (hasEHFunclets || hasVarSizedObjects || needsStackRealignment || hasLocalEscape || hasCalls || isFrameAddressTaken || hasStackMap || hasPatchPoint || !MaxCallFrameSizeComputed || MaxCallFrameSize > DefaultSafeSPDisplacement)
Should FP be used if needsStackRealignment is true and none of the other conditions are true? I would've expected that SP needs to be used, because FP points to the parameter space before the stack was realigned.
I suppose that it is correct to force a frame to use FP when hasLocalEscape is true, assuming we've already checked the conditions under which BP is needed.
// Use BP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = true, needsStackRealignment = false) void vla_present(int n) {
Should still use fp here. (Having both VLAs and funclets isn't really any different from having only one of them.)
Deleted test/CodeGen/AArch64/seh-localescape.ll as localescape testing is better covered under the updated test/CodeGen/AArch64/seh-finally.ll.
With this patch, most of the SEH tests in https://github.com/Microsoft/windows_seh_tests/blob/master/src/xcpt4/xcpt4u.c pass.
Great to have this. Thanks @mgrang. Curious on what remaining tests fail in xcpt4u.c?
LGTM. I think this is the correct model for representing frame offsets for localescape/localrecover.
(The code to handle eh_recoverfp correctly is still missing, but I think it's okay to handle that in a separate patch.)
Please give Reid a few days to comment before you merge, though.
I see, we need a separate register selection codepath that ignores frame size considerations. (I could be wrong, I haven't dug that deep, though.)
I think the functionality is good, but I keep insisting that we name these routines after the localescape / localrecover intrinsic set, since they are ostensibly for lambdas as well as SEH. :) Feel free to use your judgement, don't want for me to review it.
include/llvm/CodeGen/TargetFrameLowering.h | ||
---|---|---|
267 ↗ | (On Diff #184403) | Let's call this getNonLocalFrameIndexReference. In theory, this should have nothing to do with EH. llvm.localescape is a separate LLVM IR feature that happens to support the frontend-outlined try / except / __finally funclets. |
lib/Target/AArch64/AArch64RegisterInfo.h | ||
125 ↗ | (On Diff #184403) | Similarly, perhaps this should be getLocalAddressRegister so it's associated with the intrinsics. |
@TomTan Tests with features which have not yet been implemented (like __leave and asynchronous SEH) are failing.
This could use a brief comment to explain why stack realignment matters.