This is an archive of the discontinued LLVM Phabricator instance.

[COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects
ClosedPublic

Authored by mgrang on Jan 24 2019, 1:21 PM.

Details

Summary

This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present.

Diff Detail

Repository
rL LLVM

Event Timeline

mgrang created this revision.Jan 24 2019, 1:21 PM
efriedma added inline comments.Jan 24 2019, 1:42 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
2750 ↗(On Diff #183381)

This could use a brief comment to explain why stack realignment matters.

rnk added a comment.Jan 24 2019, 2:04 PM

Why isn't SP used in this case? I'd expect that the frame pointer addresses parameters, followed by the alignment gap, followed by the locals area, which is addressable with fixed offsets from SP.

If there's a call to localaddress in a function without funclets or VLAs, we should use sp, yes. That should be rare in practice, but I guess the testcase is an example.

If there's a call to localaddress in a function without funclets or VLAs, we should use sp, yes. That should be rare in practice, but I guess the testcase is an example.

So I guess the logic should be something like this:

if (!hasVarSizedObjects && !hasFunclets) --> use SP
else if (needsStackRealignment) --> use BP
else --> use FP

Yes, that looks right.

rnk added a comment.Jan 24 2019, 5:13 PM

If there's a call to localaddress in a function without funclets or VLAs, we should use sp, yes. That should be rare in practice, but I guess the testcase is an example.

So I guess the logic should be something like this:

if (!hasVarSizedObjects && !hasFunclets) --> use SP
else if (needsStackRealignment) --> use BP
else --> use FP

Is that not what the existing code does? RegInfo->getFrameRegister(MF) does this to choose between SP and FP:

unsigned
AArch64RegisterInfo::getFrameRegister(const MachineFunction &MF) const {
  const AArch64FrameLowering *TFI = getFrameLowering(MF);
  return TFI->hasFP(MF) ? AArch64::FP : AArch64::SP;
}

And hasBasePointer checks the same conditions you've listed here.

...

I see, hasFP returns true when there are calls and the stack frame is large, and also in this case:

// Win64 SEH requires frame pointer if funclets are present.
if (MF.hasLocalEscape())
  return true;

Are you sure we still want that there? It sounds like we are trying to allow addressing variables with SP when localescape is present.

hasBasePointer checks the same conditions

We sometimes emit a base pointer or a frame pointer when it isn't strictly necessary. We don't want to change the result of llvm.localaddress() in that case.


The MF.hasLocalEscape() check in hasFP() probably isn't necessary.

mgrang added a comment.EditedJan 25 2019, 2:19 PM

In the existing code, there are several conditions on when FP/BP/SP should be used. I have tried to summarize them here:

use BP:
  if ((hasVarSizedObjects || hasEHFunclets)) && (needsStackRealignment || LocalFrameSize >= 256))

use FP:	
  if (hasEHFunclets || hasVarSizedObjects || needsStackRealignment || hasLocalEscape ||
      hasCalls || isFrameAddressTaken || hasStackMap || hasPatchPoint || !MaxCallFrameSizeComputed ||
      MaxCallFrameSize > DefaultSafeSPDisplacement)

else use SP

Here's my understanding of how the locals should be accessed in various scenarios:

struct S { int x; };

// Use FP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = false, needsStackRealignment = false)
void simple() {
  struct S o;

  __try { o.x; }
  __finally { o.x; }
}

// Use BP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = false, needsStackRealignment = true)
void stack_realignment() {
  struct S __declspec(align(32)) o;

  __try { o.x; }
  __finally { o.x; }
}

// Use BP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = true, needsStackRealignment = false)
void vla_present(int n) {
  int vla[n];

  __try { vla[0]; }
  __finally { vla[0]; }
}

// Use BP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = true, needsStackRealignment = true)
void all(int n) {
  struct S __declspec(align(32)) o;
  int vla[n];
  
  __try { o.x; vla[0]; }
  __finally { o.x; vla[0]; }
}

// Use SP to access locals: (hasFunclets = false, hasVarSizedObjects = false, needsStackRealignment = false)
void non_seh() {
  // call to llvm.localaddress();
}

@rnk @efriedma Could you please comment on if all the scenarios have been captured here and if the behavior is what is expected?

rnk added a comment.Jan 25 2019, 4:48 PM

use FP:

if (hasEHFunclets || hasVarSizedObjects || needsStackRealignment || hasLocalEscape ||
    hasCalls || isFrameAddressTaken || hasStackMap || hasPatchPoint || !MaxCallFrameSizeComputed ||
    MaxCallFrameSize > DefaultSafeSPDisplacement)

Should FP be used if needsStackRealignment is true and none of the other conditions are true? I would've expected that SP needs to be used, because FP points to the parameter space before the stack was realigned.

I suppose that it is correct to force a frame to use FP when hasLocalEscape is true, assuming we've already checked the conditions under which BP is needed.

// Use BP to access escaped locals: (hasFunclets = true, hasVarSizedObjects = true, needsStackRealignment = false)
void vla_present(int n) {

Should still use fp here. (Having both VLAs and funclets isn't really any different from having only one of them.)

mgrang updated this revision to Diff 184403.Jan 30 2019, 4:31 PM
mgrang edited the summary of this revision. (Show Details)

Deleted test/CodeGen/AArch64/seh-localescape.ll as localescape testing is better covered under the updated test/CodeGen/AArch64/seh-finally.ll.

mgrang retitled this revision from [COFF, ARM64] Fix localaddress to handle stack realignment to [COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects.Jan 30 2019, 4:33 PM

Great to have this. Thanks @mgrang. Curious on what remaining tests fail in xcpt4u.c?

efriedma accepted this revision.Jan 30 2019, 5:50 PM

LGTM. I think this is the correct model for representing frame offsets for localescape/localrecover.

(The code to handle eh_recoverfp correctly is still missing, but I think it's okay to handle that in a separate patch.)

Please give Reid a few days to comment before you merge, though.

This revision is now accepted and ready to land.Jan 30 2019, 5:50 PM
rnk accepted this revision.Jan 31 2019, 6:15 PM

I see, we need a separate register selection codepath that ignores frame size considerations. (I could be wrong, I haven't dug that deep, though.)

I think the functionality is good, but I keep insisting that we name these routines after the localescape / localrecover intrinsic set, since they are ostensibly for lambdas as well as SEH. :) Feel free to use your judgement, don't want for me to review it.

include/llvm/CodeGen/TargetFrameLowering.h
267 ↗(On Diff #184403)

Let's call this getNonLocalFrameIndexReference. In theory, this should have nothing to do with EH. llvm.localescape is a separate LLVM IR feature that happens to support the frontend-outlined try / except / __finally funclets.

lib/Target/AArch64/AArch64RegisterInfo.h
125 ↗(On Diff #184403)

Similarly, perhaps this should be getLocalAddressRegister so it's associated with the intrinsics.

mgrang updated this revision to Diff 184818.Feb 1 2019, 12:39 PM
mgrang marked 2 inline comments as done.
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2019, 1:41 PM
mgrang added a comment.Feb 1 2019, 1:43 PM

Great to have this. Thanks @mgrang. Curious on what remaining tests fail in xcpt4u.c?

@TomTan Tests with features which have not yet been implemented (like __leave and asynchronous SEH) are failing.

llvm/trunk/test/CodeGen/AArch64/seh-finally.ll