Page MenuHomePhabricator

[COFF, ARM64] Implement support for SEH extensions __try/__except/__finally
ClosedPublic

Authored by mgrang on Oct 22 2018, 5:33 PM.

Details

Summary

This patch supports MS SEH extensions try/except/__finally. The intrinsics localescape and localrecover are responsible for communicating escaped static allocas from the try block to the handler.

We need to preserve frame pointers for SEH. So we create a new function/property HasLocalEscape.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
mgrang edited the summary of this revision. (Show Details)Oct 22 2018, 5:41 PM
rnk added inline comments.Oct 22 2018, 6:48 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
2724 ↗(On Diff #170537)

It's not EBP for AArch64, surely. I would generalize the text to "FP" instead.

2731–2732 ↗(On Diff #170537)

Yeah, this was a great hack. :)

dmajor added a subscriber: dmajor.Oct 22 2018, 7:04 PM

I think you got the titles swapped on this and D53541; this one adds the aarch64_seh_recoverfp intrinsic while D53541 uses it to support __try/__except.

mgrang updated this revision to Diff 170698.Oct 23 2018, 10:50 AM
mgrang edited the summary of this revision. (Show Details)EditedOct 23 2018, 10:52 AM

I think you got the titles swapped on this and D53541; this one adds the aarch64_seh_recoverfp intrinsic while D53541 uses it to support __try/__except.

D53541 generates aarch64_seh_recoverfp while this patch lowers it. This patch also lowers the localescape/localrecover intrinsics. I have updated the title summary accordingly.

mgrang added inline comments.Oct 23 2018, 11:02 AM
lib/Target/AArch64/AArch64AsmPrinter.cpp
571 ↗(On Diff #170698)

I see that X86 generates LEA to calculate the address of the .L label. AArch64 does not have a similar instruction. I tried to use ADRP + ADD but that requires either a GlobalSymbol or an ExternalSymbol type. However, the .L label is of type MCSymbol.

So I generate a MOVK x1, #.Lfoo. The problem is that MOVK takes an unsigned immediate. The .L label here represents offset from FP. This works if the offset is positive. But for negative offsets we should generate a MOVN. However, I am not sure how to detect negative immediate without first generating a MOV.

Any suggestions?

lib/Target/AArch64/AArch64ISelLowering.cpp
2731–2732 ↗(On Diff #170537)

I am not really familiar with this piece of code. Could you please comment on whether this should work for AArch64?

lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
219 ↗(On Diff #170698)

As I commented above, we assume a non-negative immediate.

rnk added inline comments.Oct 23 2018, 3:51 PM
lib/Target/AArch64/AArch64AsmPrinter.cpp
571 ↗(On Diff #170698)

I would expect you would want a generic MOVMCSym instruction to be able to materialize any label, not just ones that are known during assembly. This might get selected if we needed to materialize, for example, the address of the LSDA, which is also a private label.

lib/Target/AArch64/AArch64ISelLowering.cpp
2729 ↗(On Diff #170698)

I think this is suppoosed to be ".seh_setframe offset"

2731–2732 ↗(On Diff #170537)

I think it should. Fundamentally, we are trying to transfer information about one function's frame layout into an independent filter function. The idea is that we can write assembly like this:

foo:
.Lfoo = 0x48
  ...

foo$filt:
  mov r0, .Lfoo

Hm, if possible, it might be nice to codegen this whole thing after SD ISel. I wonder if there is some way to pass it through as an intrinsic and lower it later. That will avoid the whole concern of having a generic MCSymbol materialization instruction selection pattern.

We need to have a test case that checks for access to local variables from the finally and except funclets. The parent's frame pointer is the second argument to both the filter functions and termination handlers, fro what I see. We need to make sure that locals variables in funclets are being accessed off x1, and not the stack pointer.

mgrang updated this revision to Diff 171996.Oct 31 2018, 12:51 PM
mgrang retitled this revision from [COFF, ARM64] Implement support for SEH extensions __try/__except to [COFF, ARM64] Implement support for SEH extensions __try/__except/__finally.
mgrang edited the summary of this revision. (Show Details)
mgrang set the repository for this revision to rL LLVM.

Will push more comprehensive tests for ARM64 SEH soon.

majnemer added inline comments.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6155 ↗(On Diff #171996)

Since you already did an isa, this can be cast<MCConstantExpr>(Expr).

dmajor added inline comments.Nov 8 2018, 7:57 AM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6155 ↗(On Diff #171996)

I tried applying this patch (plus D53541, D51524, and D54248) and building Firefox, and I got a crash on null Expr here.

Let me know if you're interested in the repro, or if it's too early for me to be playing with this code, in which case sorry for the noise.

mgrang added inline comments.Nov 8 2018, 11:05 AM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6155 ↗(On Diff #171996)

@dmajor Thanks for testing this out :) If you have a reduced test case to repro the failure that would be great. Although currently I am busy with an internal release I expect to get back to this in Dec.

mgrang added inline comments.Nov 8 2018, 12:14 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6155 ↗(On Diff #171996)

@dmajor Also what patches (if any) to the Firefox code base are needed to build Firefox for arm64-windows. Are those patches publicly available?

dmajor added inline comments.Nov 8 2018, 12:29 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6155 ↗(On Diff #171996)

I don't really recommend it right now, it's a bit of a hassle because you have to apply some hacks to the build system. We're working on making it easier (hopefully within a week or two). In the meantime I can try to get you a repro with creduce.

TomTan added inline comments.Nov 8 2018, 5:46 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6155 ↗(On Diff #171996)

@dmajor, I hit the same null reference issue after cherry-pick changes you listed. Seems FrameAllocSym is not a set to SymContentsVariable here which is expected to happen in AArch64RegisterInfo::eliminateFrameIndex.

dmajor added inline comments.Nov 9 2018, 7:08 PM
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6155 ↗(On Diff #171996)

Here's the repro of the crash from creduce:

class a {
  void b();
  void c();
};
void a::b() {
  int f;
  __try {
    c();
  } __except (f) {
  }
}
rnk added a comment.Nov 13 2018, 10:46 AM

Sorry, I didn't realize this was waiting on my review. It looks like this is mostly about localescape, so let's change the patch description and code to be about that.

include/llvm/CodeGen/MachineFunction.h
332 ↗(On Diff #171996)

IMO this should really live on the AArch64-specific function info, since it's not general to x64/x86 yet.

Also, this isn't SEH specific, this is set to true only by localescape. Maybe you should rename it? localescape was intended to be orthogonal to SEH, potentially usable for non-externally visible lambdas, although the codegen right now is particularly bad.

alex added a subscriber: alex.Dec 10 2018, 5:50 AM
mgrang updated this revision to Diff 177612.Dec 10 2018, 4:10 PM
mgrang edited the summary of this revision. (Show Details)
mgrang added a reviewer: ssijaric.

@dmajor My updated change fixes the crash in your testcase.

Now, I am trying to build the SEH tests released by MS: https://github.com/Microsoft/windows_seh_tests. I started with XCPT4 tests. test5 is currently failing which I am triaging now.

TomTan requested changes to this revision.Dec 10 2018, 4:28 PM

@dmajor My updated change fixes the crash in your testcase.

Now, I am trying to build the SEH tests released by MS: https://github.com/Microsoft/windows_seh_tests. I started with XCPT4 tests. test5 is currently failing which I am triaging now.

@mgrang, the SEH fix still causes ARM64 COFF file crash when linking with CFG enabled (/guard:cf). Seems functions are not marked as IMAGE_SYM_DTYPE_FUNCTION in ARM64 COFF which was done in both x86 and arm32 COFF (see below 2 links).

https://github.com/llvm-mirror/llvm/blob/2cc0a7da876c1d8c32775b0119e1e15aaa759b9e/lib/Target/X86/X86AsmPrinter.cpp#L67

https://github.com/llvm-mirror/llvm/blob/2cc0a7da876c1d8c32775b0119e1e15aaa759b9e/lib/Target/ARM/ARMAsmPrinter.cpp#L148

This revision now requires changes to proceed.Dec 10 2018, 4:28 PM

@dmajor My updated change fixes the crash in your testcase.

Now, I am trying to build the SEH tests released by MS: https://github.com/Microsoft/windows_seh_tests. I started with XCPT4 tests. test5 is currently failing which I am triaging now.

@mgrang, the SEH fix still causes ARM64 COFF file crash when linking with CFG enabled (/guard:cf). Seems functions are not marked as IMAGE_SYM_DTYPE_FUNCTION in ARM64 COFF which was done in both x86 and arm32 COFF (see below 2 links).

https://github.com/llvm-mirror/llvm/blob/2cc0a7da876c1d8c32775b0119e1e15aaa759b9e/lib/Target/X86/X86AsmPrinter.cpp#L67

https://github.com/llvm-mirror/llvm/blob/2cc0a7da876c1d8c32775b0119e1e15aaa759b9e/lib/Target/ARM/ARMAsmPrinter.cpp#L148

Thanks @TomTan. I have a fix here: https://reviews.llvm.org/D55535

rnk added inline comments.Dec 17 2018, 6:11 PM
CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6166 ↗(On Diff #177612)

It still seems true that this isn't about seh funclets, it's about localescape / localrecover, which is supposed to be orthogonal (for use in non-ABI visible lambdas), so I'd recommend renaming this boolean.

Target/AArch64/AArch64AsmPrinter.cpp
680 ↗(On Diff #177612)

I think this assertion will fail if you use localrecover before localescape. I would make a test case for that. Is there really not some pseudo instruction we can use so that the materialization can be relaxed by the assembler?

Otherwise, there's no point to using this MCSymbol hack. We might as well have a simpler side table if we retain the requirement that localescape be code generated first.

efriedma added inline comments.Dec 17 2018, 7:19 PM
Target/AArch64/AArch64AsmPrinter.cpp
680 ↗(On Diff #177612)

Is there really not some pseudo instruction we can use

The AArch64 assembler currently doesn't have any useful support for this sort of construct. Maybe it could be extended. (Is the value an arbitrary 64-bit integer, or is there some restriction on the possible values?)

rnk added inline comments.Dec 18 2018, 9:30 AM
Target/AArch64/AArch64AsmPrinter.cpp
680 ↗(On Diff #177612)

These values are offsets from the FP (or SP) to a stack object, so they are typically small. If there are implementation limits on the size of an aarch64 stack frame, we can definitely use that to guide isel here.

mgrang updated this revision to Diff 178775.Dec 18 2018, 1:17 PM
mgrang marked an inline comment as done.
mgrang edited the summary of this revision. (Show Details)
mgrang marked 3 inline comments as done.Dec 18 2018, 1:22 PM
mgrang added inline comments.
include/llvm/CodeGen/MachineFunction.h
332 ↗(On Diff #171996)

I tried to move this to AArch64MachineFunctionInfo. I set it in AArch64RegisterInfo when I encounter LOCAL_ESCAPE and need to access it in AArch64FrameLowering. But it seems setting it in AArch64RegisterInfo is too late. So for now I have kept it on the MF.

lib/Target/AArch64/AArch64AsmPrinter.cpp
697 ↗(On Diff #178775)

@rnk Yes, you are correct that the assert here will fail is the localrecover comes before localescape.
I have tried to model this similar to X86 but the problem is AArch64 cannot do an ADRP on an MCSymbol. So I wasn't sure how to materialize this in the assembler.
Could you give me some idea on how this can be done?

mgrang marked an inline comment as done.Dec 18 2018, 1:24 PM
rnk added inline comments.Dec 18 2018, 2:56 PM
include/llvm/CodeGen/MachineFunction.h
332 ↗(On Diff #171996)

I see, we could really only do that if localescape was lowered in target dependent code. I think this is fine the way it is now. It's clearer (to me, at least) after the rename.

lib/CodeGen/AsmPrinter/WinException.cpp
548–549 ↗(On Diff #178775)

This can just be isAArch64, we already assigned that to a member boolean.

lib/Target/AArch64/AArch64AsmPrinter.cpp
697 ↗(On Diff #178775)

I guess I've been assuming that you could assemble something like this in a .s file for aarch64, like you can for x86:

main:
  movl $.Lasdf, %eax
  ret
.Lasdf = 1234

I would assume that the aarch64 equivalent would be:

main:
  mov d0, #.Lasdf
  ret
.Lasdf = 1234

But that doesn't assemble. Does gnu as support that? I don't have an aarch64 binutils build. Maybe llvm-mc should allow mov/movk to accept a symbolic label operand.

I don't think we would want to use ADRP, IIRC that's for materializing some PC-relative address, but obviously this isn't that kind of symbol.

mgrang marked an inline comment as done.Dec 18 2018, 5:10 PM
mgrang added inline comments.
lib/Target/AArch64/AArch64AsmPrinter.cpp
697 ↗(On Diff #178775)

GNU does not seem to support it as well.

Error: operand 2 should be a SIMD vector element -- `mov d0,#.Lasdf'
mgrang updated this revision to Diff 178819.Dec 18 2018, 5:12 PM
efriedma added inline comments.Dec 18 2018, 5:36 PM
lib/Target/AArch64/AArch64AsmPrinter.cpp
697 ↗(On Diff #178775)

Does gnu as support that?

"mov" is pseudo-instruction, but it will expand to at most one instruction (a movz, a movn, or an orr); this is defined in the AArch64 architecture reference. There is no variable-length pseudo-instruction; if you need a larger value, you have to write out a movz/movk sequence explicitly.

I guess the stack size is currently limited to 256MB on Windows because of the way we emit the prologue on Windows (that's the limit for the alloc_l unwind opcode). We could maybe teach LLVM to use multiple alloc_l operations, but it's probably reasonable to assume the stack for a function is less than 4GB, so we could emit just two instructions (movz+movk).

For value less than 2^32, gas supports something like the following:

movz x0, #:abs_g1:.Lasdf
movk x0, #:abs_g0_nc:.Lasdf
.Lasdf = 0x12345678

If you try this with LLVM, you currently get an error "no resolvable MOVZ/MOVK fixups supported yet", but I think that's straightforward to fix. (LLVM supports the syntax in general, just not the case where the value is defined in the same object file.)

rnk added inline comments.Dec 21 2018, 4:13 PM
lib/Target/AArch64/AArch64AsmPrinter.cpp
697 ↗(On Diff #178775)

I saw your fixup patches, and this seems like the way to go. At least, this was the most principled way I could come up with to pass information about the stack frame layout of one machine function to another. I didn't want to use a side table and lock in the code generation order dependence.

mgrang updated this revision to Diff 180700.Tue, Jan 8, 11:10 AM

Addressed comments. The MCSymbol for the frame offset is now lowered in the assembler. D56037 needs to merge first.
I am now working on test cases. Will push them soon.

mgrang updated this revision to Diff 180705.Tue, Jan 8, 11:35 AM
efriedma added inline comments.Tue, Jan 8, 11:39 AM
lib/Target/AArch64/AArch64AsmPrinter.cpp
713 ↗(On Diff #180700)

I think this "createImm" is wrong; should be 16. (It probably doesn't show up in the printed assembly, but I think it affects the encoding.)

lib/Target/AArch64/AArch64RegisterInfo.cpp
271 ↗(On Diff #180700)

This can't be correct in general... in cases involving stack realignment, it's impossible to generate code without a base pointer.

mgrang updated this revision to Diff 180756.Tue, Jan 8, 3:46 PM
mgrang updated this revision to Diff 181193.Thu, Jan 10, 5:18 PM

Added test cases for localescape and try-finally. Will add try-except test case soon.

efriedma added inline comments.Fri, Jan 11, 4:51 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
2744 ↗(On Diff #181193)

Are you sure this does the right thing in general? The link between the choice of register here, and the register used for the getFrameIndexReference call in WinException::emitEHRegistrationOffsetLabel , seems fragile at best.

rnk accepted this revision.Mon, Jan 14, 1:35 PM

Thanks for pushing the MC work to support this forward!

@efriedma has one outstanding comment, but I don't see how to untangle the choice of frame register from AArch64FrameLowering::getIndexFrameReference, or the x86 version of the same, so I'm happy with this. Please wait for his response, though.

lib/Target/AArch64/AArch64ISelLowering.cpp
2744 ↗(On Diff #181193)

It is fragile, but it's what was done for x86. Unfortunately, it's actually really hard to know which physical register is going to be used to address locals during ISel, so I just did this brittle thing and moved on. See these comments from AArch64FrameLowering::hasFP for some of the phase ordering problems:

// With large callframes around we may need to use FP to access the scavenging
// emergency spillslot.
//
// Unfortunately some calls to hasFP() like machine verifier ->
// getReservedReg() -> hasFP in the middle of global isel are too early
// to know the max call frame size. Hopefully conservatively returning "true"
// in those cases is fine.
// DefaultSafeSPDisplacement is fine as we only emergency spill GP regs.
if (!MFI.isMaxCallFrameSizeComputed() ||
    MFI.getMaxCallFrameSize() > DefaultSafeSPDisplacement)
  return true;

I think we are mostly safe because when using EH funclets, an FP is always needed, so it's just a decision between "base" and "frame" pointer registers. That is currently entirely determined by the existence of user allocas with alignments above the standard ABI stack alignment.

However, you can imagine that some later pass might try to create more highly aligned stack objects, which would require a base pointer. For example, the register allocator might spill vector registers that require more stack alignment than the ABI provides. Today, for x86 at least, LLVM punts on this problem, and simply uses unaligned instructions for the spill and reload. I don't know how to get into the analogous situation for AArch64 or what LLVM would do, though.

smeenai added inline comments.
lib/Target/AArch64/AArch64ISelLowering.cpp
2744 ↗(On Diff #181193)
rnk added inline comments.Mon, Jan 14, 2:09 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
2744 ↗(On Diff #181193)

I had forgotten about that issue, but that's exactly the case I was describing. I proposed this solution there:

I think localaddress should really lower down to a LEA of a zero-sized stack object. It's really trying to say, give me the address of the local variable area. The localrecover label assignments would instead compute the difference between this particular zero-sized stack object and their stack object.

For AArch64, LEA would really be ADC, or whatever is generated to materialize the address of an alloca. Does something like that make sense?

It occurs to me that this localescape system is pretty flawed, since not all local variables are static allocas. This source crashes clang today:

struct Foo {
  int x, y, z;
};
int filter(int);
void may_throw();
void seh_byval(Foo o) {
  __try {
    may_throw();
  } __except(filter(o.x)) {
  }
}

Oops. =/

efriedma accepted this revision.Mon, Jan 14, 3:31 PM

If you're okay with trying to resolve any remaining bugs as a followup, that's fine with me; LGTM.

lib/Target/AArch64/AArch64ISelLowering.cpp
2744 ↗(On Diff #181193)

That is currently entirely determined by the existence of user allocas with alignments above the standard ABI stack alignment.

This is not true on AArch64. There's a second condition: whether the size of the local frame is greater than 256 bytes. This condition isn't strictly required, but avoids materializing offsets in some situations. (AArch64 addressing modes support very limited immediate negative offsets.)

That doesn't really have any impact on its own, but AArch64FrameLowering::getFrameIndexReference actually chooses between FP-relative and BP-relative addressing based on the offsets, in general. So we probably need to do something here, to force the frame index to be resolved the same way as localaddress.

I don't think we can hit an issue similar to bug 37169 on AArch64; the natural stack alignment is 16, and there currently isn't anything that requires higher alignment.

It occurs to me that this localescape system is pretty flawed, since not all local variables are static allocas.

Yes, we need special handling for byval/inalloca arguments at the IR level; in general, it's impossible to compute an offset to an argument relative to the base pointer.

mgrang updated this revision to Diff 181942.Tue, Jan 15, 5:20 PM
mgrang edited the summary of this revision. (Show Details)

Added a default lowering for llvm.eh.recoverfp intrinsic.

mgrang updated this revision to Diff 182110.Wed, Jan 16, 11:36 AM

@rnk @efriedma Thanks a lot for all your valuable review comments. I'll commit this now :)

This revision was not accepted when it landed; it landed in state Needs Review.Wed, Jan 16, 11:57 AM
This revision was automatically updated to reflect the committed changes.