This patch enables LLVM to emit Win64-native SEH unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment.
Because SEH is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer.
Details
- Reviewers
nrieck rnk asl chapuni - Commits
- rG1db5995d1419: Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore…
rG4a01230db47e: Generate native unwind info on Win64
rL211691: Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore…
rL211399: Generate native unwind info on Win64
Diff Detail
- Repository
- rL LLVM
Event Timeline
Excellent. This replaces http://reviews.llvm.org/D3418 and http://reviews.llvm.org/D3419 is still needed, right?
Whoops, looks like I got .cfi_offset and .cfi_rel_offset mixed up. The original offsets in bigstructret2.ll test were actually correct.
Also, rebased on top of master.
include/llvm/CodeGen/MachineFrameInfo.h | ||
---|---|---|
486 ↗ | (On Diff #10279) | Let's try to avoid contiguous boolean parameters, especially when some have default arguments. They make it hard to do refactorings like removing the 'Immutable' parameter here. Instead, CreateFixedSpillSlot(Size, SPOffset) sounds like the right interface to me. |
lib/MC/MCAsmStreamer.cpp | ||
1138 ↗ | (On Diff #10279) | This should be a separate revert. The existing test case is incorrect, it was doing '.seh_savereg %rsi, 16' and disassembling that with %rbp. |
lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp | ||
148 ↗ | (On Diff #10279) | I think this means we will start attempting to emit cleanups for x86_64-pc-windows-msvc, which we probably aren't ready for. Can we hold off on this change, or are you reasonably sure it's safe? |
lib/Target/X86/X86FrameLowering.cpp | ||
402–410 ↗ | (On Diff #10279) | Are these comments accurate when we have an FP *and* the stack needs realignment? It seems like we should move the win64-specific XMM CSR saves to happen before the stack realignment. This should be safe since win64 also gives us 16-byte stack alignment. |
738–742 ↗ | (On Diff #10279) | The comment below about "Don't care about the rest of the stack allocation" should get merged in here. I spent a while trying to understand why this scary comment is OK, and the comment below explains it. |
744 ↗ | (On Diff #10279) | We have range based for loops now, so this can be: for (const CalleeSavedInfo &Info : CSI) { ... } |
773 ↗ | (On Diff #10279) | Format. You can drop the braces. |
910 ↗ | (On Diff #10279) | Seems like a superfluous formatting change |
1092 ↗ | (On Diff #10279) | :( Is TargetFrameLowering::getCalleeSavedSpillSlots() not powerful enough to express the offsets you need? If it isn't, I think the right way forward is to add a more flexible interface to TFL, like TFL::appendCalleeSavedSpillSlot(std::vector<CSI> &CSIs), where the default implementation forwards on to check 'TFL->getCalleeSavedSpillSlots(NumFixedSpillSlots)' so existing targets don't require changes. |
lib/Target/X86/X86ISelLowering.cpp | ||
600–603 ↗ | (On Diff #10279) | Can we flip this to the positive sense? What targets actually need it? win32 only? |
include/llvm/CodeGen/MachineFrameInfo.h | ||
---|---|---|
486 ↗ | (On Diff #10279) | Will do. |
lib/MC/MCAsmStreamer.cpp | ||
1138 ↗ | (On Diff #10279) | I have it as a separate commit in git, it just got squashed when posting diff to Phabricator. Do you want me to actually submit it as a separate review? |
lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp | ||
148 ↗ | (On Diff #10279) | Not sure I understand... Can you please elaborate? |
lib/Target/X86/X86FrameLowering.cpp | ||
402–410 ↗ | (On Diff #10279) | Stack realignment happens before stack pointer decrement, so XMM slots would be be in "invalid" zone at that point. We'd have to swap re-alignment and SP decrement in order to do what you suggest. I did not want to mess with existing frame setup order too much... |
738–742 ↗ | (On Diff #10279) | okay |
910 ↗ | (On Diff #10279) | okay, will undo |
1092 ↗ | (On Diff #10279) | The problem with getCalleeSavedSpillSlots() is that it would reserve spill slots regardless of whether the register actually needs to be spilled, which would increase stack usage unnecessarily. I was thinking of using hasReservedSpillSlot() for this purpose, because, in principle, it could create spill slots on-the-fly, but then we'd be relying on the order of register enumeration. Also, it'd need to store state somewhere, and the code overall would be more convoluted. How about changing getCalleeSavedSpillSlots() signature to take non-const vector<CalleeSavedInfo>? If that's too hacky, I'd propose to create TargetFrameLowering::assignSpillSlots(vector<CSI>& ) callback, that would allow to override PEI::calculateCalleeSavedRegisters() in a target-specific manner. |
lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
1138 ↗ | (On Diff #10279) | No, I can commit it. |
lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp | ||
148 ↗ | (On Diff #10279) | Ignore this comment. I worry this will break 'clang-cl -m64' because we emit 'invoke' instructions in the frontend to run C++ destructors. Clang is currently relying on the fact that LLVM drops the landing pads on the floor and generates regular calls. Clang shouldn't be relying on that, though. |
lib/Target/X86/X86FrameLowering.cpp | ||
402–410 ↗ | (On Diff #10279) | OK, that makes sense. We may end up spilling xmm registers into the stack realignment gap, but that should be OK. Can you add a comment something like: ; It's OK if the stack was realigned and xmm registers are spilled into the realignment gap. |
779 ↗ | (On Diff #10279) | This can also be range-based. |
1092 ↗ | (On Diff #10279) |
This is basically what I was trying to suggest. The default implementation should do what calculateCalleeSavedRegisters() does now, so you don't have to change every target. |
lib/MC/MCObjectFileInfo.cpp | ||
---|---|---|
635 ↗ | (On Diff #10279) | AFAIK, the COFF backend only supports Windows ATM. Why not make this an assert? assert(T.isOSWindows() && "Windows is the only supported COFF target"); if (T.getArch() != Triple::x86_64) { LSDASection = ... } |
lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp | ||
168 ↗ | (On Diff #10279) | Convert to an assert and simplify? |
I went and tried this out but ran into an assertion. This is odd because at that stage I wasn't actually using COFF, but rather ELF in memory for MCJIT. Maybe there's a check missing whether we are using COFF in addition to Win64? See below:
Program: C:\mingw-builds\msys64\home\kfischer\julia\usr\bin\julia.exe File: C:/mingw-builds/msys64/home/kfischer/julia/deps/llvm-svn/lib/CodeGen/MachineInstr.cpp, Line 674 Expression: (isImpReg || Op.isRegMask() || MCID->isVariadic() || OpNo < MCID->getNumOperands() || isMetaDataOp) && "Trying to add an operand to a machine instr that is already done!"
Did you do a clean build? I ran into something similar after adding SEH instructions, and the solution was to delete all .o and .inc files under Target\X86. Apparently make dependencies are not quite reliable.
I did not. Doing that now. Will report back. The make dependencies should really be figured out though.
Ok, now I'm getting "LLVM ERROR: No open Win64 EH frame function!" (again with ELF in the first stage).
Have you run the example at http://llvm.org/bugs/show_bug.cgi?id=19905 through llvm with this change in?
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1131 ↗ | (On Diff #10279) | One nice effect of having a "TargetFrameLowering::assignSpillSlots" is that you won't have to create frame indices only to delete them i.e. you'd have CSR's spilled to fixed slots to begin with. |
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
600–603 ↗ | (On Diff #10279) | This is to make sure that landing pads do get emitted for win64. Honestly, I have no idea what the inverse of this set looks like. |
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1084 ↗ | (On Diff #10416) | Do you want me to run clang-format on all files I've changed? |
You can run clang-format from the VC IDE on selected (marked) text only.However, it may be a good idea to reformat all the file if it does not conform to clang format.
Reid, what do you think?
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1084 ↗ | (On Diff #10416) | No, typically it's used interactively only on lines that are being changed. See git-clang-format. Anyway, I can do that if this is done. |
I tried to commit today, but there were conflicts while rebasing. I attempted to resolve them, but I still had this test failure after that:
test/CodeGen/X86/x86-64-static-relo-movl.ll
Can you look into this? The test uses a x86_64-pc-win32-macho triple, which makes us hit "LLVM ERROR: No open Win64 EH frame function!". We probably shouldn't emit win64 style unwind info when generating MachO object files.
Is 'x86_64-pc-win32-macho' a valid target? Windows 64 ABI in OSX object file seems like a weird combination.
Not necessarily by itself, but it's not uncommon to have macho or elf in memory for MCJIT (as I mentioned above). Therefore, every effort should be made to make sure it doesn't break.
Looks like something wrong with .pdata segment. It contains Function start address == Function end address and not always sorted by start address.
Yes, of course. I trying to compile main.ll and get main.obj. llvm toolchain is builded from " git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211430 "
Hmm, something must be wrong at the MC layer. When I assemble the output of llc with gnu assembler (after replacing register numbers with names in seh directives), it seems to produce a correct .obj.
@chapuni, Which tests broke? Is there a PR?
I saw that MultiJitTest.JitPool fails, but it was failing before my change too, so I ignored it.
I met entire JITTests.exe and lli.exe (and clang-interpreter) got broken, not only MultiJitTest.JitPool.
Command 0 Stderr: pseudo instructions should be removed before code emission UNREACHABLE executed at C:\bb-win7\msbuild-llvmclang-x64-msc17-DA\llvm-project\llvm\lib\Target\X86\X86CodeEmitter.cpp:1106!
FYI,
mingw x64 Release+Asserts http://bb.pgr.jp/builders/ninja-clang-x64-mingw64-RA/builds/3572
msc17 x64 Debug+Asserts http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc17-DA/builds/731
I could help investigating if you couldn't reproduce failures.
I was wrong with example above. Looks like the only problem is sorting and sometimes overlapping ranges of Function start & end. For example, in cases when exist two or more .text sections there is no way at all to sort function starts (in one .pdata section)
I think, solution is to create one .pdata (and maybe .xdata) for each exception info as MS VC compiler do
My small patch (can be applied with git or by hand) with workaround some of "invalid .pdata contributions" issues. It creates new pdata and xdata section for every function
Where is the patch? I don't see it in my inbox. That sounds like the
correct solution: the individual .pdata sections should be comdat
associative with the individual .text sections.
Patch was attached to phabricator (
http://reviews.llvm.org/file/data/h36ma4imbeunwhbu56fp/PHID-FILE-2w25ko3anof4isbyyzka/pdata_patch.patch
). Also I attach it to this email.
the individual .pdata sections should be comdat associative with the individual .text sections.
Maybe yes or maybe not. Having only one element in .pdata totally eliminate
sort routine.
Anyway this is a temporary solution. I hope someone of llvm maintainers
reimplement this patch as it should
2014-07-04 9:05 GMT+06:00 Reid Kleckner <rnk@google.com>:
Where is the patch? I don't see it in my inbox. That sounds like the
correct solution: the individual .pdata sections should be comdat
associative with the individual .text sections.