Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
910 ↗ | (On Diff #163457) | default should be the first case. |
lib/Target/AArch64/AArch64InstrInfo.cpp | ||
1606 ↗ | (On Diff #163457) | Can you move "MBB.getParent()-" to the next line so that it's more readable? |
test/CodeGen/AArch64/wineh-funclets.ll | ||
1 ↗ | (On Diff #163457) | Check 80 char limits here. |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
910 ↗ | (On Diff #163457) | Will change it in the updated patch. |
lib/Target/AArch64/AArch64InstrInfo.cpp | ||
1606 ↗ | (On Diff #163457) | Will do in the updated patch. |
test/CodeGen/AArch64/wineh-funclets.ll | ||
1 ↗ | (On Diff #163457) | Is the 80 char limit on the RUN line a rule? If so, many test cases don't follow this. A single RUN is not split into multiple lines from what I've seen. |
test/CodeGen/AArch64/wineh-funclets.ll | ||
---|---|---|
1 ↗ | (On Diff #163457) | < many test cases don't follow this. Write your code to fit within 80 columns of text. This helps those of us who like to print out code and look at your code in an xterm without resizing it. Plus, horizontal scrolling is no fun :) |
Neat. :)
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
645 ↗ | (On Diff #163556) | "containing" |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
1638 ↗ | (On Diff #163556) | UnwidHelp -> UnwindHelp? |
1639–1641 ↗ | (On Diff #163556) | Should this be moved after the terminator check? |
1675–1679 ↗ | (On Diff #163556) | This is used to populate dispFrame in _s_HandlerType IIRC. IIUC, you will have issues with certain catch handlers without this properly populated. |
I don't really know how exception handling on windows works or what a funclet is. So here's just a couple style comments and questions
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
909 ↗ | (On Diff #163556) | Can be const MachineInstr& |
934–935 ↗ | (On Diff #163556) | Out of interest: How is getWinEHFuncletFrameSize() different from MachineFrameIndo::getStackSize()? |
1182 ↗ | (On Diff #163556) | Add a comment explaining why you need a frame pointer for funclet? |
1648 ↗ | (On Diff #163556) | In case the previous loop ends with MBBI == MBB.end() all the following tests will be undefined behavior. |
1652 ↗ | (On Diff #163556) | My understanding is that all of this stuff is just compiler generated extra instructions? In that case I recommend using an empty debug location instead (I know other places of the code get this wrong too...) |
1662 ↗ | (On Diff #163556) | Use doxygen /// comments. |
1668–1672 ↗ | (On Diff #163556) | doesn't this change the behavior for non-funclets too? If yes this needs to go into a separate patch and needs some comments why. |
lib/Target/AArch64/AArch64MCInstLower.cpp | ||
256–268 ↗ | (On Diff #163556) | Why is this in MCInstLower and not AsmPrinter.cpp? Maybe you can even specify it in tablegen (see for example ARMs LDMIA_RET and similar) |
test/CodeGen/AArch64/wineh-funclets.ll | ||
60 ↗ | (On Diff #163556) | unused basic block? |
76 ↗ | (On Diff #163556) | You should be able to drop most of these attributes. |
79–83 ↗ | (On Diff #163556) | I'm pretty sure you can drop these lines from the test |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
1675–1679 ↗ | (On Diff #163556) | Ah, it could be that dispFrame is not needed for ARM64. In that case, WinException.cpp should not call getWinEHParentFrameOffset and the ParentFrameOffset should not be emitted as it will add padding between the HandlerType elements. |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
1675–1679 ↗ | (On Diff #163556) | Hmmm... So after a bunch of poking around with ARM64 CL.exe, I can confirm that they are allocating the dispFrame field and always fill it with zero! |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
645 ↗ | (On Diff #163556) | Will fix in the updated test case. |
909 ↗ | (On Diff #163556) | Will change in the updated patch. |
934–935 ↗ | (On Diff #163556) | The stack size (getStackSize()) includes all the locals in its size calculation. We don't include these locals when computing the stack size of a funclet, as they are allocated in the parent's stack frame and accessed via the frame pointer from the funclet. Otherwise, we would be double allocating stack space in the funclet that we don't need. We only save the callee saved registers in the funclet, which are really the callee saved registers of the parent function, including the funclet. Perhaps, only the non-volatile registers that the funclet modifies should be saved/restored in the funclet prologue/epilogue, but we don't keep track of these separately. I will document this. |
1182 ↗ | (On Diff #163556) | Will do in the updated patch. |
1638 ↗ | (On Diff #163556) | Will fix in the updated patch. |
1639–1641 ↗ | (On Diff #163556) | Yes, it's definitely a better place for it. |
1652 ↗ | (On Diff #163556) | Will fix in the updated patch. |
1662 ↗ | (On Diff #163556) | Will fix in the updated patch. |
1675–1679 ↗ | (On Diff #163556) | Thanks for double checking!! Will leave it as is. |
lib/Target/AArch64/AArch64MCInstLower.cpp | ||
256–268 ↗ | (On Diff #163556) | X86 did the expansion in X86MCInstLower.cpp. I would keep this way in the current patch, and look into tablegen later. I'll add a comment. |
test/CodeGen/AArch64/wineh-funclets.ll | ||
60 ↗ | (On Diff #163556) | The block is generated by clang, and SImplifyCFG removes it. I'll come up with a better test case. |
76 ↗ | (On Diff #163556) | Will update the test case. |
79–83 ↗ | (On Diff #163556) | Will update the test case. |
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
1668–1672 ↗ | (On Diff #163556) | This is used to calculate displacement of objects from the stack pointer, such as the displacement of the unwind object when filling up the exception handling table on Windows. This is the only place where it is used for AArch64. All the frame indices belong to the parent function (e.g. never a funclet). I put it in this patch, because the unwind object above is created only if the function has funclets. It may be hard to write a test case that checks the .s file, as it will crash with an assert in the WinException sanity check (since the displacement will be that from the frame pointer). I would like to keep in this patch, if at all possible. If there is a really strong objection, I can move it to a separate patch. |
lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
1278 ↗ | (On Diff #172089) | Move this into the assert; it will cause the release only build to fail. |
Remove an unused variable. Fix a nasty bug that Eli just pointed out, where I didn't insert a PrologEnd in funclets in this patch. Previously, I relied on an RAII object to this for me when we would go out of scope. I removed that logic, but forgot to insert a prolog end marker for the funclet case.
With this patch, the following should now pass:
#include <stdio.h> int main() { int B[100]; int x = 4; B[2] = 2; B[4] = 0; try { throw B[2]; } catch (int i) { printf("Caught %d\n", i); B[4] = B[i] + i; try { throw 'c'; } catch (char c) { printf("Caught char %c\n", c); } } return x + B[4]; }
with the correct print outs and a return value of 8.
I'm not really an expert on funclets, but the code looks fine to me.
The test coverage here is a little thin... in particular, there isn't any coverage of the emitted unwind opcodes. I'll amend the testcase to check the llvm-readobj output. (Sanjin isn't available for the next few weeks, so I'm going to handle any further revisions to the patch.)
Reid, do you have an opinion on what test coverage is necessary here?
It's probably important to handle each and every kind of prologue. I don't know if there are any special corners cases for arm64 prologues, but for x64, this ended up being:
- normal prologue
- normal with -fno-omit-frame-pointer
- stack realignment caused by a highly aligned alloca
- normal with dynamic alloca or VLA
- dynamic alloca with highly aligned static alloca
SEH and C++ EH have major differences, so test both. Alternatively, you can focus on one in one patch and defer work on the other.
SEH __except blocks in particular had to do some extra work to restore SP and register state when recovering from an exception, especially when the frame has dynamic SP updates.
I still need to go back and review, but hopefully that helps.
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
829–831 ↗ | (On Diff #172305) | We had bugs here with XMM registers, which I don't think we fixed: https://llvm.org/pr32507 How does the AArch64 prologue/epilogue save and restore registers? If you load/store to a fixed frame index stack object, it's not going to work well. If it does something analogous to x86 "PUSH" instructions, it'll work fine. |
This patch forces all functions that contain funclets to emit a frame pointer. (The existing testcase covers that.)
I'll do some testing with realignment and dynamic allocas. I don't think AArch64 prologue lowering has any other special cases.
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
829–831 ↗ | (On Diff #172305) | Spills of callee-saved registers should always use sp-relative loads/stores, as far as I know. |
Oh, I guess there is one special case for the prologue I didn't mention: varargs functions. I think the patch handles that correctly already, but I'll add a test.
For both dynamic allocas and stack realignment, the initial implementation of unwinding missed a significant rule: "r29 must point to the bottom of stack". That's probably straightforward to resolve, but I don't really want to do it in this patch (and I'm not sure what the practical effect is). Beyond that, there's probably more handling necessary to correctly handle a dynamic alloca inside a funclet (since we need to reset the frame pointer, but we still need access to the old frame pointer). And I don't really want to block this patch on solving that; it's obscure enough that it won't affect most code.
I have a couple small fixes for stack realignment, a fix for a MachineVerifier failure, and a testcase for realignment; I'll upload a new version.
The following testcase is also broken, for reasons I don't understand. I don't think it's a problem with any of the code in this patch, though; the throw out of the "catch" block somehow picks the wrong catch block, but this patch doesn't touch that code.
void a() { try { try { throw 1; } catch (...) { throw 1; } } catch (...) {}}
Add a bit more test coverage, to cover cases with and without base pointers. Tweak base pointer emission so we emit and use a base pointer where appropriate (including some cases where it isn't strictly necessary, but allows more efficient addressing modes).
I think this covers all the interesting cases for C++ exceptions. (There will be a separate followup patch for SEH exceptions.)
It works for x64 and x86, so I guess it's arm-specific. The state table emission code is portable, so maybe the bug is in the arm ip2state mapping logic?
It works for x64 and x86, so I guess it's arm-specific. The state table emission code is portable, so maybe the bug is in the arm ip2state mapping logic?
Looked a bit more; it looks like the problem is actually that the "throw" is the last instruction in the function, so unwinding gets confused. There's some code in X86TargetMachine::X86TargetMachine which turns on TrapUnreachable, which avoids the problem. It looks like MSVC does something similar (it inserts a nop instead of a trap, but same idea). I'll post a patch separately.
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
829–831 ↗ | (On Diff #172305) | They should always be saved off the funclet's SP pointer. This is what cl does, if I remember correctly. |
Hi Eli,
Yin and I looked at the rule that says fp must be at the bottom if alloca is present. We tried a few (simple) test cases to see if this would be a problem. We didn't encounter any issues with our very limited testing, so decided not to implement it at this time. From my recollection, the fp (or the fp/lr pair) has to be separated from the code that saves the callee-saved registers, and handled on its own so that it ends up pointing to the bottom.
Thank you for taking over this!!! I may have some time time between flights to look at this a bit more, depending on the internet connection.
I am afraid I don't know much about exceptions on Windows in general. A few comments below.
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
1497 ↗ | (On Diff #172668) | Small typo: in the in -> in the |
2059 ↗ | (On Diff #172668) | Why is this needed? |
2114 ↗ | (On Diff #172668) | typo paren't -> parent's |
lib/Target/AArch64/AArch64RegisterInfo.cpp | ||
260 ↗ | (On Diff #172668) | Isn't this already caught by the next nested ifs? |
test/CodeGen/AArch64/wineh-try-catch-realign.ll | ||
10 ↗ | (On Diff #172668) | Is there a chance that we can test the PEI/FrameLowering part with more specific MIR tests? Especially the checks for what the base register should be. The fact that we don't serialize AArch64FunctionInfo may make this impossible for now. |
We came up with a different solution for this called SEH_Epilogue. It basically inserts a nop if the last real instruction was a call. It's slightly complicated because it looks backwards across empty basic blocks and things. You can see it here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Target/X86/X86MCInstLower.cpp#L1800
I've been meaning to see if we can remove the code that enables TrapUnreachable on Windows.
lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
2059 ↗ | (On Diff #172668) | It's not, as far as I can tell; I'll get rid of it. |
lib/Target/AArch64/AArch64RegisterInfo.cpp | ||
260 ↗ | (On Diff #172668) | Yes, I refactored it and didn't realize it. Will fix. |
test/CodeGen/AArch64/wineh-try-catch-realign.ll | ||
10 ↗ | (On Diff #172668) | Yes, I could write an MIR test using -run-pass=prologepilog, but I don't see the point. The CHECK lines would be essentially identical, just checking for MIR opcodes instead of assembly instructions. The MIR doesn't directly record whether we have a base register, or what it is; it's only used as part of frame lowering. |
Reid, are you okay with merging this? I'd prefer to get this merged sooner rather than later. (If you don't have to review carefully right now, I can address any minor issues post-commit.)
lgtm
Sorry, I lost track of this.
lib/Target/AArch64/AArch64InstrInfo.cpp | ||
---|---|---|
1681 ↗ | (On Diff #173041) | Yep, makes sense, put the address of the target MBB in X0. Do you have any assurance that X0 isn't used by normal epilogue code? At least on x86, sometimes RAX gets used for large SP updates. |
test/CodeGen/AArch64/wineh-try-catch.ll | ||
13–15 ↗ | (On Diff #173041) | Tests with comments about the correctness of numbers that we filecheck for below are always worrying, but I have no better suggestions. :) |
lib/Target/AArch64/AArch64InstrInfo.cpp | ||
---|---|---|
1681 ↗ | (On Diff #173041) | The current AArch64 emitFrameOffset never does that. And even if it did in some cases, a funclet's stack can't be more than a few hundred bytes anyway. |