I've fixed up https://reviews.llvm.org/D9654. It's a follow-up to https://reviews.llvm.org/D34386. Please let me know if you have any further suggestions. Thanks!
Details
Diff Detail
Event Timeline
llvm-commits should be added as a subscriber when the review is created to mirror the review on the mailing list
I'll repost some of my comments from D34386. I think it would be good to avoid unrelated whitespace changes to reduce the noise in the diff. Also, could you add more context to the diff? That makes reviewing easier :)
I think you should also include a test for your change. You could probably extend test/CodeGen/X86/stack-probe-size.ll, by adding an additional RUN line with a triple where stack probes aren't generated at the moment and add probe-stack attributes to some functions.
lib/CodeGen/MachineFunction.cpp | ||
---|---|---|
216–224 ↗ | (On Diff #103236) | I think this should be renamed or change its definition to reflect its name, we will probe the stack on x86 Windows regardless of the whether or not this attribute was supplied. Perhaps hasProbeStackAttr? |
lib/Target/X86/X86FrameLowering.cpp | ||
800–801 | Could this be a StringRef? | |
1023–1024 | You could make a function which gets the stack probe symbol name, then UseStackProbe could be defined in terms of that name being non-empty. | |
1243–1246 | Test case for this? | |
1251–1252 | Saving RBX seems new. I'd have a comment describing both of these saves. | |
1259 | Magic number should have a comment or stuck in a const variable with a nice name. Even better would be both. |
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
800–801 | I originally wanted to do that, but the docs say that StringRefs don't have null termination, which the addExternalSymbol API needs. If you'd prefer to work around that in another way, let me know. |
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
800–801 | Hmm. Isn't there a liveness problem then? Symbol.c_str() is only valid until this functoin returns. addExternalSymbol doesn't make a copy of its input unless I am mistaken. |
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
800–801 | If I am correct, I think you will want to use createExternalSymbolName. |
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1259 | Oh, I guess this is the page size. I don't see why user-requested stack probes should need this to be supplied explicitly; it just bloats the size of every function prolog for something that is going to be constant on the architecture. Will remove. |
@pcwalton No, both the spill and the "magic" number are necessary. See https://reviews.llvm.org/D9858. ebx is the size of the frame, and 0x1000 is the probe interval to be performed (in other words, page size).
@pcwalton But I agree that it's better to change both sides of the __probestack ABI to eliminate the argument and the spill (on x86_64), of course.
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
510 | Why is this by ref? | |
801 | I think it is more natural to see: StringRef Symbol = STI.getTargetLowering()->getStackProbeSymbolName(MF); | |
1243–1245 | Am I correct in thinking that we will save and restore R11 when we previously would not in the large code model for Win64? If so, was that a bug in the implementation of the large code model for Win64? | |
lib/Target/X86/X86ISelLowering.cpp | ||
36342 | const StringRef looks weird, StringRefs are immutable. | |
36344–36347 | I'd kill the braces here. | |
36355–36357 | Ditto. | |
lib/Target/X86/X86ISelLowering.h | ||
1059 | Ditto regarding const StringRef. |
That patch never landed, and the current approach, with customizable symbol names, is to move the responsibility out of compiler-rt. So I would argue that https://reviews.llvm.org/D9858 is irrelevant.
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1243–1245 | I'm not familiar enough with the Windows 64-bit ABI to be able to answer that question. I would assume that it would have to have been a bug in the previous implementation, as I don't see any other way to make the call to __chkstk work… |
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1253 | Hmm, I believe a testcase for this situation hasn't been added. |
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1253 | Well, I'm not sure how R11 can actually be live-in in the function prolog right now. As I understand things, the reason why EAX is spilled if necessary is to handle the edge case on 32-bit x86 where inreg is specified for a function parameter, which causes EAX to be live-in at the very start of the function. But there's no corresponding inreg attribute for x86-64 that causes R11 to become live-in that early. Checking for that case even for R11 seemed prudent to me though, because (a) the corresponding situation can happen for EAX and it seemed cleaner to use the same logic for both registers instead of duplicating some of it; (b) in the future, if somehow R11 can become live-in at function start, this logic will be needed. Am I interpreting the situation correctly, and if so, what would you prefer I do regarding testing? |
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1253 | I am quite certain that R11 should *not* be spilled this way. LLVM uses it as a scratch register in quite a few places. E.g. the comment above: // For the large code model, we have to call through a register. Use R11, // as it is scratch in all supported calling conventions. the GetScratchRegister function in X86FrameLowering.cpp, the X86TargetLowering::getScratchRegisters, and the X86TargetLowering::LowerINIT_TRAMPOLINE function, which needs it for essentially the same job. You could perhaps replace the call with one of the getScratchRegister functions but it seems fine to hardcode it while not spilling the way it is, too. |
Addressed review comments.
I removed the R11 spill code as requested. I left in the refactoring that moves the existing EAX spill and reload code into separate functions, as X86FrameLowering::emitPrologue() is rather large and that seemed like a worthwhile refactoring.
I am not entirely sure that the refactored code is identical to what was left behind. Moreover, the refactoring is now incidental with respect to the bulk of the change. I'd request that if you were to pursue it, to please factor it out into another review.
Looking pretty good, I think we are almost there :)
lib/Target/X86/X86FrameLowering.cpp | ||
---|---|---|
1053–1055 | We can use the redzone and stack probes at the same time? This surprises me. Also, there is no test for this. |
It would be nice to make the stackprobesize a proper TLI attribute, so that OSes can decide how much guarding they are willing to guarantee. I think I would also like to have an option to inline the probing without the complexity of the function call. But both can be done as a separate step.
Why is this by ref?