This is an archive of the discontinued LLVM Phabricator instance.

[PATCH 2/2] Implement "probe-stack" on x86
ClosedPublic

Authored by pcwalton on Jun 19 2017, 11:23 PM.

Details

Summary

I've fixed up https://reviews.llvm.org/D9654. It's a follow-up to https://reviews.llvm.org/D34386. Please let me know if you have any further suggestions. Thanks!

Diff Detail

Event Timeline

pcwalton created this revision.Jun 19 2017, 11:23 PM
majnemer edited edge metadata.Jun 20 2017, 9:05 AM

How does this review differ from D34386?

llvm-commits should be added as a subscriber when the review is created to mirror the review on the mailing list

I'll repost some of my comments from D34386. I think it would be good to avoid unrelated whitespace changes to reduce the noise in the diff. Also, could you add more context to the diff? That makes reviewing easier :)

I think you should also include a test for your change. You could probably extend test/CodeGen/X86/stack-probe-size.ll, by adding an additional RUN line with a triple where stack probes aren't generated at the moment and add probe-stack attributes to some functions.

pcwalton updated this revision to Diff 103236.Jun 20 2017, 10:56 AM

Added more context, added missing tests, removed whitespace-only changes.

majnemer added inline comments.Jun 20 2017, 11:26 AM
lib/CodeGen/MachineFunction.cpp
216–224 ↗(On Diff #103236)

I think this should be renamed or change its definition to reflect its name, we will probe the stack on x86 Windows regardless of the whether or not this attribute was supplied.

Perhaps hasProbeStackAttr?

lib/Target/X86/X86FrameLowering.cpp
800–801

Could this be a StringRef?

1023–1024

You could make a function which gets the stack probe symbol name, then UseStackProbe could be defined in terms of that name being non-empty.

1243–1246

Test case for this?

1251–1252

Saving RBX seems new. I'd have a comment describing both of these saves.

1259

Magic number should have a comment or stuck in a const variable with a nice name. Even better would be both.

pcwalton added inline comments.Jun 20 2017, 12:10 PM
lib/Target/X86/X86FrameLowering.cpp
800–801

I originally wanted to do that, but the docs say that StringRefs don't have null termination, which the addExternalSymbol API needs. If you'd prefer to work around that in another way, let me know.

majnemer added inline comments.Jun 20 2017, 12:15 PM
lib/Target/X86/X86FrameLowering.cpp
800–801

Hmm. Isn't there a liveness problem then? Symbol.c_str() is only valid until this functoin returns. addExternalSymbol doesn't make a copy of its input unless I am mistaken.

majnemer added inline comments.Jun 20 2017, 12:21 PM
lib/Target/X86/X86FrameLowering.cpp
800–801

If I am correct, I think you will want to use createExternalSymbolName.

pcwalton added inline comments.Jun 20 2017, 12:24 PM
lib/Target/X86/X86FrameLowering.cpp
1251–1252

I don't know why the original patch author chose to spill RBX. Seems not needed. I'll remove it.

1259

As before, I don't know what this magic number is for. Seems unneeded. I'll remove it.

pcwalton added inline comments.Jun 20 2017, 12:27 PM
lib/Target/X86/X86FrameLowering.cpp
1259

Oh, I guess this is the page size. I don't see why user-requested stack probes should need this to be supplied explicitly; it just bloats the size of every function prolog for something that is going to be constant on the architecture. Will remove.

@pcwalton No, both the spill and the "magic" number are necessary. See https://reviews.llvm.org/D9858. ebx is the size of the frame, and 0x1000 is the probe interval to be performed (in other words, page size).

@pcwalton But I agree that it's better to change both sides of the __probestack ABI to eliminate the argument and the spill (on x86_64), of course.

pcwalton updated this revision to Diff 103293.Jun 20 2017, 4:45 PM

Addressed review comments.

majnemer added inline comments.Jun 20 2017, 5:20 PM
lib/Target/X86/X86FrameLowering.cpp
510

Why is this by ref?

801

I think it is more natural to see: StringRef Symbol = STI.getTargetLowering()->getStackProbeSymbolName(MF);

1243–1245

Am I correct in thinking that we will save and restore R11 when we previously would not in the large code model for Win64? If so, was that a bug in the implementation of the large code model for Win64?

lib/Target/X86/X86ISelLowering.cpp
36342

const StringRef looks weird, StringRefs are immutable.

36344–36347

I'd kill the braces here.

36355–36357

Ditto.

lib/Target/X86/X86ISelLowering.h
1059

Ditto regarding const StringRef.

@pcwalton No, both the spill and the "magic" number are necessary. See https://reviews.llvm.org/D9858. ebx is the size of the frame, and 0x1000 is the probe interval to be performed (in other words, page size).

That patch never landed, and the current approach, with customizable symbol names, is to move the responsibility out of compiler-rt. So I would argue that https://reviews.llvm.org/D9858 is irrelevant.

lib/Target/X86/X86FrameLowering.cpp
1243–1245

I'm not familiar enough with the Windows 64-bit ABI to be able to answer that question. I would assume that it would have to have been a bug in the previous implementation, as I don't see any other way to make the call to __chkstk work…

pcwalton updated this revision to Diff 103308.Jun 20 2017, 6:05 PM

Addressed comments.

tamird added a subscriber: tamird.Jun 20 2017, 7:47 PM
emaste added a subscriber: emaste.Jun 21 2017, 5:52 AM
majnemer added inline comments.Jun 21 2017, 7:04 AM
lib/Target/X86/X86FrameLowering.cpp
1253

Hmm, I believe a testcase for this situation hasn't been added.

pcwalton added inline comments.Jun 21 2017, 10:48 AM
lib/Target/X86/X86FrameLowering.cpp
1253

Well, I'm not sure how R11 can actually be live-in in the function prolog right now. As I understand things, the reason why EAX is spilled if necessary is to handle the edge case on 32-bit x86 where inreg is specified for a function parameter, which causes EAX to be live-in at the very start of the function. But there's no corresponding inreg attribute for x86-64 that causes R11 to become live-in that early. Checking for that case even for R11 seemed prudent to me though, because (a) the corresponding situation can happen for EAX and it seemed cleaner to use the same logic for both registers instead of duplicating some of it; (b) in the future, if somehow R11 can become live-in at function start, this logic will be needed.

Am I interpreting the situation correctly, and if so, what would you prefer I do regarding testing?

whitequark added inline comments.Jun 21 2017, 12:12 PM
lib/Target/X86/X86FrameLowering.cpp
1253

I am quite certain that R11 should *not* be spilled this way. LLVM uses it as a scratch register in quite a few places. E.g. the comment above:

// For the large code model, we have to call through a register. Use R11,
// as it is scratch in all supported calling conventions.

the GetScratchRegister function in X86FrameLowering.cpp, the X86TargetLowering::getScratchRegisters, and the X86TargetLowering::LowerINIT_TRAMPOLINE function, which needs it for essentially the same job.

You could perhaps replace the call with one of the getScratchRegister functions but it seems fine to hardcode it while not spilling the way it is, too.

pcwalton updated this revision to Diff 103462.Jun 21 2017, 2:13 PM

Addressed review comments.

I removed the R11 spill code as requested. I left in the refactoring that moves the existing EAX spill and reload code into separate functions, as X86FrameLowering::emitPrologue() is rather large and that seemed like a worthwhile refactoring.

LGTM, but I'd still like to hear from @majnemer

Addressed review comments.

I removed the R11 spill code as requested. I left in the refactoring that moves the existing EAX spill and reload code into separate functions, as X86FrameLowering::emitPrologue() is rather large and that seemed like a worthwhile refactoring.

I am not entirely sure that the refactored code is identical to what was left behind. Moreover, the refactoring is now incidental with respect to the bulk of the change. I'd request that if you were to pursue it, to please factor it out into another review.

pcwalton updated this revision to Diff 103477.Jun 21 2017, 3:14 PM

Addressed comments. Reverted the EAX save refactoring.

Looking pretty good, I think we are almost there :)

lib/Target/X86/X86FrameLowering.cpp
1053–1055

We can use the redzone and stack probes at the same time? This surprises me. Also, there is no test for this.

pcwalton updated this revision to Diff 103515.Jun 21 2017, 8:00 PM

Addressed review comments.

This revision is now accepted and ready to land.Jun 21 2017, 11:14 PM
joerg added a subscriber: joerg.Jun 22 2017, 7:05 AM

It would be nice to make the stackprobesize a proper TLI attribute, so that OSes can decide how much guarding they are willing to guarantee. I think I would also like to have an option to inline the probing without the complexity of the function call. But both can be done as a separate step.

tamird removed a subscriber: tamird.Jun 22 2017, 7:05 AM
This revision was automatically updated to reflect the committed changes.