Page MenuHomePhabricator

[PowerPC] Implement probing for dynamic stack allocation
ClosedPublic

Authored by lkail on Jun 7 2020, 8:57 PM.

Details

Summary

This patch is part of supporting -fstack-clash-protection. Mainly do such things compared to existing lowerDynamicAlloc

  • Added a new pseudo instruction PPC::PREPARE_PROBED_ALLOC to get actual frame pointer and final stack pointer.
  • Synthesize a loop to probe by blocks.
  • Use DYNAREAOFFSET to get MaxCallFrameSize which is calculated in prologepilog.

Diff Detail

Unit TestsFailed

TimeTest
70 mswindows > Clang.CodeGen::Unknown Unit Message ("")
Script: -- : 'RUN: at line 1'; c:\ws\buildkite-windows-2\llvm-project\premerge-checks\build\bin\clang.exe -cc1 -internal-isystem c:\ws\buildkite-windows-2\llvm-project\premerge-checks\build\lib\clang\11.0.0\include -nostdsysteminc -emit-llvm C:\ws\buildkite-windows-2\llvm-project\premerge-checks\clang\test\CodeGen\vector-logic-not.cpp -o - | c:\ws\buildkite-windows-2\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\buildkite-windows-2\llvm-project\premerge-checks\clang\test\CodeGen\vector-logic-not.cpp

Event Timeline

lkail created this revision.Jun 7 2020, 8:57 PM
lkail updated this revision to Diff 269102.Jun 7 2020, 9:02 PM
lkail edited the summary of this revision. (Show Details)Jun 7 2020, 9:05 PM

Some minor comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11613

Can we make it as unsigned integer ?

11642

Adding some comments to indicate what the CFG looks like will make the code more clear.

11696

Can we use the CTR loop ?

llvm/test/CodeGen/PowerPC/stack-clash-dynamic-alloca.ll
3

Check the run for pwr9

lkail updated this revision to Diff 272328.Jun 21 2020, 8:52 PM
lkail marked an inline comment as done.

Rebased and address steven's comments.

lkail added inline comments.Jun 21 2020, 8:53 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11613

Looks getAsInteger already does it.

template <typename T>
std::enable_if_t<!std::numeric_limits<T>::is_signed, bool>
getAsInteger(unsigned Radix, T &Result) const;
lkail marked an inline comment as done.Jun 23 2020, 7:00 PM
lkail added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11696

Yes, we can, but it looks not worth the effort, since the trip count can be zero, we have to add additional code to check this case.

lkail marked an inline comment as done.Jun 23 2020, 7:01 PM
jsji added inline comments.Jun 25 2020, 3:25 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7838

Meaningless comment.

7840

Meaningless comment.

11608

Can we set the size to sys::Process::getPageSizeEstimate()? Typical systems are all giving me 65536, are we probing too much if we set it to 4096 here?

11616

What happen with "stack-probe-size"="65536"?

The imm is known value, if we can NOT load an imm > 32768 with one instruction, we should be able to load it in two instructions. Why we have to assert here?

11617

What about "stack-probe-size"="0"? When user would like to opt-out probe for some specific function? See https://reviews.llvm.org/rL225360`. We will now generate code divided by 0...

Also what about "stack-probe-size"="1"? I think we should also round the size to stack alignment similar to Z.

11621

I don't quite understand this. Why we can NOT do this similar to x86 or SystemZ? using PHI nodes.

11637

LLVM_BB ? Why LLVM? Can we use more meaningful name?

11655

calculate?

lkail updated this revision to Diff 273990.Jun 28 2020, 8:12 PM

Address @jsji 's comments. Add test case for large probe size.

lkail marked 4 inline comments as done.Jun 28 2020, 8:13 PM
lkail marked 2 inline comments as done.Jun 28 2020, 8:20 PM
lkail added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11608

Yes, sysconf gives 65536 on my rhel POWER machine and 4096 on my rhel X86 machine. I'm quite concerned if users do cross compilation if the host has large page size than the target.

11621

The comment to compare with PPCRegisterInfo::lowerDynamicAlloc might be a little bit misleading. My point is PPCTargetLowering::emitProbedAlloca is unable to expaned in prologepilog like PPCRegisterInfo::lowerDynamicAlloc. Should I delete this comparison?

Looks we don't need phi nodes in the loop, only r1, which is a physical register, is updated.

jsji accepted this revision as: jsji.Jun 30 2020, 1:39 PM

LGTM.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
11608

How about we introduce an option for user to switch the default value here, eg: default is still the smallest, 4096, but we provide another choice to use system default (65536 most of the time) .

11621

yeah, maybe remove the comments comparing to lowerDynamicAlloc.

llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
611

Meaningless comments.

This revision is now accepted and ready to land.Jun 30 2020, 1:39 PM
This revision was automatically updated to reflect the committed changes.