This is an archive of the discontinued LLVM Phabricator instance.

Added propagation of not big initial stack size of master thread to workers.
ClosedPublic

Authored by AndreyChurbanov on May 27 2019, 9:37 AM.

Details

Summary

Fix for https://bugs.llvm.org/show_bug.cgi?id=26540.

Currently implemented only for non-Windows 64-bit platforms (not requested on Windows so far;
32-bit systems have small memory, so it is tricky to find reasonable threshold to limit propagation).

For 64-bit the limit of 256 MB chosen (almost arbitrary, can be debatable...).

Diff Detail

Repository
rL LLVM

Event Timeline

Moved declaration of rlim under #if to avoid "unused variable" warning.

Generally, I think this is the right thing to do, minor questions though.

Could you change the commit subject to describe the change, the bug reference should be in the commit message.

I'm not familiar with the test system here but is it checking the output of the test looking for "passed" or how does it determine if the test succeeded?

runtime/src/z_Linux_util.cpp
1844 ↗(On Diff #202424)

I would have assumed we have a limit and we propagate min(MASTER_STACK_SIZE, LIMIT_STACK_SIZE) so if you are a bit over the "limit" you don't fall back to the potentially way lower default.

runtime/test/misc_bugs/stack-propagate.c
62 ↗(On Diff #202424)

Can you return 1 on failure?

AndreyChurbanov retitled this revision from Fix for https://bugs.llvm.org/show_bug.cgi?id=26540. to Added propagation of not big initial stack size of master thread to workers..May 31 2019, 7:42 AM
AndreyChurbanov edited the summary of this revision. (Show Details)
AndreyChurbanov marked 2 inline comments as done.May 31 2019, 8:07 AM
AndreyChurbanov added inline comments.
runtime/src/z_Linux_util.cpp
1844 ↗(On Diff #202424)

Not sure I got the question here.

Couple more details anyway. We've got request to propagate 8MB, because it was default limit on user's system and we had 4MB internal default in the library, that caused crash. Then we can as well propagate 16MB, 20MB, etc. But we cannot propagate 3GB because it slows down a lot of programs. So we have to stop propagation at some stack size limit. I haven't investigated what is the perfect value for upper bound we want to propagate. Would guess it can depend on a lot of factors... So just (arbitrary) chosen 256MB (haven't seen slowdowns with this value), and "a bit over" means "do not propagate" for me. Anyway we have standard external control, so if people need particular stack size for worker threads they can explicitly express that, same as set any stack size for master thread using system means and compiler/linker.

runtime/test/misc_bugs/stack-propagate.c
62 ↗(On Diff #202424)

Sure, I will fix the test.

Test fixed: added "return 1" on possible failure of the test.

AndreyChurbanov marked an inline comment as not done.May 31 2019, 8:14 AM

Regarding test system,

how does it determine if the test succeeded?

I am pretty sure it checks return code, 0 - success.

I'm not familiar with the test system here but is it checking the output of the test looking for "passed" or how does it determine if the test succeeded?

For most of the OpenMP runtime tests, the LIT test system will check the return value of the test. If the // RUN: line (at the top) of a test has FileCheck in it then it is checking both the return values and the output for certain streams of tokens, this can be seen in the OMPT tests. Technically you can pipe the output into any tool but LLVM uses FileCheck for this usually.

jdoerfert added inline comments.May 31 2019, 8:50 AM
runtime/src/z_Linux_util.cpp
1844 ↗(On Diff #202424)

My point is the odd behavior shown in the table below, assuming KMP_DEFAULT_STKSIZE * 64 = 256MB.

master stack sizethread stack size
255MB255MB
256MB256MB
257MB4MB

If you change:

__kmp_stksize = rlim.rlim_cur;
// if system stack size is too big then don't propagate it to workers
if (__kmp_stksize > KMP_DEFAULT_STKSIZE * 64) // just a heuristics...
  __kmp_stksize = KMP_DEFAULT_STKSIZE;

to

// Limit the propagated stack size at a empirically choosen value.
__kmp_stksize = min(rlim.rlim_cur, KMP_DEFAULT_STKSIZE * 64);

you will get:

master stack sizethread stack size
255MB255MB
256MB256MB
257MB256MB
runtime/test/misc_bugs/stack-propagate.c
62 ↗(On Diff #202424)

If the return values is checked then this change should be sufficient.

AndreyChurbanov marked an inline comment as done.May 31 2019, 9:02 AM
AndreyChurbanov added inline comments.
runtime/src/z_Linux_util.cpp
1844 ↗(On Diff #202424)

Thanks for explanation, got your point now.
I'll think a bit of this change.

Addressed Johannes' comment. At the same time I've reduced the upper bound of the size to be propagated to 64MB (from initial 256MB), because with this change more applications will be affected (those require big stack for master and small stack for workers).

Thanks for the update. I inlined two more comments.

runtime/test/misc_bugs/stack-propagate.c
42 ↗(On Diff #202919)

Do we need to print passed here?

jdoerfert added inline comments.Jun 4 2019, 9:30 PM
runtime/src/kmp_settings.cpp
294 ↗(On Diff #202919)

[My initial comment got somehow lost:( ]

I have the same problem with this logic as I had before. I would like us to saturate at the edges and not jump to an arbitrary default value suddenly.

Assuming I got the min/max right, I would like:

*val = min(KMP_MAX_STKSIZE, max(KMP_MIN_STKSIZE, *val));

If there is a problem with that, please describe it, preferably also in the code above.

Review comments addressed:

  • removed printf("passed") when the test is skipped;
  • changed default size to lower/upper limit for consistency during size adjustment.
jdoerfert accepted this revision.Jun 5 2019, 8:19 AM

LGTM, thanks.

This revision is now accepted and ready to land.Jun 5 2019, 8:19 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2019, 9:12 AM