This is an archive of the discontinued LLVM Phabricator instance.

Excluded untied tasks from task stealing constraint
ClosedPublic

Authored by jlpeyton on Jun 9 2016, 1:09 PM.

Details

Summary

If either current_task or new_task is untied then skip task scheduling constraint checks, because untied tasks are not affected by the task scheduling constraints.

Diff Detail

Repository
rL LLVM

Event Timeline

jlpeyton updated this revision to Diff 60228.Jun 9 2016, 1:09 PM
jlpeyton retitled this revision from to Excluded untied tasks from task stealing constraint.
jlpeyton updated this object.
jlpeyton added a reviewer: AndreyChurbanov.
jlpeyton set the repository for this revision to rL LLVM.
jlpeyton added a subscriber: openmp-commits.
AndreyChurbanov accepted this revision.Jun 11 2016, 2:36 AM
AndreyChurbanov edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jun 11 2016, 2:36 AM
This revision was automatically updated to reflect the committed changes.
pawosm01 reopened this revision.Oct 22 2016, 4:19 AM
pawosm01 added a subscriber: pawosm01.

Regression found. See comment in line 1754.

openmp/trunk/runtime/src/kmp_tasking.c
1754 ↗(On Diff #60562)

This line causes race condition which leads to occasional livelock in strassen_taskdep benchmark from Kastors suite (a benchmark that heavily uses task dependencies). Could be replicated on AArch64 and x86_64, typically working on odd number of cores (5, 7, etc.).

This revision is now accepted and ready to land.Oct 22 2016, 4:19 AM
pawosm01 requested changes to this revision.Oct 22 2016, 4:19 AM
pawosm01 added a reviewer: pawosm01.
This revision now requires changes to proceed.Oct 22 2016, 4:19 AM
openmp/trunk/runtime/src/kmp_tasking.c
1754 ↗(On Diff #60562)

I agree the change in this line is a bug. But once it has beed committed, could you please advise how to better fix the issue?

Is it OK to create new review request that reverts the change in this line and close current one?

Or should I revert both changes and re-submit the diff with only the first change that looks OK?

Or can I just revert the change in this line and close the review without changing the diff?

Thanks, Andrey

I don't really know what to do with this, I guess the original author knows best what's next...

jlpeyton added inline comments.Oct 26 2016, 11:36 AM
openmp/trunk/runtime/src/kmp_tasking.c
1754 ↗(On Diff #60562)

Can you give the exact libomp revision you're using, how you are calling strassen_taskdep, operating system, compiler?

pawosm01 added a comment.EditedOct 26 2016, 11:50 AM

Runtime:

I'm using libomp from latest git master built using clang and gcc (tried both builds with the same result)

Benchmark:

strassen_taskdep from kastors-1.1 built using clang from git master (it never catched livelock when built with gcc and having libomp as a GOMP drop-in replacement!)

Machine:

processor : 31
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
stepping : 7
microcode : 0x710
cpu MHz : 1200.000
cache size : 20480 KB
physical id : 1
siblings : 16
core id : 7
cpu cores : 8
apicid : 47
initial apicid : 47
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 5787.09
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

System:

3.13.0-86-generic #131-Ubuntu SMP Thu May 12 23:33:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Starting:

$ KMP_TOPOLOGY_METHOD=cpuinfo OMP_PROC_BIND=close OMP_DISPLAY_ENV=true OMP_STACKSIZE=64M OMP_NUM_THREADS=7 bin/strassen_taskdep -i 5 -n 4096 -d 5 -s 64

Successful run should print:

OPENMP DISPLAY ENVIRONMENT BEGIN

 _OPENMP='201511'
[host] OMP_CANCELLATION='FALSE'
[host] OMP_DEFAULT_DEVICE='0'
[host] OMP_DISPLAY_ENV='TRUE'
[host] OMP_DYNAMIC='FALSE'
[host] OMP_MAX_ACTIVE_LEVELS='2147483647'
[host] OMP_MAX_TASK_PRIORITY='0'
[host] OMP_NESTED='FALSE'
[host] OMP_NUM_THREADS='7'
[host] OMP_PLACES='cores'
[host] OMP_PROC_BIND='close'
[host] OMP_SCHEDULE='static'
[host] OMP_STACKSIZE='64M'
[host] OMP_THREAD_LIMIT='2147483647'
[host] OMP_WAIT_POLICY='PASSIVE'

OPENMP DISPLAY ENVIRONMENT END

Program : bin/strassen_taskdep
Size : 4096
Iterations : 5
Cutoff Size : 64
Cutoff depth : 5
Threads : 7
Time(sec):: avg : 5.616665 :: std : 0.431984 :: min : 5.050717 :: max : 6.352281 :: median : 5.648555

AndreyChurbanov accepted this revision.Nov 1 2016, 5:10 AM

I've submitted https://reviews.llvm.org/D26182 that fixes the problem pointed to by Paul.

Paul, can you accept the changes in here so we can close this revision?

jlpeyton accepted this revision.Nov 1 2016, 9:45 AM
jlpeyton added a reviewer: jlpeyton.
pawosm01 accepted this revision.Nov 1 2016, 10:07 AM
pawosm01 edited edge metadata.
This revision is now accepted and ready to land.Nov 1 2016, 10:07 AM