Page MenuHomePhabricator

[OPENMMP] Resolve lost LoopTripCnt for subsequent loops in same thread.
ClosedPublic

Authored by ronlieb on Jul 16 2019, 10:24 AM.

Details

Summary

Remove loopTripCnt from threaded device stack after consuming it.
Added a libomptarget DP message to aid in future debugging and to
validate the added testcase, which only runs in Debug build.

%ld => %lu
removed extraneous braces.

Diff Detail

Repository
rL LLVM

Event Timeline

ronlieb created this revision.Jul 16 2019, 10:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 16 2019, 10:24 AM
ABataev added inline comments.Jul 16 2019, 10:38 AM
libomptarget/src/omptarget.cpp
738 ↗(On Diff #210130)

Use %lu instead, it is an unsigned type.

libomptarget/test/offloading/looptripcnt.c
38 ↗(On Diff #210130)

I think, instead of the debug messages better to check the results of the printfs.

Can we validate the result differently, without debug messages?

libomptarget/test/offloading/looptripcnt.c
21 ↗(On Diff #210130)

I think the standard doesn't expect a structured block, but a for loop directly.

ronlieb marked an inline comment as done.Jul 16 2019, 10:49 AM
ronlieb added inline comments.
libomptarget/test/offloading/looptripcnt.c
38 ↗(On Diff #210130)

I initially thought about that, but I am worried that depending on what architecture you run on it might produce different numbers of teams, if the thread limit was a lot smaller than on a gpu for example.

but what I am really trying to test is that we found the loopTripCount to pass into the runtime.

if you strongly prefer that I still do so, I will be happy to change the test. I would of course still keep the DP message for diagnostic purposes down the road.

ABataev added inline comments.Jul 16 2019, 11:01 AM
libomptarget/test/offloading/looptripcnt.c
38 ↗(On Diff #210130)

It is always better to check the real results rather than debug messages. If you could write a test, which is able to provide a stable check, then do this.

ronlieb marked 2 inline comments as done.Jul 16 2019, 12:13 PM
ronlieb added inline comments.
libomptarget/test/offloading/looptripcnt.c
21 ↗(On Diff #210130)

will fix shortly.

38 ↗(On Diff #210130)

ok, so I spent some time trying to do this. The issue I am encountering is
that the behavior that needs to be verified is that we properly find the loop trip count in the function 'target' within omptarget.cpp. This is then passed along to function __tgt_rtl_run_target_team_region in plugins.cpp which "may" choose to modify the blocksPerGrid or not. So depending on your target, you may see different numbers of teams active if you query omp_get_num_teams(). The observed behavior on different hosts is tenuous at best in determining if loop_trip_count was properly found. The point of this test is to make sure we dont loose the loop_trip_count. In this situation i think it best to catch the problem in Debug build/test where we can explicitly observe the DP message.

ABataev added inline comments.Jul 16 2019, 12:22 PM
libomptarget/test/offloading/looptripcnt.c
38 ↗(On Diff #210130)

Then better to check for the debug message, I think. Other opinions?

ronlieb updated this revision to Diff 210147.Jul 16 2019, 12:27 PM
ronlieb edited the summary of this revision. (Show Details)

@grokos any comments here please?
hoping to get this patch in before the 9.0 branch.

grokos accepted this revision.Jul 17 2019, 6:52 AM

@grokos any comments here please?
hoping to get this patch in before the 9.0 branch.

The patch should definitely be committed as it fixes a potential problem with multiple threads. Regarding the test, I'm a bit torn. I don't like the idea of relying on debug messages either, but you are right that the plugin may change the number of teams so we cannot predict consistently what omp_get_num_teams() will return on each architecture. I will accept the patch as-is so that the bugfix in merged into the 9.0 release but we will need to revise the test in the future.

This revision is now accepted and ready to land.Jul 17 2019, 6:52 AM

Thanks folks, publishing now

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 17 2019, 10:09 AM