This is an archive of the discontinued LLVM Phabricator instance.

[OpenMPIRBuilder] introduce createStaticWorkshareLoop
ClosedPublic

Authored by ftynse on Dec 2 2020, 3:59 AM.

Details

Summary

Introduce a function that creates a statically-scheduled workshare loop
out of a canonical loop created earlier by the OpenMPIRBuilder. This
basically amounts to injecting runtime calls to the preheader and the
after block and updating the trip count. Static scheduling kind is
currently hardcoded and needs to be extracted from the runtime library
into common TableGen definitions.

Diff Detail

Event Timeline

ftynse created this revision.Dec 2 2020, 3:59 AM
ftynse requested review of this revision.Dec 2 2020, 3:59 AM
SouraVX added a project: Restricted Project.
SouraVX added subscribers: AMDChirag, SouraVX.

Only Nits from me. @Meinersbur wdyt?

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
270

The preheader (only) solution is for static schedules but that is OK for now.

667

nice!

751

@Meinersbur: Eventually we might want to keep pointers identifying these different parts as we will add instructions to the blocks and therefore might break these invariants. Not necessary now though.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1006

Nit: Ty or IVType

1019

I'm not a fan of LoopInfo as a name, CLI could be our "go to" acronym?

1089

Add a TODO above this block stating this should eventually move to the CLI or a "CLIUpdate" interface.

Meinersbur added inline comments.Dec 2 2020, 11:05 AM
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
270

The method;s name is "createStaticWorkshareLoop"

281

Return the modified CanonicalLoopInfo instead. The IP can be determined using getAfterIP().

Mostly for consistency with additional createXYZLoop methods, which create a new CanonicalLoopInfo instead of modifying an existing one.

746

Maybe make this member private (to the OpenMPIRBuilder implementation)? Other canonical loops such as tiling or OpenMPIRBuilder itself assuming a specific trip count during finialize() might depend on a specific trip count.

There's nothing like this now nor do I plan to add such things, but until we have a use case why front-end should directly modify the trip-cout, I'd hide this from the API.

750

add a call to assertOK() here?

751

Uses of it within the CanonicalLoop still needs to be identified, for changes like this. This makes explicitly storing it redundant. Same would also apply to most BasicBlocks which have a fixed position relative to the header block, so we could traverse the CFG to find them instead of storing them.

I don't have a good reason for either direction, I suggest to do what's easier for now.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1019

+1

LoopInfo clashes with llvm::LoopInfo and clang::LoopInfo types. The latter already makes it impossible to include some LLVM headers into CGLoopInfo.cpp

1092–1093

Could you emit the fini call into the getExit block instead? getAfterIP is meant for user code that executes, well, after the runtime boilerplate has finished.

1095

Call LoopInfo->assertOK() when the changes are done?

Thanks @ftynse for this patch. This is great and will speed us up by a month.

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
282

No change required but just asking a question for my understanding,
Can createWorkshareLoop call createCanonicalLoop internally?

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1094

A barrier insertion with a TODO to look at nowait attribute?

Meinersbur added inline comments.Dec 2 2020, 11:09 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1088

Did you consider using replaceAllUsesWith? setTripCount should already have removed the instances ub cond and latch.

Meinersbur added inline comments.Dec 2 2020, 11:12 AM
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
282

In the patch I am working on, I extracted most of createCanonicalLoop into a private createCanonicalLoopSkeleton to be used by other createXYZLoop methods (in my case: tiling).

However, this patch modifies an existing CanonicalLoopInfo, instead of creating a new one.

ftynse updated this revision to Diff 309487.Dec 4 2020, 2:03 AM
ftynse marked 18 inline comments as done.

Address reviews

ftynse added inline comments.Dec 4 2020, 2:11 AM
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
270

I only implemented the static schedule for now. My idea was to have createStatic, createDynamic, etc and eventually add a createWorkshareLoop that additionally takes a schedule kind flag and dispatches to the right function.

282

If I understood @Meinersbur's comment on another patch correctly, the overall design is to create canonical loops and transform them later (e.g., tile or workshare) with dedicated functions. createStaticWorkshareLoop follows this idea and therefore does _not_ create the loop, only the worksharing constructs.

746

Moved to implementation.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1088

This replaces the uses of the induction variable (phi of the header block) with induction-variable + lower-bound-from-runtime-call, setTripCount updated the trip count (another argument in icmp in the condition block).

Switched to replaceAllUsesWithIf though.

1094

Added a flag to the function. Let's have the caller decide if they need a barrier or not. For example, clang doesn't insert a barrier after the last loop in the "parallel" segment even if the loop doesn't have "nowait".

@ftynse would you mind having a look at D87247 ?

LGTM. Please wait for approval from @Meinersbur.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1106–1108

Nit: There is a create barrier function in the OpenMPIRBuilder.
CreateBarrier(Builder.saveIP(), llvm::omp::OMPD_barrier);

This revision is now accepted and ready to land.Dec 6 2020, 10:12 AM

Thanks for the patch. Just a couple of nits

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1006

I strongly suspect that @jdoerfert was referring to argument Type* type

1032

Nit: To keep a consistent behavior throughout the builder, please change to:

if (!updateToLocation(Loc))
    return Loc.IP;
1047–1049

NIT: please name all alloca's above - easier to track in IR.

ftynse updated this revision to Diff 309845.Dec 7 2020, 2:26 AM
ftynse marked an inline comment as done.

Address more reviews.

ftynse marked an inline comment as done.Dec 7 2020, 2:28 AM
ftynse added inline comments.
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1032

It's already inconsistent, createCanonicalLoop doesn't follow this pattern, and we cannot return Loc.IP here because the return type is CanonicalLoopInfo *. Returning nullptr instead.

1106–1108

Yes, but it would emit another call to get_thread_num and OMPD_barrier could misled somebody to think the emitted code comes from a barrier directive, which it is not.

jdoerfert added inline comments.Dec 7 2020, 8:06 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1106–1108

but it would emit another call to get_thread_num

That is not a problem we address here. They are deduplicated later.

OMPD_barrier could misled somebody to think the emitted code comes from a barrier directive, which it is not.

The OpenMP runtime is integrated with existing tooling. Barriers "know" what caused them and that is available to profilers, debuggers, etc. We should not break with this tooling capability. The worksharing type for barriers is available via:
createBarrier(Loc, OMPD_for, /* ForceSimpleCall */ false, /* CheckCancelFlag */ false)
The last argument is false only for now. The cancel flag capabilities will be needed later as well, at least from the OpenMP frontends.

ftynse updated this revision to Diff 309938.Dec 7 2020, 9:24 AM
ftynse marked an inline comment as done.

Address even more review

This revision was landed with ongoing or failed builds.Dec 7 2020, 1:32 PM
This revision was automatically updated to reflect the committed changes.