This is an archive of the discontinued LLVM Phabricator instance.

[OPENMP] Codegen for 'if' clause
ClosedPublic

Authored by ABataev on Jul 29 2014, 10:29 PM.

Details

Summary

Adds codegen for 'if' clause. Currently only for 'if' clause used with the 'parallel' directive.
If condition evaluates to true, the code executes parallel version of the code by calling __kmpc_fork_call(loc, 1, microtask, captured_struct/*context*/), where loc - debug location, 1 - number of additional parameters after "microtask" argument, microtask - is outlined finction for the code associated with the 'parallel' directive, captured_struct - list of variables captured in this outlined function.
If condition evaluates to false, the code executes serial version of the code by executing the following code:

global_thread_id.addr = alloca i32
store i32 global_thread_id, global_thread_id.addr
zero.addr = alloca i32
store i32 0, zero.addr
kmpc_serialized_parallel(loc, global_thread_id);
microtask(global_thread_id.addr, zero.addr, captured_struct/*context*/);
kmpc_end_serialized_parallel(loc, global_thread_id);

Where loc - debug location, global_thread_id - global thread id, returned by __kmpc_global_thread_num() call or passed as a first parameter in microtask() call, global_thread_id.addr - address of the variable, where stored global_thread_id value, zero.addr - implicit bound thread id (should be set to 0 for serial call), microtask() and captured_struct are the same as in parallel call.

Also this patch checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).

Diff Detail

Repository
rL LLVM

Event Timeline

ABataev updated this revision to Diff 12009.Jul 29 2014, 10:29 PM
ABataev retitled this revision from to [OPENMP] Codegen for 'if' clause.
ABataev updated this object.
ABataev edited the test plan for this revision. (Show Details)
ABataev added a subscriber: Unknown Object (MLST).

Hello John, Thanks you very much for the review! Here are my answers.

I apologize for the outrageous delays; we’ve been a little busy.

Oh, don't worry. I was just afraid that you're completely done with the
clang because of swift. :-)

The preconditions on this method aren’t clear. Does it *assume* that there’s
only one clause or does it *check*? Right now, it seems to be assuming that
it doesn’t have more than one, but checking that it doesn’t have zero, which
is weird.

The precondition is quite simple: there must be exactly 0 or 1 clause of
the specified kind. This method returns the single clause if it exists.
Otherwise, if there is no such clause, it just returns nullptr.

Elsewhere in the code, you’re conditionally checking whether the
result is null. Is there a language rule that you can only have one if clause?

This method is required for clauses llike "if", "num_threads" etc. These
clauses may be specified only once per directive, but also they can be
skipped, if they are not required. So, directives may have 0 or 1
instance of such clauses.

Just dereference the iterator here. Your assert can then increment the
original iterator, and you don’t have to worry about copying the iterator in
non-asserts builds.

Done, thanks.

This subclass should go in a separate, OpenMP-specific header.

Ok, moved to CGOpenMPRuntime.h

You seem to be inconsistent about the capitalization here, and on reflection,
honestly, I think this abbreviation is completely illegible. And I’m not sure
that calling it the “global thread ID” is actually meaningful when it’s clearly
not global in any real sense.

Let’s take everywhere you’ve got “Gtid” or “GTid”, including in the code
already committed, and rename this to “ThreadID”. So this would be:

const VarDecl *getThreadIDVariable() const { return ThreadIDVar; }

Ok, done.

Also, this would be a great place to put a comment explaining what this
variable actually is. For example, that it’s a variable of type ident_t*.

I'll modify the comment. The type of this variable is kmp_int32, i.e.
just int32.

If this is an invariant of the CGOpenMPRegionInfo, shouldn’t it be asserted
on in that structure, maybe in its constructor?

Moved to constructor.

You have this logic repeated several times. It would be reasonable
to have a

LValue getThreadIDVariableLValue(CodeGenFunction &CGF);

helper on the region info.

Also, you seem to vary between using the type of the variable and assuming
that ident_t == int32_t. It’s probably better to trust the type of the variable
consistently.

Created this function, replaced the code by function call.
The type of ThreadID is int32_t, not ident_t. ident_t is a structure
that has info about debug location and some internal flags, ThreadID is
just a 32-bit integer. The variable is declared only in outlined
parallel region.
If the construct is not in parallel region (for example, there is a
standalone '#omp atomic' or other non-parallel directive), this variable
does not exist and we have to create a temp with type int32_t for this var.

Please add a comment describing the intended behavior of this function and
how its result will be used. The rule seems to be “if we’re inside a captured
statement, use the region info’s thread-ID variable and ignore the thread ID
we just computed; otherwise, stash that thread ID in a temporary and return
the address of that”. That is a very strange rule. Describing *why* it does
that would be invaluable to people trying to maintain this in the future.

Ok, I'll add the description.

I’m not sure what rule you’re following for what goes in CGOpenMP vs.
CGOpenMPRuntime. All of the code you’re emitting here is very
runtime-specific. I think you could very easily abstract out a virtual
interface for doing runtime-specific logic, GCXXABI-style. e.g. “emit
a function for this captured statement”, “run this captured statement
serially”, “run this captured statement in parallel”, etc. All of that could
be in CGOpenMPRuntime.cpp, and you could keep the general logic for
interpreting down statements in CGOpenMP.cpp.

I'll look at GCXXABI and will try do my best to improve CodeGen for
OpenMP, thanks.

It would be better for this to not be templated and take a std::function.

Will do.

Best regards,

Alexey Bataev

Software Engineer
Intel Compiler Team

17.09.2014 12:36, John McCall пишет:

ABataev updated this revision to Diff 13821.Sep 18 2014, 12:17 AM

Rework after review.

hfinkel accepted this revision.Oct 8 2014, 7:40 AM
hfinkel edited edge metadata.

LGTM, thanks!

include/clang/AST/StmtOpenMP.h
132 ↗(On Diff #13821)

Make sure this gets the improved comment text from D5145.

lib/CodeGen/CGOpenMPRuntime.cpp
141 ↗(On Diff #13821)

Why not use Loc as the source location?

144 ↗(On Diff #13821)

And here too.

I'm under the impression that it is better to have a bunch of code pointing to the same source location than to have code generated with no source locations.

264 ↗(On Diff #13821)

Capitalize O in otherwise. -- Move to previous line.

268 ↗(On Diff #13821)

Remove empty comment line.

This revision is now accepted and ready to land.Oct 8 2014, 7:40 AM
rjmccall edited edge metadata.Oct 8 2014, 6:38 PM

With these changes (and Hal's), looks good to me.

include/clang/AST/StmtOpenMP.h
132 ↗(On Diff #13821)

This comment should clarify that it returns nil if there isn't a clause of that kind. It only asserts that there aren't multiple clauses of that kind.

lib/CodeGen/CGOpenMPRuntime.cpp
134 ↗(On Diff #13821)

It'd be nice to leave a comment here like "Check whether we've already cached a load of the thread id in this function."

145 ↗(On Diff #13821)

Is there a reason we can't safely always load in the entry block?

149 ↗(On Diff #13821)

This comment doesn't seem be accurate. You're actually generating a call.

161 ↗(On Diff #13821)

Would it make more sense for the CodeGenFunction to just have a lazily-allocated cache of arbitrary information associated with the OpenMPRuntime, rather than having tons of extra side-tables?

241 ↗(On Diff #13821)

Please leave a blank line before the emission of each call so that these don't all run together.

<- That is, put one here. (Before the comment line.)

246 ↗(On Diff #13821)

<- And here.

254 ↗(On Diff #13821)

<- And here.

lib/CodeGen/CGOpenMPRuntime.h
37 ↗(On Diff #13821)

We don't necessarily need to do this right now, but eventually I think it makes sense to abstract this type so that it's private to the target runtime.

160 ↗(On Diff #13821)

You don't need to comment CGF parameters; they're everywhere, and their purpose is obvious.

161 ↗(On Diff #13821)

This kind of comment doesn't add anything that's not expressed in the type of the parameter.

A SourceLocation parameter to an Emit function has a conventional meaning in CodeGen, which is "here's a location to associate with this operation, for debug locations or whatever else". You don't need to comment such parameters unless the SourceLocation's going to be used for some other purpose.

192 ↗(On Diff #13821)

Please describe the expected type of the value. It's something like void(*)(kmp_int32, struct context_vars*), right?

lib/CodeGen/CGStmtOpenMP.cpp
26 ↗(On Diff #13821)

This deserves a comment, if just to explain what the parameter to the CodeGen callback means.

Also, please take the std::function as a const &.

Hal, John, thanks for the review. I'll split this patch in two parts: one with some improvements and one with actual codegen for IfClause.

include/clang/AST/StmtOpenMP.h
132 ↗(On Diff #13821)

Ok, I replaced the comment and added additional comment that it returns nullptr if the directive does not have associated clause of the specified kind at all.

lib/CodeGen/CGOpenMPRuntime.cpp
134 ↗(On Diff #13821)

I'll add one.

141 ↗(On Diff #13821)

Ok, I'll fix it.

144 ↗(On Diff #13821)

Yes, will be done.

145 ↗(On Diff #13821)

Yes, actually this ThreadIdVar may be a function parameter. CodeGenFunction generates additional alloca for parameters and stores param to this generated new alloca. If currently we're not in EntryBlock I can't define where to put instructions with value loading, because I don't know where this store instruction. Loads must be dominated by this store, but I just can't guarantee it.

149 ↗(On Diff #13821)

Agree, will be fixed.

161 ↗(On Diff #13821)

Agree, I'll rework it for OpenMPLocMap and OpenMPThreadIDMap.

241 ↗(On Diff #13821)

Ok

246 ↗(On Diff #13821)

Ok

254 ↗(On Diff #13821)

Ok

264 ↗(On Diff #13821)

Ok, I'll do.

268 ↗(On Diff #13821)

Ok

lib/CodeGen/CGOpenMPRuntime.h
37 ↗(On Diff #13821)

Ok, I'll move to CGOpenMPRuntime.cpp in a separate patch.

160 ↗(On Diff #13821)

Ok

161 ↗(On Diff #13821)

Got it

192 ↗(On Diff #13821)

Almost. I'll add proper description of this function. Actually it returns void(*)(kmp_int32, kmp_int32, struct context_vars*)

lib/CodeGen/CGStmtOpenMP.cpp
26 ↗(On Diff #13821)

Ok, will do.

rjmccall added inline comments.Oct 10 2014, 12:03 PM
lib/CodeGen/CGOpenMPRuntime.cpp
145 ↗(On Diff #13821)

Makes sense. Optimized builds will completely eliminate this as a concern, I think.

ABataev closed this revision.Oct 12 2014, 11:13 PM
ABataev updated this revision to Diff 14784.

Closed by commit rL219597 (authored by @ABataev).