Page MenuHomePhabricator

[OpenMP IRBuilder, MLIR] Add support for OpenMP do schedule dynamic
Needs ReviewPublic

Authored by Leporacanthicus on Feb 24 2021, 8:18 AM.

Details

Summary

The implementation supports static schedule for Fortran do loops. This
implements the dynamic variant of the same concept.

Diff Detail

Event Timeline

Leporacanthicus requested review of this revision.Feb 24 2021, 8:18 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 24 2021, 8:18 AM

Please note that this is not intended as the final commit, but rather a basis for asking some advice on how to move forward.

The main stumbling point, which may be my lack of understanding of what it's supposed to do: the CanonicalLoopInfo assumes that the the cond block has a CmpInst as the first instruction. In the dynamic, the corresponding block [in my understanding] starts with a call instruction to fetch the "next" set of data to process. This causes the assertOK to fail, hence it is commented out on line 1300 in the patch.

SouraVX added a project: Restricted Project.Feb 24 2021, 8:29 AM

Could you please add a Unit test for this(as we don't have clang, flang, or MLIR interfacing for dynamic workshare loops).

kiranchandramohan retitled this revision from Add support for OpenMP do schedule dynamic to [OpenMP IRBuilder] Add support for OpenMP do schedule dynamic.Feb 24 2021, 9:05 AM
jdoerfert added inline comments.Feb 24 2021, 9:26 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1193

I think we sue doxygen comments /// for static functions as well.

1311

I don't know what this does but it looks brittle. We should have handles for all values, e.g., "the old condition", which we can remove is necessary by following the handle and the operands (with a single use).

kiranchandramohan retitled this revision from [OpenMP IRBuilder] Add support for OpenMP do schedule dynamic to [OpenMP IRBuilder, MLIR] Add support for OpenMP do schedule dynamic.Feb 24 2021, 9:37 AM
Meinersbur added inline comments.Feb 24 2021, 10:27 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1226–1227

The InsertionPointTy of Loc is not used with methods that apply to a canonical loop. Consider passing only the DebugLoc. The same seem to apply to createStaticWorkshareLoop.

1238–1257

Since these are shared with createStaticWorkshareLoop, did you consider extractin it into a common function?

1273–1274

Usually the name of the parameter comes before the argument

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1193

No problem, I will update - although the other functions in this file aren't using that style (see line 1054-1057) - should I fix all of them (in a separate commit, I expect?) or just the two new functions?

1311

Agreed. There needs to be a better way to do this... ;)

What it "does" is: It removes the "instructions after the ones I added" . What I would like to do is actually replace the existing cond with a new one[1] (and assuming nothing else uses that original block, remove it). There's no good way to do that with the current interface on CanonicalLoopInfo, so I did this to at least get me to a BasicBlock that doesn't get rejected for having two different terminations (is that the right term?).

[1] That's what I think I want to do. My understanding, and this is mainly based on what clang++ generates for "the same" C++ code to the Fortran code I'm experimenting with here, is that my cond BB should contain a call to the kmpc_dispatch_next and check the return value from that, and either exit or keep going - I may be completely wrong in this understanding. The original code, that I copied from the static variant is doing a compare to the full loop count, but in dynamic the work is not necessarily finished in order, happy to be corrected on these things, I'm new to both Fortran and OpenMP,

The main stumbling point, which may be my lack of understanding of what it's supposed to do: the CanonicalLoopInfo assumes that the the cond block has a CmpInst as the first instruction. In the dynamic, the corresponding block [in my understanding] starts with a call instruction to fetch the "next" set of data to process. This causes the assertOK to fail, hence it is commented out on line 1300 in the patch.

After applying worksharing-loop, the result is not a canonical loop nest anymore (one reason that the number of iterations is not known in advance). Hence, the CanonicalLoopInfo structure does not need to be preserved[*].

(*) I am working on making the chunk-loop accessible for loop transformations in OpenMP 6.0, which would require representing it as a CanonicalLoopInfo.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1259–1260

With a Chunk-size that is not one, you need at least two loops: One for all iterations of a chunk and another surrounding while-loop that ask the runtime for the next chunk.

This two-loop structure should also apply to the static schedule if the chunk is set and is greater than one. It don't see this in createStaticWorkshareLoop, it might be broken for these cases.

1279

With dynamic schedule, the trip count is not known it advance.

Updated files based on selected review commets.
Fixed a few issues with the code-generation.
Added a basic test (similar to static workshare loop.

Meinersbur added inline comments.Mar 11 2021, 2:43 PM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1233–1234

The InsertionPoint part is ignored anyway. It would make more sense to only pass the DebugLoc, like tileLoops.

1269–1272

For schedule(dynamic), the default chunk size is indeed one.

1276–1277

The could just be added to OMPConstants.h, not necessary to involve tablegen.

kmp_sched in kmp.h is already redundant with omp_sched from omp.h. To not introduce dependency issues atm, I suggest to just reproduce it them OMPConstants.h with a comment to remarks they have to keep the in sync.

However, I found that libomptarget/plugins/amdgpu/src/rtl.cpp already includes from libLLVMFrontend.

1305

[style] LLVM's coding style does not use "Almost Always Auto"

Consider using getHeder()->front()

1306–1310

cast already contains an assertion if the argument is not the casted-to type.

1321–1322

Comment is outdated? No CanonicalLoopInfo needs to be preserved.

1358–1360

Consider extracting all the info you need from CanonicalLoopInfo at the beginning and then abandoning the structure, since starting from the first CFG modification, it does not describe a canonical loop anymore but methods such as getAfterIP() may assume so.

Please note that this is not intended as the final commit, but rather a basis for asking some advice on how to move forward.

The main stumbling point, which may be my lack of understanding of what it's supposed to do: the CanonicalLoopInfo assumes that the the cond block has a CmpInst as the first instruction. In the dynamic, the corresponding block [in my understanding] starts with a call instruction to fetch the "next" set of data to process. This causes the assertOK to fail, hence it is commented out on line 1300 in the patch.

This is no longer a "not intented as final commit". I'm not saying it's perfect, but review comments would be useful.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1226–1227

It is used (but not really NECESSARY to use!) for to get the debug location in getOrCreateSrcLocStr a couple of lines down, which means more changes - I will look at doing this as a separate change, once I have something that works.

1238–1257

Yes, definitely on my mind. I was just concentrating on "get something that works first, then refactor" - otherwise, I find myself refactoring, and then reverting half of that, because it was the "wrong thing"... ;)

Just a quick comment. I will review in detail later.

This patch connects MLIR lowering to LLVM IR using the OpenMPIRBuilder. We should add a test for that flow or remove that connection in this patch.

Once the pretty-printer and parser patch (https://reviews.llvm.org/D92327) lands, it will become easier to write tests.

Meinersbur added inline comments.Mon, Mar 15, 10:00 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1226–1227

getOrCreateSrcLocStr uses the InsertionPoint only as backup for the function name. Since it returns a default location when DebugLoc is not available, I don't see in what circumstances it would ever be used. In any case, there should be a version of getOrCreateSrcLocStr that only needs a DebugLoc and possibly a llvm::Function which can be obtained from the CanonicalLoopInfo stored BasicBlocks.

I am OK with doing it with a separate change, I was considering myself already.

1238–1257

This is fine for your personal workflow, but for committing a patch we should aim for clean source code in main. For instance, I often use auto locally but to comply to the LLVM coding standard, I have to replace most of them with the actual type.

Leporacanthicus marked 7 inline comments as done.Mon, Mar 15, 1:17 PM

I'm going to upload the fixes once they compile, but it's getting late, so probably not until tomorrow morning.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1259–1260

It appears, from what I can tell, we ALWAYS get one in the chunk-size. I will try to fix this in a bit. I have hand-coded different values in the LLVM-IR to test that the tests I have [written in Fortran] to check that it behaves correctly. Proper tests written in either MLIR or Fortran would be needed for this in the future, and tests that check for example chunk size arrives to this section correctly.

1276–1277

I have moved a minimal set of constants.

1321–1322

Yes, and since we're adding zero to IV, it's not much point in updating it, so I removed the whole block below too - as far as I can tell, it produces the same result.

Leporacanthicus marked 2 inline comments as done.

Fixed various review comments.

Meinersbur added inline comments.Thu, Mar 18, 8:23 AM
llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
109–121

IMHO we should copy&paste the kmp_schedule enum here entirely, with a comment that it must be kept in sync.

Eventually kmp.h should include OMPConstants.h and use the declaration here instead of declaring its own.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1259–1260

Consider adding a test to OpenMPIRBuilderTest.cpp that inserts a non-one chunk size, or even a non-constant chunk-size.

1316

Typo?

Leporacanthicus marked 2 inline comments as done.

Fixes as per review comments

llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
109–121

I have added a comment to say that this needs to match kmp.h

I did look at copying the whole set of enum values, but I think that's better done at a later stage - I don't feel I understand exactly how these things are being used, and that would be key to how they get organized in a move. [Yes, could just copy the whole thing, but I'm not convinced that is the BEST choice - there appears to be a secondary hierarchy in there].

I will revisit this in a future patch, after I have spent some time understanding all the uses of these enum values.

Add testing for chunk size in dynamic work sharing loop test.

Also removed a superfluous assignment in the production code.

Meinersbur added inline comments.Wed, Mar 31, 1:12 PM
llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
109–121

Sounds reasonable.

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
377

[suggestion] Add a newline between the description and the beginning of params. Not sure whether it makes a difference for doxygen, but clang-format likes to redo paragraph line wrapping and would include \param into the paragraph.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1289

To avoid creating temporary strings.

1316

"Rejig" is not a typo? Wiktionary knows it, but UK-only.

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
1734

I assumed you would add a second test when chunk is non-default. However, the default chunk size of 1 is not special, so I am find with this.

1741–1744

[nit] Can use auto * when using dyn_cast on the same line. At least, please use consistent style in the same function,

1780–1782

You could just use cast instead of dyn_cast which asserts if it is the wrong type, so you do not need explicitly check for nullptr.

Leporacanthicus marked 10 inline comments as done.

Updates requested in review:

  • Avoid making temporary string
  • Avoid British English in comment
  • Be consistent and not mixing auto/named type in new tests
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1316

Changing to "modify" to be more universal. Living in the UK, I'm not always aware of what English words are "UK" only and which are "any English variety".

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
1734

I am in the process of adding fortran-compiler tests that do dynamic with no chunk size and static with a given chunk size, to cover more - but won't go into llvm main for a while, I guess.

1741–1744

The whole file is inconsistent in this respect. First dyn_cast in the file is to a named type (line 156), the second one is to an auto (line 161), but then it is MOSTLY named types, with a scattering of auto in the original static code. I'm changing the ones below to use non-auto, to match the rest of the file.

1780–1782

Again, this matches the rest of the file - there is a small number of cast<X>(y), but they are all preceded by an ASSERT_TRUE(isa<X>(y)), which is about the same as this construct - the isa<X> check may explain more clearly what's wrong. but I doubt anyone working on LLVM for more than a few days will struggle to understand why a dyn_cast returned a nullptr. I think the assert you get inside the cast<X> is less clear - if nothing else, in the sense that you probably can't set a breakpoint and then inspect what happened in a good way.

Could you add the following to the unittest?

Builder.restoreIP(EndIP);
Builder.CreateRetVoid();
OMPBuilder.finalize();
EXPECT_FALSE(verifyModule(*M, &errs()));

I checks whether the IR is internally consistent.

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
1736–1737

Do not use CLI after it has been invalidated.

1741–1744

Reviews of previous patches were not necessirily perfect ;-(

For reference, the generation control flow:

Leporacanthicus marked an inline comment as done.

Updates as per review comments:

  • Fetch from CLI before it is destroyed.
  • Add return to terminate block and then verifyModule

Could you add the following to the unittest?

Builder.restoreIP(EndIP);
Builder.CreateRetVoid();
OMPBuilder.finalize();
EXPECT_FALSE(verifyModule(*M, &errs()));

I checks whether the IR is internally consistent.

Good spot. Done!

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
1741–1744

Indeed. Problem comes when you copy-paste code from previously "not perfect reviews". I've been on the other end of this too. ;)