Page MenuHomePhabricator

[OpenMP] Use the OpenMP-IR-Builder
ClosedPublic

Authored by jdoerfert on Nov 6 2019, 2:18 PM.

Details

Summary

This is the initial patch to use the OpenMP-IR-Builder, as discussed on
the mailing list ([1] and later) and at the US Dev Meeting'19.

The design is similar to D61953 but:

  • placed in llvm/lib/IR/ next to IRBuilder, for lack of a better location.
  • in a non-WIP status, with proper documentation and working.
  • using a OpenMPKinds.def file to manage lists of directives, runtime functions, types, ..., similar to the current Clang implementation.
  • restricted to handle only (simple) barriers, to implement most #pragma omp barrier directives and most implicit barriers.
  • properly hooked into Clang to be used if possible.
  • compatible with the remaining code generation.

Parts have been extracted into D69853 and D69785.

The plan is to have multiple people working on moving logic from Clang
here once the initial scaffolding landed.

[1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html

Diff Detail

Event Timeline

jdoerfert created this revision.Nov 6 2019, 2:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 6 2019, 2:18 PM
ABataev added inline comments.Nov 7 2019, 6:31 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
3493

createBarrier

clang/test/OpenMP/barrier_codegen.cpp
47

Remove #2, it may be changed

jdoerfert marked 2 inline comments as done.Nov 7 2019, 7:24 AM
jdoerfert added inline comments.
clang/lib/CodeGen/CGOpenMPRuntime.cpp
3493

I'd say we align ourselves with the IRBuilder (which has CreateXXX methods) or we are different enough to avoid confusion, e.g., emitXXXX. I don't care much which way but createXXXX for one builder and CreateXXXX for another will be confusing, don't you think?

clang/test/OpenMP/barrier_codegen.cpp
47

Agreed, I even added an option to the update_test script to do so but forgot to use it (D68851).

ABataev added inline comments.Nov 7 2019, 7:34 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
3493

Shall follow the LLVM coding document and format the new function name in accordance with it?

jdoerfert marked an inline comment as done.Nov 7 2019, 8:28 AM
jdoerfert added inline comments.
clang/lib/CodeGen/CGOpenMPRuntime.cpp
3493

Changing the IRBuilder is something no one dared to do for years. I'm actually in favor but I would like to get this in rather sooner than later.

Two options I think are good:

  • Use CreateXXX now and, if someone ever comes around to rename the Builder methods we rename them in all Builders.
  • Use emitXXXX now to avoid confusion.

When someone in a review before asked for create instead of emit the reason was to be aligned with the IRBuilder. I think that is a good justification so I went with CreateXXX even if it is against the coding standard.

jdoerfert updated this revision to Diff 228264.Nov 7 2019, 9:33 AM

Make the command line option shorter (mentioning openmp once should suffice)

ABataev added inline comments.Nov 7 2019, 9:46 AM
clang/include/clang/Driver/Options.td
1659–1660

Do we need to expose this option to driver or it is enough to have just a frontend option? If still need to have a driver option, add a test for driver.

jdoerfert updated this revision to Diff 228270.Nov 7 2019, 9:50 AM

Update test attributes

jdoerfert marked an inline comment as done.Nov 7 2019, 9:50 AM
jdoerfert marked an inline comment as done.Nov 7 2019, 10:00 AM
jdoerfert added inline comments.
clang/include/clang/Driver/Options.td
1659–1660

How do I write a frontend option? Anything that we can query in the lib/CodeGen is fine with me. (I don't even need an option if we turn this on by default to get test coverage right away).

ABataev added inline comments.Nov 7 2019, 10:04 AM
clang/include/clang/Driver/Options.td
1659–1660

You nedd to move to CC1Options.td file. It means that you can't use it direcly when invoke the drive, instead you will need to use -Xclang -fopenmp-...

jdoerfert updated this revision to Diff 228281.Nov 7 2019, 11:18 AM

Add driver test coverage

jdoerfert marked 2 inline comments as done.Nov 7 2019, 11:19 AM
ABataev added inline comments.Nov 7 2019, 11:33 AM
clang/test/Driver/fopenmp.c
130–131

CHECK...-SAME?

clang/test/OpenMP/barrier_codegen.cpp
46

Not sure about correctness of inaccessiblemem_or_argmemonly attribute applied to this function. Are you sure about this? This is maybe true for the NVPTX/AMDGCN runtimes but not the generic version of libomp.

jdoerfert marked 4 inline comments as done.Nov 7 2019, 12:35 PM
jdoerfert added inline comments.
clang/test/Driver/fopenmp.c
130–131

Sure, will do.

clang/test/OpenMP/barrier_codegen.cpp
46

Fair point. I'll remove it for now but we should come up with a scheme. Maybe communicating the target library and other information to the IRBuilder so we can select based on those.

jdoerfert updated this revision to Diff 228331.Nov 7 2019, 3:44 PM
jdoerfert marked 2 inline comments as done.

Adjust tests as requested

Herald added a project: Restricted Project. · View Herald TranscriptNov 7 2019, 3:44 PM
jdoerfert marked 2 inline comments as done.Nov 7 2019, 3:47 PM
ABataev added inline comments.Nov 8 2019, 7:25 AM
clang/test/OpenMP/barrier_codegen.cpp
26

Not sure about correct use of nosync and readonly attributes. OpenMP runtime uses lazy initialization of the runtime library and when any runtime function is called, the inner parts of the OpenMP runtime are initialized automatically. It may use some sync primitives and may modify memory, I assume. Same about nofree.

jdoerfert marked an inline comment as done.Nov 8 2019, 10:14 AM
jdoerfert added inline comments.
clang/test/OpenMP/barrier_codegen.cpp
26

There are two versions of these functions, host and device. I assume host functions are not inlined but device functions might be. This is basically all the modes we support right now.

If we do not inline the function (host) we don't necessarily care what they do but what effect the user can expect.
The user can not expect to synchronize through __kmpc_global_thread_num calls in a defined way, thus nosync.
Similarly, from the users perspective there is no way to determine if something was written or freed, no matter how many of these calls I issue and under which control conditions, all I see is the number as a result. Thus, readonly and nofree. I believe readnone is even fine here but it might not work for the device version (see below) so I removed it.

If we do inline the function (device) we need to make sure the attributes are compatible with the inlined code to not cause UB. The code of __kmpc_global_thread_num at least does not write anything and only reads stuff (as far as I can tell).

Please correct me if I overlooked something.

ABataev added inline comments.Nov 8 2019, 10:25 AM
clang/test/OpenMP/barrier_codegen.cpp
26

But someone may try to inline the host-based runtime or try to use LTO with it.
The question is not about the user expectations but about the optimizations which can be triggered with these attributes.

jdoerfert marked an inline comment as done.Nov 8 2019, 12:11 PM
jdoerfert added inline comments.
clang/test/OpenMP/barrier_codegen.cpp
26

This is our runtime and we have supported and unsupported usage models.

ABataev added inline comments.Nov 8 2019, 12:23 PM
clang/test/OpenMP/barrier_codegen.cpp
26

Hmm, I don't think this the right approach. Plus, you still did not answer about optimizations. Maybe, currently, these attributes won't trigger dangerous optimizations but they can do this in the future and it may lead to unpredictable results. I would use the pessimistic model here rather than over-optimistic.

jdoerfert marked an inline comment as done.EditedNov 8 2019, 1:07 PM
This comment has been deleted.
clang/test/OpenMP/barrier_codegen.cpp
26

I did (try to) describe why there cannot be any problems wrt. optimizations:
The specified behavior of the runtime call is _as if_ it is readonly, nofree, and nosync.
That is, from the perspective of the compiler this is true and optimizations are allowed to use that fact.

The fact that the first ever runtime call initializes the runtime is neither part of the specification nor of the observable behavior. If we change the order between two call to __kmpc_global_thread_num, or similar calls, we cannot observe if/which one initialized the runtime and which read only stuff.

ABataev added inline comments.Nov 8 2019, 1:25 PM
clang/test/OpenMP/barrier_codegen.cpp
26

Here is the code of this function from the libomp:

  int gtid;

  if (!__kmp_init_serial) {
    gtid = KMP_GTID_DNE;
  } else
#ifdef KMP_TDATA_GTID
      if (TCR_4(__kmp_gtid_mode) >= 3) {
    KA_TRACE(1000, ("*** __kmp_get_global_thread_id_reg: using TDATA\n"));
    gtid = __kmp_gtid;
  } else
#endif
      if (TCR_4(__kmp_gtid_mode) >= 2) {
    KA_TRACE(1000, ("*** __kmp_get_global_thread_id_reg: using keyed TLS\n"));
    gtid = __kmp_gtid_get_specific();
  } else {
    KA_TRACE(1000,
             ("*** __kmp_get_global_thread_id_reg: using internal alg.\n"));
    gtid = __kmp_get_global_thread_id();
  }

  /* we must be a new uber master sibling thread */
  if (gtid == KMP_GTID_DNE) {
    KA_TRACE(10,
             ("__kmp_get_global_thread_id_reg: Encountered new root thread. "
              "Registering a new gtid.\n"));
    __kmp_acquire_bootstrap_lock(&__kmp_initz_lock);
    if (!__kmp_init_serial) {
      __kmp_do_serial_initialize();
      gtid = __kmp_gtid_get_specific();
    } else {
      gtid = __kmp_register_root(FALSE);
    }
    __kmp_release_bootstrap_lock(&__kmp_initz_lock);
    /*__kmp_printf( "+++ %d\n", gtid ); */ /* GROO */
  }

  KMP_DEBUG_ASSERT(gtid >= 0);

  return gtid;

As you can see, it is very complex. It has locks, in the debug mode it writes something, etc. It is not only about the initialization, there are some other calls (like __kmp_get_global_thread_id(), which really writes something). Do you really think it is safe to mark this as readonly and nosync?

jdoerfert marked an inline comment as done.Nov 8 2019, 1:32 PM
jdoerfert added inline comments.
clang/test/OpenMP/barrier_codegen.cpp
26
Do you really think it is safe to mark this as readonly and nosync?

Yes. I am arguing why in the last few comments.

The central question is not what the function does but how it behaves (as long as we do not inline it, then what it does leaks).

ABataev added inline comments.Nov 8 2019, 2:01 PM
clang/test/OpenMP/barrier_codegen.cpp
26

I still don't think the right way to do it. Instead I would suggest to think about platform dependent attributes. Say, in general we use some pessimistic set of attributes but for some platforms (like NVPTX or AMDGCN) we can define more optimistic set of attributes.

clang/test/OpenMP/barrier_codegen.cpp
26

Properties that are true globally, in the as-if sense, but violated locally, if inlined, are a seriously complicated proposal. I don't want to block this patch on a satisfactory resolution to that so suggest we go with conservative.

jdoerfert updated this revision to Diff 228772.Nov 11 2019, 2:47 PM

update test case

This revision is now accepted and ready to land.Nov 11 2019, 3:03 PM
jdoerfert updated this revision to Diff 228801.Nov 11 2019, 7:45 PM

Use IRBuilder for cancel barriers

Use IRBuilder for cancel barriers

In general, it would better to implement cancellation barriers in the separate patch. Same for OpenMPIrBuilder.

Adjust paths and test case options

ABataev added inline comments.Thu, Nov 14, 10:51 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
3493

Also, you need to introduce/use RAII that saves/restores insertion point automatically.

3496

Maybe, instead of saving the state, pass the pointer to the cancel block as a parameter to the CreateBarrier function?

Thanks @jdoerfert for working on this.

Sorry for being late to the party. I am trying to implement one trivial directive (Flush) and one slightly more involved (not decided).

I applied D69853, D69785, D69922 to my local build and found that D69922 is referring to OpenMPIRBuilder.h in llvm/Frontend whereas in D69785 it was introduced in llvm/Transforms/Utils/OpenMPIRBuilder.h

clang/lib/CodeGen/CGOpenMPRuntime.cpp
32

D69785 has this file in llvm/Transforms/Utils/OpenMPIRBuilder.h

clang/lib/CodeGen/CodeGenModule.cpp
56

D69785 has this file in llvm/Transforms/Utils/OpenMPIRBuilder.h

jdoerfert marked 3 inline comments as done.Thu, Nov 28, 10:35 AM

Use IRBuilder for cancel barriers

In general, it would better to implement cancellation barriers in the separate patch. Same for OpenMPIrBuilder.

Why would you think that splitting these closely related concepts would be beneficial?

Thanks @jdoerfert for working on this.

Sorry for being late to the party. I am trying to implement one trivial directive (Flush) and one slightly more involved (not decided).

I applied D69853, D69785, D69922 to my local build and found that D69922 is referring to OpenMPIRBuilder.h in llvm/Frontend whereas in D69785 it was introduced in llvm/Transforms/Utils/OpenMPIRBuilder.h

The conversation (on the llvm-dev list) seems to have settled on {include/llvm,lib}/frontend/OpenMP/. I will move the files around once we finally settle these reviews that haven't moved (more than 2 comments at a time).
As you can see, I have a huge backlog here and that is really hard to modify. You have a problem now as you depend on these. @hfinkel mentioned it in his email to the llvm-dev (about review etiquette), we need to
minimize iterations in reviews and actually review patches not only a few lines every time.

clang/lib/CodeGen/CGOpenMPRuntime.cpp
3496

All of this goes away in D70258, as it should. The frontend when creating a barrier does not need to know if it is cancelable and if so, where the cancel block is.

jdoerfert updated this revision to Diff 233448.Wed, Dec 11, 2:13 PM
jdoerfert marked 3 inline comments as done.

rebase

This revision was automatically updated to reflect the committed changes.

I applied D69853, D69785, D69922 to my local build and found that D69922 is referring to OpenMPIRBuilder.h in llvm/Frontend whereas in D69785 it was introduced in llvm/Transforms/Utils/OpenMPIRBuilder.h

I merged the first few patches. There are some I'm still waiting for the green light but the location question is now settled.

clang/include/clang/Driver/Options.td
1659–1660

I think people will need to use it in the short term, having it here seems therefor better.