This is an archive of the discontinued LLVM Phabricator instance.

[CodeExtractor] Optionally keep code in original function.
Needs ReviewPublic

Authored by Meinersbur on Dec 6 2021, 9:12 PM.

Details

Summary

When extracting a region into a new function, optionally allow cloning basic blocks and instructions into the extracted function instead of moving them. The keeps the original code in the original function such that they can still be referenced -- and branched to -- in the original function.

The motivation is the use of CodeExtractor in the OpenMPIRBuilder. The implementation of createParallel first emits the parallel region into the lexical function, then uses CodeExtractor to outline that region into a new function. The problem here is that Clang's code generator will references some basic blocks for code inside as well as outside the region. This includes some special purpose block (EHResumeBlock, TerminateLandingPad, TerminateHandler, UnreachableBlock, ...) and cleanup/dtor code that is re-used from multiple scopes (see test case extract-block-cleanup.ll). Moving these blocks into a different function will result in malformed IR. The KeepOldBlocks option will instead clone the outlined code into a new function keeping the auxiliary code intact, relying on later DCE to remove code that indeed has become unreachable. Additionally, this code could also be uses as a fallback when threading/offloading is disabled via environment option.

Use of KeepOldBlocks by OpenMPIRBuilder is not part of this patch. For testing, we extend llvm-extract allowing the use of this option and thus making it more powerful.

Diff Detail

Event Timeline

Meinersbur created this revision.Dec 6 2021, 9:12 PM
Meinersbur requested review of this revision.Dec 6 2021, 9:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 6 2021, 9:12 PM
Herald added a subscriber: sstefan1. · View Herald Transcript
ormris removed a subscriber: ormris.Jan 24 2022, 11:12 AM

While this functionality is probably useful by itself, I am confused we need it in the OpenMP-IR-Builder.

In my mind that use case looked something like:
When createParallel starts we push new special blocks (EHResumeBlock, ...) onto the stack.
All cleanup uses the associated special blocks while body generation happens.
Once the region is done we revert back to the special blocks that were used before.

Why doesn't this work? I thought it already did.

In my mind that use case looked something like:
When createParallel starts we push new special blocks (EHResumeBlock, ...) onto the stack.
All cleanup uses the associated special blocks while body generation happens.
Once the region is done we revert back to the special blocks that were used before.

Why doesn't this work? I thought it already did.

Unfortunately is it a lot more complicated. There is no limited list of blocks that might be shared between outlined and outer function, the special blocks in CodeGenFunction (EHResumeBlock, TerminateLandingPad, TerminateHandler, UnreachableBlock) are just the most obvious ones. For instance, cleanup code is emitted only once and looks like this:

bb1:
  store i32 1, %c
  br label %common_cleanup

bb2:
  store i32 2, %c
  br label %common_cleanup

common_cleanup:
  [cleanup code]
  switch i32 %c, [
    [i32 1, label %bb1_continue],
    [i32 2, label %bb2_continue]
  }

bb1_continue:
  [...]

bb2_continue:
  [...]

And for every destruction a new switch case is added. Similar things happen with exceptions e.g. landing pads. What's worst, I have no exhaustive list of circumstances when sharing of code may happen and there is also the risk that the switched-out special block are stored in some other data structure and re-used after having left the region. I am not saying these cannot be fixed by switching-out some fields during the outlined region (there's also TerminateFunclets, ExceptionSlot, EHSelectorSlot, SEHCodeSlotStack, SEHInfo, ObjCEHValueStack, CurFuncIsThunk, ...), up to the entire CodeGenFunction but I vastly prefer a straightforward solution over making Clang's internals even more complicated. It's also one less concern for potential other front-ends than Clang/Flang.

That being said, a clean exception handling would indeed create a new handling context; currently exception handling inside the outlined region will also branch to the cleanup code in the outer function using the dtor pattern above. Actually triggering the exception at runtime is not well-formed in OpenMP, but the compiler should not crash crash because the cleanup code has been moved away. That is, copying the code instead of moving it will already allow us to compile C++ code without -fno-exceptions.

Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2022, 3:33 PM

Thanks @Meinersbur for this patch. Apologies for the long wait. Just starting to go over it.

The summary is quite helpful to understand the need for this extension to the extractor from the Clang side. I guess this is currently not required for Flang/MLIR? Or is it needed to handle the cancellation constructs?

I think the KeepFunctions and KeepBlocks flags are useful to ensure that there are no changes to existing users. But I would also like to check with you whether keeping the functions and blocks are necessary for correctness when there are branches to the blocks other than the first one. If so would it be better to support it without flags? Also, do we have asserts presently to ensure that incorrect code is not generated in the presence of these branches when the flags are not specified?

llvm/include/llvm/Transforms/Utils/Cloning.h
117

Nit: Not about this patch.
DebugInfoFinder is missing from these comments.

120

Nit: There seems to be a style here for mentioning the parameter as the nth (7th here).

llvm/tools/llvm-extract/llvm-extract.cpp
87

Nit: Unlsess -> Unless

Meinersbur marked 2 inline comments as done.
  • Address review

The summary is quite helpful to understand the need for this extension to the extractor from the Clang side. I guess this is currently not required for Flang/MLIR? Or is it needed to handle the cancellation constructs?

As far as I know it it currently not necessary the OpenMP dialect lowing. Neither does MLIR currently support exceptions and even fir-dev does not support final procedures yet. However, support basic block sharing for code size reduction or other purposes might be required in the future.

But I would also like to check with you whether keeping the functions and blocks are necessary for correctness when there are branches to the blocks other than the first one. If so would it be better to support it without flags? Also, do we have asserts presently to ensure that incorrect code is not generated in the presence of these branches when the flags are not specified?

This patch only introduces using KeepOldBlocks for llvm-extract. I did not want to change the default behaviour for the tool, which is throwing away the parent function. Hence, if there are any branches to moved block are left, those branch instructions are thrown away with its parent function. The "create a call to extracted function" functionality offered by CodeExtractor is unused. Using --bb-keep-functions without --bb-keep-blocks may result in invalid IR if the --bb argument does not specify a proper region. This will be caught by the IR verifier in assert builds at CodeExtractor.cpp:1445. Are you suggesting to add the verifier for non-assert builds as well?

OpenMPIRBuilder will always use KeepOldBlocks mode.

llvm/include/llvm/Transforms/Utils/Cloning.h
117

CloneBasicBlock with InstSelect parameter is used in this patch.

The parameter DIFinder was added in D33655 and unrelated to this patch.

Introduce --replace-with call to replace --bb-keep-functions and --bb-keep-blocks.
Makes the interface easier and --bb-keep-functions alone can result in invalid IR.

clementval resigned from this revision.Jul 13 2023, 1:44 PM