This is an archive of the discontinued LLVM Phabricator instance.

[flang][driver] Add support for -embed-offload-object flag in flang -fc1
ClosedPublic

Authored by jsjodin on Jan 20 2023, 11:56 AM.

Details

Summary

This patch adds support for the -embed-offload-object flag to embed offloading binaries in host code. This flag is identical to the flag with the same name in clang.

Diff Detail

Event Timeline

jsjodin created this revision.Jan 20 2023, 11:56 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: sunshaoce. · View Herald Transcript
jsjodin requested review of this revision.Jan 20 2023, 11:56 AM

Thanks for sending this, @jsjodin ! Overall, this makes sense. But I have no experience with offloading and I would appreciate some RFC that would explain what the overall direction. What are the next steps? And what's the overall goal?

Also, could add a note in the summary that would explain whether the semantics of this option in Flang are identical to Clang?

flang/lib/Frontend/FrontendActions.cpp
740–741

Could you add a test that will exercise this diagnostic?

jsjodin updated this revision to Diff 491497.Jan 23 2023, 1:27 PM

Added negative test and early return when error occurrs.

Thanks for sending this, @jsjodin ! Overall, this makes sense. But I have no experience with offloading and I would appreciate some RFC that would explain what the overall direction. What are the next steps? And what's the overall goal?

I can give a short description. The over all goal is to support OpenMP in flang including target offloading. The direction is basically to implement the missing features in flang that are implied from the driver, which is shared with clang. Minimizing the difference between flang and clang is desirable to be able to share code as much as possible. For code generation, some work has already been done to move OpenMP code from clang to llvm for sharing implementations. For the flang driver the main work will be to implement the various flags for the commands that are being generated, and add support for the GPU targets. For example, this patch implements -embed-offload-object, which is an option that can already be generated by the flang driver if the -fopenmp and--offload-arch flags are provided (assuming there is a gpu target available, which will have to be added soon).
There is a presentation on the entire mechanism how the new offloading driver works here:
https://www.youtube.com/watch?v=4NnzymmQe7k
Slides here:
https://docs.google.com/presentation/d/1JG3GF7sqZ6lnl5kf1MrSW0-_k8FDrzaGk8hzGaq0K8k/edit?usp=sharing

Also, could add a note in the summary that would explain whether the semantics of this option in Flang are identical to Clang?

Sure, I will update this.

jsjodin edited the summary of this revision. (Show Details)Jan 23 2023, 1:55 PM

Thanks for sending this, @jsjodin ! Overall, this makes sense. But I have no experience with offloading and I would appreciate some RFC that would explain what the overall direction. What are the next steps? And what's the overall goal?

Also, could add a note in the summary that would explain whether the semantics of this option in Flang are identical to Clang?

This documentation is a little out-of-date, but is mostly correct, https://clang.llvm.org/docs/OffloadingDesign.html. I understand the desire is to replicate this handling very closely in Clang.

Can we get a test to show that this works when given straight LLVM-IR as well? Those are actually handled differently in Clang so I had to make sure that they both call the embedding function.

Also please add the metadata checks to the test, these are important as we check for the module metadata to identify these when the user specifies -flto. E.g. like this https://github.com/llvm/llvm-project/blob/main/clang/test/Frontend/embed-object.ll.

jsjodin updated this revision to Diff 492194.Jan 25 2023, 11:11 AM

Added test to embed with .bc file as input instead of source file. Fixed comment in negative test.

Added test to embed with .bc file as input instead of source file. Fixed comment in negative test.

Oh yeah, does Flang accept .ll files? I don't know the details of Flang's driver model since it uses MLIR. But if it does we should have a .ll test that shows the metadata attached to the embedded object.

jsjodin updated this revision to Diff 492210.Jan 25 2023, 11:50 AM

Update test to check metadata.

jhuber6 accepted this revision.Jan 25 2023, 11:52 AM

LGTM

flang/lib/Frontend/FrontendActions.cpp
736

Nit, Is this in style w/ Flang? Probably copied from Clang.

This revision is now accepted and ready to land.Jan 25 2023, 11:52 AM
jsjodin marked an inline comment as done.Jan 25 2023, 12:06 PM
jsjodin added inline comments.
flang/lib/Frontend/FrontendActions.cpp
736

It definitely is from Clang, I was considering putting this in the OMPIRBuilder, but given the callback needed for error handling it didn't seem worth it. I wasn't aware that flang has a different style than clang.

jhuber6 added inline comments.Jan 25 2023, 12:08 PM
flang/lib/Frontend/FrontendActions.cpp
736

Just looking around it seems like Flang prefers camelCase while Clang uses PascalCase. But I don't know the specific style rules. Usually clang-tidy will yell at you.

jsjodin added inline comments.Jan 25 2023, 1:46 PM
flang/lib/Frontend/FrontendActions.cpp
736

Yes, I will update the patch to do camel case.

jsjodin updated this revision to Diff 492250.Jan 25 2023, 1:56 PM

Update variable names to camel case.

jhuber6 accepted this revision.Jan 25 2023, 1:59 PM

Do you know what's left on the Flang driver side to enable offloading?

@awarzynski do you have any other concerns about the patch?

Do you know what's left on the Flang driver side to enable offloading?

I don't know exactly what is left. The Flang driver already uses the two stage compilation, so as far as I could tell generating llvm-ir there are no difference in the phases between clang and flang. If doing a full compilation with linking, then there seemed to be some issues with the various commands and there were some differences in the phases as well. My guess is that there is not a huge amount of work.

awarzynski accepted this revision.Jan 27 2023, 9:36 AM

@awarzynski do you have any other concerns about the patch?

Sorry for the delay, I was traveling. This is very exciting, thanks for implementing it!

LGTM

There is a presentation on the entire mechanism how the new offloading driver works here:
https://www.youtube.com/watch?v=4NnzymmQe7k
Slides here:
https://docs.google.com/presentation/d/1JG3GF7sqZ6lnl5kf1MrSW0-_k8FDrzaGk8hzGaq0K8k/edit?usp=sharing

You could add one of these links in the driver docs. Or perhaps a link to https://clang.llvm.org/docs/OffloadingDesign.html?

flang/lib/Frontend/FrontendActions.cpp
736

The coding style in Flang is tricky business ;) The driver strives to follow MLIR: https://github.com/llvm/llvm-project/blob/main/flang/lib/Frontend/FrontendActions.cpp#L9 :) This is just for future reference.

flang/test/Driver/embed-error.f90
3–5

[nit] I used to add these comments, but then people complained and removed some of them. Feel free to keep them or to remove.

jsjodin updated this revision to Diff 493301.Jan 30 2023, 6:34 AM

Merge with trunk so patch applies.

This revision was landed with ongoing or failed builds.Jan 31 2023, 8:03 AM
This revision was automatically updated to reflect the committed changes.
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJan 31 2023, 8:03 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

The new test embed.f90 and embed-error.f90 are failing on my side. Do they assume a specific configuration?

The new test embed.f90 and embed-error.f90 are failing on my side. Do they assume a specific configuration?

No, it should work for any configuration as far as I know. How are you running the test?

No, it should work for any configuration as far as I know. How are you running the test?

Just with check-flang

No, it should work for any configuration as far as I know. How are you running the test?

Just with check-flang

How does it fail? Can you show the output? Also what CPU are you running on?

No, it should work for any configuration as far as I know. How are you running the test?

Just with check-flang

How does it fail? Can you show the output? Also what CPU are you running on?

********************
FAIL: Flang :: Driver/embed.f90 (2 of 1993)
******************** TEST 'Flang :: Driver/embed.f90' FAILED ********************
Script:
--
: 'RUN: at line 6';   llvm-project/build/bin/flang-new -fc1 -emit-llvm -o - -fembed-offload-object=llvm-project/flang/test/Driver/Inputs/hello.f90 llvm-project/flang/test/Driver/embed.f90 2>&1 | llvm-project/build/bin/FileCheck llvm-project/flang/test/Driver/embed.f90
: 'RUN: at line 8';   llvm-project/build/bin/flang-new -fc1 -emit-llvm-bc -o llvm-project/build/tools/flang/test/Driver/Output/embed.f90.tmp.bc llvm-project/flang/test/Driver/embed.f90 2>&1
: 'RUN: at line 9';   llvm-project/build/bin/flang-new -fc1 -emit-llvm -o - -fembed-offload-object=llvm-project/flang/test/Driver/Inputs/hello.f90 llvm-project/build/tools/flang/test/Driver/Output/embed.f90.tmp.bc 2>&1 | llvm-project/build/bin/FileCheck llvm-project/flang/test/Driver/embed.f90
--
Exit Code: 1

Command Output (stderr):
--
llvm-project/flang/test/Driver/embed.f90:11:10: error: CHECK: expected string not found in input
! CHECK: @[[OBJECT_1:.+]] = private constant [61 x i8] c"program hello\0A write(*,*), \22Hello world!\22\0Aend program hello\0A", section ".llvm.offloading", align 8, !exclude !0
         ^
<stdin>:1:1: note: scanning from here
; ModuleID = 'FIRModule'
^

Input file: <stdin>
Check file: llvm-project/flang/test/Driver/embed.f90

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: ; ModuleID = 'FIRModule' 
check:11     X~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
          2: source_filename = "FIRModule" 
check:11     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          3: target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" 
check:11     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          4: target triple = "x86_64-unknown-linux-gnu" 
check:11     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          5:  
check:11     ~
          6: @_QFECi = internal constant i32 1 
check:11     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          .
          .
          .
>>>>>>
`
FAIL: Flang :: Driver/embed-error.f90 (1 of 1993)
******************** TEST 'Flang :: Driver/embed-error.f90' FAILED ********************
Script:
--
: 'RUN: at line 6';   not llvm-project/build/bin/flang-new -fc1 -emit-llvm -o - -fembed-offload-object=llvm-project/flang/test/Driver/Inputs/missing.f90 llvm-project/flang/test/Driver/embed-error.f90 2>&1 | llvm-project/build/bin/FileCheck llvm-project/flang/test/Driver/embed-error.f90 --check-prefix=ERROR
--
Exit Code: 1

Command Output (stderr):
--
llvm-project/flang/test/Driver/embed-error.f90:8:10: error: ERROR: expected string not found in input
! ERROR: error: could not open
         ^
<stdin>:1:1: note: scanning from here
; ModuleID = 'FIRModule'
^
<stdin>:6:14: note: possible intended match here
@_QFECi = internal constant i32 1
             ^

Input file: <stdin>
Check file: llvm-project/flang/test/Driver/embed-error.f90

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: ; ModuleID = 'FIRModule' 
check:8'0     X~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
           2: source_filename = "FIRModule" 
check:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           3: target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" 
check:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           4: target triple = "x86_64-unknown-linux-gnu" 
check:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           5:  
check:8'0     ~
           6: @_QFECi = internal constant i32 1 
check:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:8'1                  ?                     possible intended match
           7: @_QQEnvironmentDefaults = constant ptr null 
check:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           8:  
check:8'0     ~
           9: declare ptr @malloc(i64) 
check:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          10:  
check:8'0     ~
          11: declare void @free(ptr) 
check:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~
           .
           .
           .

I run on x86_64

No, it should work for any configuration as far as I know. How are you running the test?

Just with check-flang

How does it fail? Can you show the output? Also what CPU are you running on?

I run on `x86_64`

How did you configure your build? I'd like to reproduce your build and see if I can make it fail on my machine. Right now the test passes.

No, it should work for any configuration as far as I know. How are you running the test?

Just with check-flang

How does it fail? Can you show the output? Also what CPU are you running on?

I run on `x86_64`

How did you configure your build? I'd like to reproduce your build and see if I can make it fail on my machine. Right now the test passes.

My bad .. my CompilerInvocation.cpp was overwritten and missed part of this patch. It works fine. Sorry to have bothered you with that.

No, it should work for any configuration as far as I know. How are you running the test?

Just with check-flang

How does it fail? Can you show the output? Also what CPU are you running on?

I run on `x86_64`

How did you configure your build? I'd like to reproduce your build and see if I can make it fail on my machine. Right now the test passes.

My bad .. my CompilerInvocation.cpp was overwritten and missed part of this patch. It works fine. Sorry to have bothered you with that.

No problem, glad it works for you. I'm working on another patch that is configuration dependent so I want to make sure there are no errors.