Page MenuHomePhabricator

[OpenMP] Introduce new flag to change offloading driver pipeline
ClosedPublic

Authored by jhuber6 on Jan 3 2022, 9:40 AM.

Details

Summary

This patch introduces the -fopenmp-new-driver option which instructs
the compiler to use a new driver scheme for producing offloading code.
In this scheme we create a complete offloading object file and then pass
it as input to the host compilation phase. This will allow us to embed
the object code in the backend phase.

This is the start of a series of commits to rework the OpenMP offloading driver
pipeline. The goal of this is to simplify the steps required for creating an
offloading program. This patch changes the driver's configuration to simply pass
the device file back to the host as an input so it can be embedded as an LLVM IR
global during the backend, then simply passes that object file to the linker.

This driver implementation will currently create the following phases,

$ clang input.c -fopenmp -fopenmp-targets=nvptx64 -fopenmp-new-driver -ccc-print-phases
               +- 0: input, "input.c", c, (host-openmp)
            +- 1: preprocessor, {0}, cpp-output, (host-openmp)
         +- 2: compiler, {1}, ir, (host-openmp)
         |        |     +- 3: input, "input.c", c, (device-openmp)
         |        |  +- 4: preprocessor, {3}, cpp-output, (device-openmp)
         |        |- 5: compiler, {4}, ir, (device-openmp)
         |     +- 6: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64)" {5}, ir
         |  +- 7: backend, {6}, assembler, (device-openmp)
         |- 8: assembler, {7}, object, (device-openmp)
      +- 9: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64)" {8}, ir
   +- 10: backend, {9}, assembler, (host-openmp)
+- 11: assembler, {10}, object, (host-openmp)
12: clang-linker-wrapper, {11}, image, (host-openmp)

Which will map to the following bindings

# "x86_64-unknown-linux-gnu" - "clang", inputs: ["input.c"], output: "/tmp/input-bae62e.bc"
# "nvptx64" - "clang", inputs: ["input.c", "/tmp/input-bae62e.bc"], output: "/tmp/input-76784e.s"
# "nvptx64" - "NVPTX::Assembler", inputs: ["/tmp/input-76784e.s"], output: "/tmp/input-8f29db.o"
# "x86_64-unknown-linux-gnu" - "clang", inputs: ["/tmp/input-bae62e.bc", "/tmp/input-8f29db.o"], output: "/tmp/input-545450.o"
# "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["/tmp/input-545450.o"], output: "a.out"

Diff Detail

Event Timeline

jhuber6 created this revision.Jan 3 2022, 9:40 AM
jhuber6 requested review of this revision.Jan 3 2022, 9:40 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJan 3 2022, 9:40 AM
jhuber6 updated this revision to Diff 397089.Jan 3 2022, 9:49 AM

Updating to only contain this commit.

jhuber6 edited the summary of this revision. (Show Details)Jan 3 2022, 9:53 AM
ormris removed a subscriber: ormris.Jan 18 2022, 10:18 AM
JonChesterfield accepted this revision.Jan 26 2022, 9:55 AM

This looks reasonable to me, though I'd prefer we keep the forward declare. The scalar->vector transform is mechanical and the if (Args.hasArg(options::OPT_fopenmp_new_driver)) cruft will disappear when we return to a single driver implementation.

clang/include/clang/Driver/Driver.h
45

This looks like it should be a breaking change - InputInfo is no longer forward declared. Would it be reasonable to keep the forward declaration and put the typedef between class statements and the LTOKind enum?

This revision is now accepted and ready to land.Jan 26 2022, 9:55 AM
jhuber6 added inline comments.Jan 26 2022, 11:16 AM
clang/include/clang/Driver/Driver.h
45

We can't forward declare a struct and use it by-value in a container, I would need to change it to a pointer. It's doable, but I don't think it's ideal. I'm not sure why this would break anything, the forward declarations simply were there to avoid including more files here. I could be wrong however.

This revision was landed with ongoing or failed builds.Jan 31 2022, 12:56 PM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Jan 31 2022, 1:31 PM

Looks like this breaks tests on macOS: http://45.33.8.238/macm1/26856/step_7.txt

Please take a look and revert for now if it takes a while to fix (maybe just needs an explicit triple?)

Looks like this breaks tests on macOS: http://45.33.8.238/macm1/26856/step_7.txt

Please take a look and revert for now if it takes a while to fix (maybe just needs an explicit triple?)

Not sure what I expected when I hard-coded the host-triple in the test. I pushed a change in rGb79e2a1ccd3b, can you check it again?

Weird, can you show me what -fopenmp -fopenmp-targets=nvptx64 -fopenmp-new-driver -ccc-print-bindings looks like there? I'm not sure why but it doesn't seem to be getting one of the expected arguments but I'm not sure how to reproduce it.

It seems what's happening here is that we are building the host.bc twice, this will work fine but isn't ideal. I prevent this manually by checking the cache if one of the jobs was already created, but for some reason that doesn't seem to be happening here. I'll need to figure out how to reproduce this.

thakis added a comment.Feb 1 2022, 6:23 AM

Tests have been failing on Mac for over 20 hours now. Time to revert and fix async?

 % bin/clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fopenmp-new-driver -no-canonical-prefixes -ccc-print-bindings /Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c -o openmp-offload-gpu -fopenmp -fopenmp-targets=nvptx64 -fopenmp-new-driver -ccc-print-bindings
clang version 14.0.0
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: bin
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-729b05.bc"
# "nvptx64-nvidia-cuda" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c", "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-729b05.bc"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a35969.s"
# "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a35969.s"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a3f7f0.o"
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c", "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a3f7f0.o"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-f53552.bc"
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-f53552.bc"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-86f846.o"
# "x86_64-apple-darwin19.6.0" - "Offload::Linker", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-86f846.o"], output: "openmp-offload-gpu"

Tests have been failing on Mac for over 20 hours now. Time to revert and fix async?

 % bin/clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fopenmp-new-driver -no-canonical-prefixes -ccc-print-bindings /Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c -o openmp-offload-gpu -fopenmp -fopenmp-targets=nvptx64 -fopenmp-new-driver -ccc-print-bindings
clang version 14.0.0
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: bin
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-729b05.bc"
# "nvptx64-nvidia-cuda" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c", "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-729b05.bc"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a35969.s"
# "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a35969.s"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a3f7f0.o"
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c", "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a3f7f0.o"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-f53552.bc"
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-f53552.bc"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-86f846.o"
# "x86_64-apple-darwin19.6.0" - "Offload::Linker", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-86f846.o"], output: "openmp-offload-gpu"

I'd rather just disable the test, this patch has like 20 others that depend on it so we'd need to revert all of those. It's definitely not grabbing the bitcode from the cache as expected. I can disable this part of the test for now, but do you know how I could figure out how to reproduce this? I've been tracking some other buildbots and they don't seem to have the same issue so I'm not sure what's special about this one.

thakis added a comment.Feb 1 2022, 6:46 AM

Tests have been failing on Mac for over 20 hours now. Time to revert and fix async?

 % bin/clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fopenmp-new-driver -no-canonical-prefixes -ccc-print-bindings /Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c -o openmp-offload-gpu -fopenmp -fopenmp-targets=nvptx64 -fopenmp-new-driver -ccc-print-bindings
clang version 14.0.0
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: bin
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-729b05.bc"
# "nvptx64-nvidia-cuda" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c", "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-729b05.bc"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a35969.s"
# "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a35969.s"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a3f7f0.o"
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c", "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-a3f7f0.o"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-f53552.bc"
# "x86_64-apple-darwin19.6.0" - "clang", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-f53552.bc"], output: "/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-86f846.o"
# "x86_64-apple-darwin19.6.0" - "Offload::Linker", inputs: ["/var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/openmp-offload-gpu-86f846.o"], output: "openmp-offload-gpu"

I'd rather just disable the test, this patch has like 20 others that depend on it so we'd need to revert all of those. It's definitely not grabbing the bitcode from the cache as expected. I can disable this part of the test for now, but do you know how I could figure out how to reproduce this? I've been tracking some other buildbots and they don't seem to have the same issue so I'm not sure what's special about this one.

Just build and run tests on any mac. This fails on 3 different macs I tried (2x arm, 1x intel), in a bunch of different build configs.

For the particular build I sent the output from, the cmake invocation looked like /Applications/CMake.app/Contents/bin/cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS='compiler-rt;libcxx;clang' -DLLVM_APPEND_VC_REV=NO -DCMAKE_C_COMPILER=$HOME/src/chrome/src/third_party/llvm-build/Release+Asserts/bin/clang -DCMAKE_CXX_COMPILER=$HOME/src/chrome/src/third_party/llvm-build/Release+Asserts/bin/clang++ -DCMAKE_OSX_SYSROOT=/Users/thakis/src/llvm-project/sysroot/MacOSX.sdk -DDARWIN_macosx_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/MacOSX.sdk -DDARWIN_iphoneos_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/iPhoneOS.sdk -DDARWIN_iphonesimulator_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/iPhoneSimulator.sdk ../llvm

thakis added a comment.Feb 1 2022, 6:47 AM

ps "This is inconvenient to revert since 20 patches landed" means you're landing too many patches too quickly. This landed less than 24h ago, so if it's difficult to revert, that's a bit of a problem for the project, right?

Just build and run tests on any mac. This fails on 3 different macs I tried (2x arm, 1x intel), in a bunch of different build configs.

For the particular build I sent the output from, the cmake invocation looked like /Applications/CMake.app/Contents/bin/cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS='compiler-rt;libcxx;clang' -DLLVM_APPEND_VC_REV=NO -DCMAKE_C_COMPILER=$HOME/src/chrome/src/third_party/llvm-build/Release+Asserts/bin/clang -DCMAKE_CXX_COMPILER=$HOME/src/chrome/src/third_party/llvm-build/Release+Asserts/bin/clang++ -DCMAKE_OSX_SYSROOT=/Users/thakis/src/llvm-project/sysroot/MacOSX.sdk -DDARWIN_macosx_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/MacOSX.sdk -DDARWIN_iphoneos_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/iPhoneOS.sdk -DDARWIN_iphonesimulator_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/iPhoneSimulator.sdk ../llvm

I don't have access to a mac computer right now. I'm just going to remove the problematic check line so this passes and add it in later once I figure out what's going on here. The output you're getting should still work, it's just not ideal because we're regenerating the bitcode file so this isn't a breaking issue.

Removed the two lines in rG28c15341368b, let me know if this lets the tests pass. I'll look into getting an access somehow so I can reproduce this and figure it out.

thakis added a comment.Feb 1 2022, 7:45 AM

Tests are passing again.