- User Since
- Aug 11 2015, 1:08 PM (354 w, 2 d)
Revision just uploaded annotates all kernels with an integer. That doesn't need to land with the intrinsic but it makes it much easier to drive runtime tests through this code so it's there for now.
- Rebase on main, append metadata to kernels in lower module lds pass
Mon, May 23
Fri, May 20
If it was adding a calling convention, sure - caution warranted. There's no llvm change here though, an existing CC is exposed to C++. No change to the type system either.
A clangd buildbot (https://lab.llvm.org/buildbot/#/builders/131/builds/27770) failed on this with
Thu, May 19
Unintentionally created this patch against an older version of main and it interacted badly with D124998 on the rebase. Rerunning tests now, and will leave this open for further comments for a little while. Thanks all
- Fix git merge misfires
- Fix git merge misfires
- Rebase on main
Thanks for accepting! I'm interested to learn more about how the calling conv works, e.g. if parts of it are implemented in clang and parts of it patched on the fly by opt, but that's downstream of easy access to writing C tests that use it.
Added a codegen test for arg passing. It establishes that most arguments are left alone, but structs passed by value are handled as an addrspace(4) byref. Letting opt -O2 run annotated some argument pointers as being in addrspace(1) which I think is wrong.
- Add O0 arg passing codegen test
OK, so that's a different thing. CUDA/HIP has a bunch of rules about implicitly tagging things with addrspace(1) at the call boundary. I don't think any of that magic should exist for C or C++, the developer gets to spell out the address space stuff they want explicitly, and if they want to link it against CUDA/HIP/OpenMP code they get to look up the rules. In particular, persuading clang to do the extra argument mangling stuff will get in the way of using this to create test cases.
Wed, May 18
Looks about as expected, modulo Johannes' comment above. Could we do the getenv call once, instead of per-launch? Somewhere in the massive class constructor perhaps.
Thu, May 5
Wed, May 4
Drop the caching of the value and pass in function instead of module. No functional change to previous diff, simpler code.
- Module->function, skip caching the attribute
Sun, May 1
Didn't notice the hasFnAttribute request comment, change made. Anything else?
- rebase, and getFnAttr -> hasFnAttr
Apr 27 2022
Apr 19 2022
Nice! I like the strategy.
Apr 18 2022
LGTM, thanks! If it fails CI we might want to submit the tests and the code change separately, but I'm fine with trying it as one block in the first instance.
Apr 14 2022
Thanks! Good fix
Apr 13 2022
Thanks for the extra test lines!
Editing the tests in place to check for new driver properties would mean we lose coverage for the old driver and get a lot of churn if we need to back out the change.
Couple of nits above but basically I'm convinced. The gnarly part of binary formats is string tables and I'm delighted that part of MC was readily reusable. Wrapping the string table in different bytes to align with the elf format may still be a good idea but it's not an obvious correctness hazard.
Apr 6 2022
Extra dependency edge is great, thanks! Is LIBOMPTARGET_TESTED_PLUGINS new? It looks unused in this patch
Apr 5 2022
I think we need tests for coff and elf for the MC part, should be existing ones for the other flags. Ideal fit for the openmp sectiy, good find. Inclined to ignore the formatting warnings on old code.
Apr 4 2022
Ron thought @bcahoon might be interested, adding as reviewer
Mar 31 2022
@arsenm there's a bunch of stuff to do that builds on this. What further changes would you like?
Mar 29 2022
Added some reviewers. I'd much prefer this used an existing binary format, DIY is prone to errors and maintenance hassles down the road. Don't care as much about which format as about it being one with an existing, tested implementation and ideally existing inspection tools.
Mar 28 2022
Mar 25 2022
Nice, thanks. Wonder if we want protected visibility as well.
Mar 24 2022
Mar 23 2022
Sounds good to me too, thanks!
- Drop string bool parameter and unreliable test
- invert sense of attribute
Mar 22 2022
@estewart08 Fedora are rejecting some uses of rpath (and probably runpath). I can't find a corresponding page for redhat. This may become a problem for our aomp/rocm builds.
It lets applications run with the libomp and libomptarget built with the toolchain. For users, they don't have to be root. For compiler devs, we test our work instead of whatever the distro had installed.
Mar 21 2022
The current behaviour is to allocate an instance of llvm.amdgcn.module.lds.t at address zero in every kernel in the IR module if the global llvm.amdgcn.module.lds is present. That instance only needs to be allocated in kernels which subsequently call a function which uses it.
The other passes do know how much is allocated. I don't want to go digging around the IR looking for the llvm.donothing call, which iirc is removed before instruction selection
Mar 20 2022
Mar 19 2022
@ronlieb this is the patch I'd like to run through the internal CI
Mar 17 2022
Mar 16 2022
Mar 11 2022
- apply refactor from D115980, removes typo-prone part
This can be picked up and simplified using D121499
Given the header proposed in D121499, OpenMPMath.cpp can be replaced with:
Sounds good to me.
@ronlieb this probably means our CI needs to do different things
We've got quite a lot of debt in this area and seem at risk of taking on more. Ideally the cuda and hip and openmp headers would be closer to a single header containing:
OK, I think I follow.
Mar 7 2022
I'm ooo for a couple of days, @tianshilei1992 could you take a look?
Found one error in the existing code but can believe there are more. We could split this patch into two, one replacing the DIY LockGuard with std::lock_guard and clang-format the thing, then a second replacing manual locking with uses of lock guard. Would be the same behaviour as this patch but make it easier to find the change in behaviour by inspection.
Mar 5 2022
Only comments on the helper type for now. I like the idea of a generic wrap-this-thing-in-mutex helper as a path to stepping on the remaining race conditions.
Mar 4 2022
Mar 2 2022
Thanks! I get the opposite, building locally with clang sometimes breaks gcc buildbots.
Broke a couple of buildbots, e.g. https://lab.llvm.org/buildbot/#/builders/193/builds/7799
Feb 23 2022
Thanks! Seems a good thing to add to the offloading test runner, preferably in a separate change to avoid reverting this in case of unforeseen problems
Feb 21 2022
LG, thanks for including the test
Feb 16 2022
This is asserting somewhere in clang-linker-wrapper, e.g. https://lab.llvm.org/buildbot/#/builders/193/builds/6939
Feb 15 2022
I'm not confident it's only attributes that we rely on mlink-builtin-bitcode patching up, that could be poor phrasing on my part.
Feb 14 2022
Nvm, saw the commit message
Feb 12 2022
Feb 11 2022
Cross compilers are a hazard here. I'd expect there to be a fairly long list of magic flags you need to pass to clang to get it to find the right libraries. Can you add fno-openmp-implicit-rpath to that list instead?
Feb 10 2022
That's somewhat interesting, @arsenm the 4000 series APUs used to identify as gfx902 iirc - does the new name for existing hardware make sense to you?
Feb 8 2022
Feb 5 2022
Looks reasonable to me. This is looking at 80 bit floating point which x86 can do (in some modes at least, this passes mlong-double-80 ). Long double elsewhere tends to be 64 or 128 bit.
Feb 4 2022
Can we drop an xfail in an existing test as part of this patch?
Feb 3 2022
I've read this closely and can't think of anywhere else that needs to be patched. Thanks!
Feb 1 2022
@ronlieb @jdoerfert This was hoped to resolve the breakage from https://lab.llvm.org/buildbot/#/builders/193/builds/5772 which unfortunately has five changes in it,