I have added a few things to the OpenMP FAQ which I think were missing. Feel free to suggest some changes. Are there missing options in the offloading command line reference? And what do you think about the section "Q: Why is my
build taking a long time"?
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thanks for putting this together!
I had some minor comments.
openmp/docs/SupportAndFAQ.rst | ||
---|---|---|
84 | I think this should be capability | |
477–478 | Flip the flags or the sentences, so that the flags and their meaning are in the same order. | |
482 | have in this and the next paragraph also "Pass an argument `<arg>` ... " as it is written further down. | |
506 | I assume LTO is not actually running on the device, so maybe rephrase to ... optimization for device code. | |
527 | Should be indicated which one is the default, i.e., old or new? |
Thanks for the nice suggestions! I will fix it right away.
openmp/docs/SupportAndFAQ.rst | ||
---|---|---|
477–478 | Thank you for pointing this out! I thought the two flags had the same meaning as they had the same description in the Clang command line reference. clang --help | grep offload-host-device --offload-host-device Only compile for the offloading host. clang --help | grep offload-host-only --offload-host-only Only compile for the offloading host. What exactly is the difference between the two flags? |
openmp/docs/SupportAndFAQ.rst | ||
---|---|---|
477–478 | To be honest, I only guessed from their name, hopefully someone else knows. My interpretation: Let's try summoning @jdoerfert or @jhuber6 |
openmp/docs/SupportAndFAQ.rst | ||
---|---|---|
527 | In clang/lib/Driver/ToolChains/Clang.cpp the following means that it is turned off by default, right? 6343- if (Args.hasFlag(options::OPT_offload_new_driver, 6344- options::OPT_no_offload_new_driver, false)) 6345: CmdArgs.push_back("--offload-new-driver"); |
openmp/docs/SupportAndFAQ.rst | ||
---|---|---|
462 | Add sth like: Note that this option is often not needed anymore if `--offload-arch is provided. | |
468 | Mention auto detection (for the machine used for compiling) and mention {amdgpu,nvptx}-arch executables together with the "key" that corresponds to the sub architecture. | |
473 | This is not what this does. It will compile only the code that goes on the device, not the code for the host. Mention this is for debug purposes mostly, or if device only runtimes are created. | |
504–507 | Just make arg above optional and say what the default is [=<arg>] ... | |
513 | I would not propose this to verify anything, honestly. Actually, mention that this is not to verify and mention how you should verify (with the env var set to mandatory). Instead, this is to avoid the host fallback which can help if the target contains code that cannot be compiled for the host (like unguarded device intrinsics), or if you want to save compile time. | |
533–535 | @jhuber6, am I correct in assuming there is no "old" driver for OpenMP, and we should tell people this is only for HIP/CUDA? (even there it seems silly, this is probably not of use right now) |
Thinking we should put these in a separate file and have the FAQ link to it.
openmp/docs/SupportAndFAQ.rst | ||
---|---|---|
89 | Unrelated, but this section definitely needs to be updated either by me or @JonChesterfield. | |
468 | Add a line showing that --offload-arch=sm_80,gfx90a does both. | |
472 | Maybe worth mentioning that this is primarily used for checking the IR output when compiling for the device. | |
477–478 | --offload-host-only skips the device portion, has the effect of not having the .llvm.offloading section where the embedded device code would otherwise live, check the IR and you'll see the lack of a giant binary string. | |
482 | You can also use -Xarch_device and -Xarch_host which are worth mentioning. | |
527 | Upstream, OpenMP only uses the new driver. This flag is for enabling it for CUDA / HIP so that it can be interoperable w/ OpenMP. |
openmp/docs/SupportAndFAQ.rst | ||
---|---|---|
518–521 | Remove the negative option, describe the bits (or link to the docs) that you can set and that you need the env var too. | |
525 | Not can be, *is*. So: Embed LLVM-IR for the device code in the object files rather than binary code for the respective target. At runtime the LLVM-IR is optimized again and compiled for the target device. Link to the JIT options, and mention this is good for debugging. | |
533–535 | So, let's just talk about the positive option right now. | |
541 | This is default for OpenMP, right (@jhuber6)? |
This update should hopefully resolve most of your comments. Let me know if the changes do not meet your expectations.
I have tried my best to incorporate your feedback. If you want to see the HTML, it looks like this
I want to move the offloading command line reference to another file, but I think that would delete all of your comments.
openmp/docs/SupportAndFAQ.rst | ||
---|---|---|
561 | You are right, I just misunderstood the documentation. I will fix it right away. |
openmp/docs/CommandLineArgumentReference.rst | ||
---|---|---|
14 | I think these are supposed to use ---- for a subsection. |
Changed subtitles to subsections and noted that --offload-arch is automatically detected if not specified.
openmp/docs/CommandLineArgumentReference.rst | ||
---|---|---|
14 | Thanks, you are right as always! |
openmp/docs/CommandLineArgumentReference.rst | ||
---|---|---|
54 | Maybe mention CPU offloading here, like with -fopenmp-targets=x86_64-pc-linux-gnu. | |
65–66 | Drop the bin. Also we infer with --offload-arch=native as well. | |
184 | Should be fixed | |
openmp/docs/SupportAndFAQ.rst | ||
84 | Wait I think this is totally wrong, I removed that in 17. It's LIBOMPTARGET_DEVICE_ARCHITECTURES=sm_70,gfx1030 now for example. |
Added new lines to the end of the files and included `LIBOMPTARGET_DEVICE_ARCHITECTURES`.
I think these are supposed to use ---- for a subsection.