[X86] Add xgetbv xsetbv intrinsics
Details
- Reviewers
AsafBadouh delena thakis craig.topper igorb aaboud m_zuckerman - Commits
- rG197b65f83341: [X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platforms
rC278783: [X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platforms
rL278783: [X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platforms
Diff Detail
- Repository
- rL LLVM
Event Timeline
#include <x86intrin.h>in the test is not clear for me. Does it mean that you broke backward compatibility?
lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
6779 ↗ | (On Diff #62628) | remove {} |
lib/Headers/intrin.h | ||
905 ↗ | (On Diff #62628) | I'm not sure that we can move it from one file to another. And what was wrong with current implementation. |
the <x86intrin.h> include is because i added calls to the intrinsics themselves in the test, no just the builtins.
lib/Headers/intrin.h | ||
---|---|---|
905 ↗ | (On Diff #62628) | it can't be left here since it will conflict with non-windows implementation. my impression was that it is generally better to use "regular" lowering flow, over using inline asm. |
This change seems consistent with similar being done in r250158 when the other xsave intrinsics were added. intrin.h includes x86intrin.h which in turn includes xsaveintrin.h. So this just makes xgetbv/xsetbv available on non-microsoft platforms.
lib/Headers/intrin.h | ||
---|---|---|
289 ↗ | (On Diff #62628) | Should _XCR_XFEATURE_ENABLE_MASK be moved? Also what is that "static inline" bound to? It looks like when _XCR_XFEATURE_ENABLE_MASK was added, the "static inline" was separated from the declaration of _xgetbv. |
Reverted in r278814, it appears to break usage of _xgetbv on Windows:
https://build.chromium.org/p/chromium.fyi/builders/ClangToTWin%28dll%29/builds/5846/steps/compile/logs/stdio
../../base/cpu.cc(194,10): error: use of undeclared identifier '_xgetbv'
(_xgetbv(0) & 6) == 6 /* XSAVE enabled by kernel */;
You removed the test in ms-intrin.cpp that would have prevented this breakage, too.
Sorry about that, forgot that i changed the ms_intrin test.
about the failure, I think that xsaveintrin.h is not being included because it requires the xsave feature - which should be on if the target supports it.
do you know what in which target the failure occurred? also, can you direct me to the source code for the failure?
thanks
The source isn't that interesting, it includes intrin.h and immintrin.h before using _xgetbv.
I think the issue is that Nico added the _MSC_VER check to intrin.h in http://reviews.llvm.org/D20291:
#if !defined(_MSC_VER) || has_feature(modules) || defined(XSAVE__)
#include <xsaveintrin.h>
#endif
Nico is on vacation right now, but we might want to reconsider that. We really only need to block the AVX512 intrinsics which are unreasonably large.
Still, XSAVE should have been defined when compiling for a target that supports the feature.
But anyway, the xsaveintrin.h is quite small so always including it shouldn't be an issue.
Are you ok with me removing the #if just for this header file, or would you like to wait for Nico?
Hm, resending my comments because it doesn't appear to work from email. I swear it used to...
That's not how MSVC intrinsics (or icc intrinsics, right?) are supposed to work, though. In MSVC, all intrinsics are always available, regardless of subtarget options. Unfortunately, Clang is not very compatible in this area, because of the way that we map the vector intrinsics to our generic vector IR and then codegen that. In this case, we don't have any generic instruction to map to, so I think we should try to be compatible here.
But anyway, the xsaveintrin.h is quite small so always including it shouldn't be an issue.
Are you ok with me removing the #if just for this header file, or would you like to wait for Nico?
I think removing the _MSC_VER check is probably OK, but will the new LLVM intrinsic you added generate correct code when the xsave feature is disabled, as I expect it is in Chromium?
removing the MSC_VER check will not be enough, the feature guards from the intrinsic and the builtin need to be removed to make it work. not sure if this is the right way to go, any thoughts on this?
Now might be the time to solve the larger problem of wider intrinsic availability. Like I mentioned, all these intrinsics really ought to be available all the time, regardless of CPU subtarget.