"ww" and "ws" are both constraint codes for VSX vector registers that
holds scalar double data. "ww" is preferred for float while "ws" is
preferred for double.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 34245 Build 34244: arc lint + arc unit
Event Timeline
It is great to add ww for compatibility.
However if we are going to add ww, looks like we should update ws as well?
clang/lib/Basic/Targets/PPC.h | ||
---|---|---|
211 | Add some more comments for w to distinguish it from s? Do we want to keep compatibility with GCC? So ww can use FP while ws can NOT ? | |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
14080 | Should we exclude FP for ws and return VFRCRegClass instead of VSFRCRegClass ? | |
llvm/test/CodeGen/PowerPC/inlineasm-vsx-reg.ll | ||
42 | Maybe we should add another test for ws as well? The above test is actually for 'x' modifier? |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
14080 | Can you elaborate on what I should do? I'm not familiar with the register info stuff... |
llvm/test/CodeGen/PowerPC/inlineasm-vsx-reg.ll | ||
---|---|---|
42 | I think it is incorrect if the 'x' modifier is not used, so we probably don't have to check the no-modifier case. |
clang/lib/Basic/Targets/PPC.h | ||
---|---|---|
211 | I played with "ws" and "ww" but can't find any behavior difference from assembly produced by powerpc64le-linux-gnu-gcc. I'll keep the current form (which is known to make musl fmax/fmaxf build) unless the gcc semantics are clearer. |
LGTM. Thanks for investigating GCC behavior.
clang/lib/Basic/Targets/PPC.h | ||
---|---|---|
211 | OK. Thanks. So maybe it is just the misleading doc problem of GCC. | |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
14080 | VSFRC contains both F8RC and VFRC. F8RC is FP. However, if ws can use FP as well as you found in later GCC experiments, then we don't need to do this. |
float ws_float(float x, float y) { __asm__ ("xsadddp %x0, %x1, %x2" : "=ws"(x) : "ws"(x), "ws"(y)); return x; } float ww_float(float x, float y) { __asm__ ("xsadddp %x0, %x1, %x2" : "=ww"(x) : "ww"(x), "ww"(y)); return x; } double ws_double(double x, double y) { __asm__ ("xsadddp %x0, %x1, %x2" : "=ws"(x) : "ws"(x), "ws"(y)); return x; } double ww_double(double x, double y) { __asm__ ("xsadddp %x0, %x1, %x2" : "=ww"(x) : "ww"(x), "ww"(y)); return x; }
% powerpc64le-linux-gnu-gcc -O2 a.c -S -o - | grep xsadd xsadddp 1, 1, 2 xsadddp 1, 1, 2 xsadddp 1, 1, 2 xsadddp 1, 1, 2 % clang -target ppc64le -O2 a.c -S -o - | grep xsadd # same output
float scalar_ww_float(float x, float y) { __asm__ ("fadds %0, %1, %2" : "=ww"(x) : "ww"(x), "ww"(y)); return x; } float scalar_ws_float(float x, float y) { __asm__ ("fadds %0, %1, %2" : "=ws"(x) : "ws"(x), "ws"(y)); return x; } double scalar_ww_double(double x, double y) { __asm__ ("fadds %0, %1, %2" : "=ww"(x) : "ww"(x), "ww"(y)); return x; } double scalar_ws_double(double x, double y) { __asm__ ("fadds %0, %1, %2" : "=ws"(x) : "ws"(x), "ws"(y)); return x; }
% powerpc64le-linux-gnu-gcc -O2 a.c -S -o - | grep fadds fadds 1, 1, 2 fadds 1, 1, 2 fadds 1, 1, 2 fadds 1, 1, 2 % clang -target ppc64le -O2 a.c -S -o - | grep fadds # same output
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
14080 | float ws_float(float x, float y) { __asm__ ("xsadddp %0, %1, %2" : "=ws"(x) : "ws"(x), "ws"(y)); return x; } float ww_float(float x, float y) { __asm__ ("xsadddp %0, %1, %2" : "=ww"(x) : "ww"(x), "ww"(y)); return x; } double ws_double(double x, double y) { __asm__ ("xsadddp %0, %1, %2" : "=ws"(x) : "ws"(x), "ws"(y)); return x; } double ww_double(double x, double y) { __asm__ ("xsadddp %0, %1, %2" : "=ww"(x) : "ww"(x), "ww"(y)); return x; } % powerpc64le-linux-gnu-gcc -O2 a.c -S -o - | grep xsadd xsadddp 1, 1, 2 xsadddp 1, 1, 2 xsadddp 1, 1, 2 xsadddp 1, 1, 2 |
Add some more comments for w to distinguish it from s?
Do we want to keep compatibility with GCC?
According to https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gcc/Machine-Constraints.html#Machine-Constraints,
ww is FP or VSX register to perform float operations under -mvsx or NO_REGS.,
while ws is VSX vector register to hold scalar double values .
So ww can use FP while ws can NOT ?