This is a follow up D147507 which removed the prior transformation to
shufps which was incorrect as the mask was for 64-bit double
elements, not 32-bit float elements. Using shufpd for the
replacement, however, preserves the mask semantics and has the same
benefits as shufps.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
LGTM.
llvm/lib/Target/X86/X86FixupInstTuning.cpp | ||
---|---|---|
250–255 | No sure if we should use VSHUFPD for them too. |
llvm/lib/Target/X86/X86FixupInstTuning.cpp | ||
---|---|---|
157 | Hello Noah, we've got downstream mis-compile on fuzzer testing on this patch. While we do not have IR reproducer for this problem at the moment, may be you would be able to detect the problem from source code? |
llvm/lib/Target/X86/X86FixupInstTuning.cpp | ||
---|---|---|
240–247 | Guess problem may come from here. |
If it helps. the bad and good compilations differ only in one line:
vshufps $0, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,0],xmm0[0,0]
vs
vshufps $68, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,1],xmm0[0,1]
So probably the problem comes from case which have not been updated but uses a lambda which is updated?
Hello Noah, we've got downstream mis-compile on fuzzer testing on this patch.
Specifically reverting constants 0x00 -> 0x44 and 0xFF -> 0xee fixes the issue.
While we do not have IR reproducer for this problem at the moment, may be you would be able to detect the problem from source code?
At least I see that this comment states that for r,r case we should use old constants and for r,r,k new ones.
However it looks like the patch updates more cases. Is it correct?
It just is wild guess as I'm not familiar with this code.