Let's undo earlier optimizations that turn < 0 comparisons into > -1. This enables us to use vpmovq2m instead of vpternlogd + vpcmpgtq.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I'm surprised that this didn't break existing tests. Where is the right place to update the test suite?
Actually, wait. Something weird is going on at the mid-level. These two functions should generate the same optimized IR, right?
typedef int V __attribute__((vector_size(64))); V lt_zero_x_y(V mask, V x, V y) { return mask < 0 ? x : y; } V ge_zero_y_x(V mask, V x, V y) { return mask >= 0 ? y : x; }
Should but currently we have a different IR. No canonicalization? Wow.
Plus, just cursious why with
typedef int V __attribute__((vector_size(4)));
we produce
define dso_local i32 @_Z11lt_zero_x_yDv1_iS_S_(i32 %0, i32 %1, i32 %2) local_unnamed_addr #0 { %4 = insertelement <1 x i32> poison, i32 %0, i32 0 %5 = insertelement <1 x i32> poison, i32 %1, i32 0 %6 = insertelement <1 x i32> poison, i32 %2, i32 0 %7 = icmp sgt <1 x i32> %4, <i32 -1> %8 = select <1 x i1> %7, <1 x i32> %6, <1 x i32> %5 %9 = extractelement <1 x i32> %8, i32 0 ret i32 %9 }
Why not scalarize it on IR level?
I'm not sure we canonicalize select operand order.
vpmovq2m only exists with avx512dq. Is this transform still useful with just avx512f?
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
22802 | I'm a little confused. the vpmovq2m patter is (setgt imm_allzeros, X). Does this SETGE get canonicalized again? |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
22802 | Oh, sorry about that. I originally made a logic error and converted > -1 to < 0 instead of >= 0; but at that time, I did verify that VPMOVQ2M was generated. You are right that the desired instruction is not generated with SETGE now that the logic error is fixed. In any case, I'd like to update the tests ahead of any code change. Is there a preferred file to put the current suboptimal code gen? |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
22802 | avx512-vec-cmp.ll seems like a decent file to put it in. |
I filed a bug about the IR not being canonicalized. Let's wait for that before proceeding on generating VPMOVQ2M:
@davezarzycki What's going on the way phab is reporting the path in Revision Contents pane? llvm/lib/Target/X86 -> i/llvm/lib/Target/X86 ?!?
That's just git-diff. I thought Phab handled that but I guess not. From the documetation:
diff.mnemonicPrefix
If set, 'git diff' uses a prefix pair that is different from the standard "a/" and "b/" depending on what is being compared. When this configuration is in effect, reverse diff output also swaps the order of the prefixes:
git diff
compares the (i)ndex and the (w)ork tree;
git diff HEAD
compares a (c)ommit and the (w)ork tree;
git diff --cached
compares a (c)ommit and the (i)ndex;
git diff HEAD:file1 file2
compares an (o)bject and a (w)ork tree entity;
git diff --no-index a b
compares two non-git things (1) and (2).
I'm a little confused. the vpmovq2m patter is (setgt imm_allzeros, X). Does this SETGE get canonicalized again?