This patch handles back-end folding of the generic patterns created by D48067 in cases where the instruction isn't a straightforward packed-value rounding operation, but a masked or scalar one.
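For reference, a minimal sketch of the scalar form in question: after D48067, a scalar rounding intrinsic is expressed as plain IR of roughly this shape, which this patch folds back into a single scalar round instruction. The function name and the choice of llvm.floor.f32 are illustrative, not taken from the patch.

```llvm
declare float @llvm.floor.f32(float)

; Scalar case: floor the low element of %x and insert the result into %y.
define <4 x float> @floor_ss(<4 x float> %x, <4 x float> %y) nounwind {
  %s    = extractelement <4 x float> %x, i64 0
  %call = tail call float @llvm.floor.f32(float %s)
  %res  = insertelement <4 x float> %y, float %call, i64 0
  ret <4 x float> %res
}
```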
Event Timeline
llvm/lib/Target/X86/X86ISelLowering.cpp:30964
Can we just do this with isel patterns like we do for ADDSS?
llvm/test/CodeGen/X86/vec_floor.ll:874
Can you generate %k from a compare instruction rather than passing in an <X x i1> type? It will make the code a little cleaner, since we won't have to extend and split the mask in such crazy ways.
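A minimal sketch of what this asks for, with illustrative names rather than the actual test contents: the low mask bit comes from an fcmp on the vector operands instead of being passed in as an <4 x i1> argument.

```llvm
declare float @llvm.floor.f32(float)

; Masked scalar floor: the mask is produced by a compare in IR.
define <4 x float> @floor_mask_ss(<4 x float> %x, <4 x float> %y, <4 x float> %w) nounwind {
  %k1  = fcmp oeq <4 x float> %x, %y        ; mask generated by a compare, not passed in
  %k   = extractelement <4 x i1> %k1, i64 0
  %s   = extractelement <4 x float> %x, i64 0
  %f   = tail call float @llvm.floor.f32(float %s)
  %dst = extractelement <4 x float> %w, i64 0
  %low = select i1 %k, float %f, float %dst
  %res = insertelement <4 x float> %y, float %low, i64 0
  ret <4 x float> %res
}
```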
llvm/lib/Target/X86/X86ISelLowering.cpp:30964
I've considered that, but decided to fold it here. Doing it in .td patterns would require four new patterns across two separate files: 32-bit and 64-bit patterns for VROUNDS* on AVX and for ROUNDS* on SSE4.1. Writing the fold here both makes it easier to track and produces less check complexity.
llvm/lib/Target/X86/X86ISelLowering.cpp:30958
There's a signed vs. unsigned comparison warning on this line.
llvm/lib/Target/X86/X86InstrAVX512.td:8736
Do we have test cases covering this pattern? I can't find any zero-extend instructions.
Added zero extension of the mask to i32 in the masked scalar tests and added more ways to represent the mask, covering the 8-bit mask pattern among others. The 16-bit mask patterns were removed due to scalar_to_vector errors.
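A sketch of one of the added mask representations, under the assumption that the mask arrives as an i8 and is zero-extended to i32 before the bit test; the function name and exact shape are illustrative, not copied from the updated test.

```llvm
declare float @llvm.floor.f32(float)

; 8-bit mask variant: the mask is zero-extended to i32 before testing bit 0.
define <4 x float> @floor_mask_ss_mask8(<4 x float> %x, <4 x float> %y, <4 x float> %w, i8 %k) nounwind {
  %m32 = zext i8 %k to i32
  %bit = and i32 %m32, 1
  %cc  = icmp ne i32 %bit, 0
  %s   = extractelement <4 x float> %x, i64 0
  %f   = tail call float @llvm.floor.f32(float %s)
  %dst = extractelement <4 x float> %w, i64 0
  %low = select i1 %cc, float %f, float %dst
  %res = insertelement <4 x float> %y, float %low, i64 0
  ret <4 x float> %res
}
```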
Corrected the scalar pattern predicates and added packed zero-masked instruction patterns, plus tests to cover zero-masking. Changed the RUN lines of vec_floor.ll to check different output for AVX512F and AVX512VL where they diverge (e.g. in 128- and 256-bit masked operations).
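Roughly, the split RUN lines would look like the sketch below; the exact check prefixes are an assumption, not copied from the patch.

```llvm
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512f  | FileCheck %s --check-prefixes=AVX512,AVX512F
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512vl | FileCheck %s --check-prefixes=AVX512,AVX512VL
```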
Added tests for floor intrinsics and masked scalar double patterns to cover all introduced isel patterns.