We were already doing this for integer constants. This patch implements
the same thing for floating point constants.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp | ||
---|---|---|
2263–2264 | Would fixing this FIXME do the same thing? |
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp | ||
---|---|---|
2263–2264 | Aha! I thought this should be done centrally somewhere. I'll give this a try, thanks. |
This looks like a clear win to me on all the affected test cases except for one slight regression noted inline.
llvm/test/CodeGen/ARM/fcopysign.ll | ||
---|---|---|
98–101 | I'm not an ARM expert but this slight regression looks like it's just bad luck in the register allocator. If the vmov went into d1 then we could use vbit instead of vorr+vbsl. |
llvm/test/CodeGen/X86/uint_to_fp-2.ll | ||
---|---|---|
11 | @craig.topper @spatel This looks like we're loading the same constant as (double x) and <double x, double ???> - do you know if we already have examples of this or is this a new type of regression? |
llvm/test/CodeGen/ARM/fcopysign.ll | ||
---|---|---|
98–101 | Yeah this can happen... It's a bit of a shame but doesn't look like the fault of this patch, exactly. BSL/BIT/BIF are selected best-effort using a pseudo. |
llvm/test/CodeGen/X86/uint_to_fp-2.ll | ||
---|---|---|
11 | We might avoid that by checking number of uses of the constant? Looks like a general shortcoming of combining/hoisting constants: $ cat vec_const.ll define <4 x i32> @f(<4 x i32> %x) { %a1 = add <4 x i32> %x, <i32 1, i32 2, i32 3, i32 4> %a2 = xor <4 x i32> %a1, <i32 1, i32 2, i32 3, i32 undef> ret <4 x i32> %a2 } $ llc -o - vec_const.ll .section __TEXT,__text,regular,pure_instructions .build_version macos, 10, 15 .section __TEXT,__literal16,16byte_literals .p2align 4 ## -- Begin function f LCPI0_0: .long 1 ## 0x1 .long 2 ## 0x2 .long 3 ## 0x3 .long 4 ## 0x4 LCPI0_1: .long 1 ## 0x1 .long 2 ## 0x2 .long 3 ## 0x3 .space 4 .section __TEXT,__text,regular,pure_instructions .globl _f .p2align 4, 0x90 _f: ## @f .cfi_startproc ## %bb.0: paddd LCPI0_0(%rip), %xmm0 pxor LCPI0_1(%rip), %xmm0 retq |
llvm/test/CodeGen/X86/uint_to_fp-2.ll | ||
---|---|---|
11 | Thanks, I've raised a bug at https://bugs.llvm.org/show_bug.cgi?id=47744 |
LGTM, cheers - looks like the minor arm/x86 issues are already common and nothing really attributable to this patch.
Would fixing this FIXME do the same thing?