When a literal that cannot fit in the immediate form of the fmov instruction
is used to initialise an SVE vector, an extra unnecessary fmov is currently
generated. This patch adds an extra codegen pattern preventing the extra
instruction from being generated.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
556 | do we also need patterns for f16/f64? |
Patch looks good to me. I'll leave it up to you whether you want to extend the patch to cover f16/f64 cases or defer until needed.
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
556 | I would say want rather than need given this is more optimisation than function. | |
llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll | ||
133–139 | Keep it if you want but this test "vscale x 2 x float" is kind of unnecessary because the intrinsics only expect to operate on/with fully packed vectors. The fact some work with unpacked types is really a quirk due to some reusing the ISD nodes used for stock LLVM IR. |
Added equivalent f64 patterns
The f64 case is similar enough that adding those patterns here is fine; f16 seems to take
a different codepath and generate very different code, so I'd rather change that separately
when it comes up (if it proves necessary).
llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll | ||
---|---|---|
133–139 | If it's not likely to not be permitted in future I'd probably rather leave it in on the basis that more tests is usually just better in my opinion. I can remove if it's something that shouldn't be allowed/might not be allowed in future though |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
557 | It looks like this probably needs a |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
563–564 | nxv4f64 is not a legal type and thus this pattern will have no affect. | |
llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll | ||
157–164 | Whilst I can begrudgingly accept the vscale x 2 x float test, this vscale x 4 x double test should definite be removed as it is not linked to any of the new patterns. | |
llvm/test/CodeGen/AArch64/sve-vector-splat.ll | ||
404–405 | I'd remove this test also, but if you keep it then the CHECK lines need updating as <vscale x 4 x double> sits across two registers. Personally I just remove it as it's not testing anything that is not already tested by splat_nxv2f64_fmov_fold. |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
557 | @david-arm : What have you spotted to say this? The tests suggest AddedComplexity is not required? |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
557 | Well, I guess I haven't seen specific tests that require this. I just thought this was consistent with the patterns below that's all, but maybe I've just misunderstood something! |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
559 | Rouge intent. |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
559 | fixed in committed version |
do we also need patterns for f16/f64?