Compiler hits infinite loop in DAGCombine. For force-streaming-compatible-sve mode we have custom lowering for 128-bit vector splats and later in DAGCombiner::SimplifyVCastOp() we scalarized SPLAT because we have custom lowering for SME. Later, we restored SPLAT opertion via performMulCombine().
Details
Diff Detail
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
24422–24424 | The only reason this currently works for NEON is because DAGCombiner::SimplifyVCastOp for no good reason singles out ISD::SPLAT_VECTOR, which is only used for scalable vectors and SVE. I think the better solution would be to pass the whole Node* to preferScalarizeSplat, and then to look at its uses. If the opcode is a zero/sign extend and any of the uses is a mul, then we don't want to do this transformation, because that's when a umull/smull instruction can be used. I looked into whether it was possible to make LoweMUL smarter and have it recognise the dup(ext(x)) pattern so that we don't need to do the combine for dup(ext(x)) -> ext(dup(x) in the first place, but this seems rather non-trivial. | |
llvm/test/CodeGen/AArch64/aarch64-force-streaming-compatible-sve.ll | ||
4 ↗ | (On Diff #503331) | Just a general comment (and not a suggestion for this patch): It would be nice if we could use SVE's [us]mull[bt] instructions for this. |
llvm/test/CodeGen/AArch64/aarch64-force-streaming-compatible-sve.ll | ||
---|---|---|
1 ↗ | (On Diff #503331) | I think it makes sense to move this test into one of the existing files, perhaps something like CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll? |
4 ↗ | (On Diff #503331) | I think it would be good to rename this function to something that is more meaningful, perhaps extend_cmp_of_mul? Also, I think it would be nice to add a comment that describes what we're testing here, i.e. that the DAG combiner does not get stuck in an infinite loop. |
Adding @hassnaa-arm as a reviewer too as she has been doing a lot of work on the streaming compatible code generation.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
24423 | I guess this condition can be removed now? | |
24424 | Should this code also check for ISD::ANY_EXTEND ? | |
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll | ||
894 | Can you also add a test where the result of the extend is not used by a mul? |
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll | ||
---|---|---|
930–932 | This is testing the wrong thing, because it is testing splat(sext(scalar)) rather than sext(splat(scalar)) which is what SimplifyVCastOp tries to change. | |
931 | It's better to use poison here and below. | |
955 | No vectors are used in this loop, so the code you've modified does not apply to this test at all. I'm expecting a similar test to the above, but then using a different operator instead of a mul. |
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll | ||
---|---|---|
894 | ok, I added function with the setcc node instead of mul. |
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll | ||
---|---|---|
919 | This test is different from the one above in ways other than the mul. Can you make the tests otherwise identical? Can you also verify that it still exercises the code you've added to your patch? |
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll | ||
---|---|---|
911 | nit: If you replace the 8 here by a smaller number like 2, then we don't need to observe the effect of legalisation and we'd only get 1 mul instead of 4. |
Moved tests to <2 x > types.
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll | ||
---|---|---|
911 | sorry, I was wrong. I can reproduce with <2 x > types. |
I guess this condition can be removed now?