ADD(UADDV a, UADDV b) --> UADDV(ADD a, b)
This partially solves the bug: https://bugs.llvm.org/show_bug.cgi?id=46888
Meta ticket: https://bugs.llvm.org/show_bug.cgi?id=46929
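The rewrite in the summary relies on wrap-around integer addition being associative and commutative, so summing two horizontal reductions gives the same result as reducing the lane-wise sum. The sketch below models that on four 32-bit lanes with plain loops. It is not the patch's DAG combine code, and every name in it (`V4`, `reduceAdd`, `vectorAdd`, `beforeCombine`, `afterCombine`, `checkCombine`) is hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a 4 x i32 vector register.
using V4 = std::array<std::uint32_t, 4>;

// Models UADDV: add all lanes, wrapping modulo 2^32.
std::uint32_t reduceAdd(const V4 &V) {
  std::uint32_t Sum = 0;
  for (std::uint32_t Lane : V)
    Sum += Lane;
  return Sum;
}

// Models a vector ADD: lane-wise addition, wrapping modulo 2^32.
V4 vectorAdd(const V4 &A, const V4 &B) {
  V4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = A[I] + B[I];
  return R;
}

// Shape before the combine: two reductions, then a scalar add.
std::uint32_t beforeCombine(const V4 &A, const V4 &B) {
  return reduceAdd(A) + reduceAdd(B);
}

// Shape after the combine: one vector add, then one reduction.
std::uint32_t afterCombine(const V4 &A, const V4 &B) {
  return reduceAdd(vectorAdd(A, B));
}

// Asserts that both shapes agree for the given inputs.
void checkCombine(const V4 &A, const V4 &B) {
  assert(beforeCombine(A, B) == afterCombine(A, B));
}
```

Calling `checkCombine` on inputs whose lanes overflow exercises the wrap-around case, which is where the equivalence is least obvious. The model only covers the case where the reduction result type matches the lane type.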
Differential D88731
[AArch64] Combine UADDVs to generate vector add
Authored by mivnay on Oct 2 2020, 5:31 AM.
Event Timeline
Comment Actions
The pre-commit check is not being helpful here... :-/ Seems like a useful optimisation to me though.
Can you add tests for i16 and i8? As far as I understand, they will not fold because we will have legalized types and the return type will not match the vector element type? It's still worth having the tests.
We could think of doing this target independent instead: folding add(vecreduce(x), vecreduce(y)) -> vecreduce(add(x, y)). It sounds generally useful to me. Would that work in your case, or would it specifically need to work on UADDV?
Can you also run the update_llc_test_checks.py script on the file and pre-commit the tests, just showing the changes here?
Comment Actions
Sure. I have created a new patch: https://reviews.llvm.org/D89365
Comment Actions
Yes! I don't need to use UADDV specifically. But I am not sure what impact it has on other targets (different lowerings, patterns, etc.).
Comment Actions
Yeah. I was going to say that this would replace 3 instructions with 2, so would generally be useful. But thinking about it, that wouldn't actually be the case for MVE. The two reductions (VADDVA's) would likely overlap better than back to back loads.
The code here looks good to me. If we find we need to handle more cases in the future we can look at it there, and perhaps put some sort of "undo" in for MVE. But in the meantime this patch LGTM.
clang-tidy: warning: 'auto LHSN1' can be declared as 'auto *LHSN1' [llvm-qualified-auto]
not useful