Adds patterns to catch masks preceding a long multiply
and generate a single umull/smull instruction instead.
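As a rough illustration of the source-level shape this targets (a sketch, not taken from the patch or its tests; the function names are hypothetical), the following C functions mask or sign-extend both 64-bit operands to 32 bits before a 64-bit multiply. Each result equals a 32x32->64 widening multiply, which is what a single umull/smull computes. Whether a particular compiler version actually emits one instruction for them is not verified here.

```c
#include <stdint.h>

/* Unsigned case: both 64-bit operands are masked to their low 32 bits
 * before a 64-bit multiply. */
uint64_t mul_masked_u(uint64_t a, uint64_t b) {
    return (a & 0xffffffffu) * (b & 0xffffffffu);
}

/* The same value written as an explicit 32x32->64 unsigned widening
 * multiply, i.e. the operation a umull performs. */
uint64_t mul_widening_u(uint64_t a, uint64_t b) {
    return (uint64_t)(uint32_t)a * (uint64_t)(uint32_t)b;
}

/* Signed case: the low 32 bits of each operand are sign-extended back to
 * 64 bits before the multiply. The narrowing cast to int32_t is
 * implementation-defined for out-of-range values; common compilers
 * truncate to the low 32 bits. */
int64_t mul_sext_s(int64_t a, int64_t b) {
    return (int64_t)(int32_t)a * (int64_t)(int32_t)b;
}
```

The product of two 32-bit values always fits in 64 bits, so none of these multiplies can overflow.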
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looks nice to me. Is it worth adding mul (sext_inreg, sext i32) patterns too in case one operand is sext and the other is being masked? mul is commutative so I think it would only be two extra patterns, one for sext and one for zext.
llvm/lib/Target/AArch64/AArch64InstrInfo.td | ||
---|---|---|
1477 | Nit: Can you line up the ('s for the input and the output |
> Looks nice to me. Is it worth adding mul (sext_inreg, sext i32) patterns too in case one operand is sext and the other is being masked? mul is commutative so I think it would only be two extra patterns, one for sext and one for zext.
Added, though I did have some difficulties hand-crafting something that would use these patterns (hence no specific tests)
llvm/lib/Target/AArch64/AArch64InstrInfo.td | ||
---|---|---|
1477 | Parentheses aligned and output types added, thanks :) |
Something like https://godbolt.org/z/q376ob maybe?
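The linked snippet is not reproduced here. As a separate, hypothetical illustration of the mixed case discussed above, the C function below multiplies one operand that is sign-extended in place from its low 32 bits (roughly what sext_inreg describes) by another that is a plain sign-extended 32-bit value. The function name is an assumption, and it is not confirmed that this exact source exercises the new patterns.

```c
#include <stdint.h>

/* Mixed signed case: 'a' is a 64-bit value whose low 32 bits are
 * sign-extended in place, while 'b' is a genuine 32-bit value that is
 * sign-extended to 64 bits. The narrowing cast is implementation-defined
 * for out-of-range values; common compilers truncate. */
int64_t smull_mixed(int64_t a, int32_t b) {
    int64_t a_lo = (int64_t)(int32_t)a; /* sign-extend the low 32 bits */
    return a_lo * (int64_t)b;           /* 32x32->64 signed multiply */
}
```

The upper 32 bits of `a` never influence the result, which is why a single 32x32->64 signed multiply is sufficient.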
llvm/lib/Target/AArch64/AArch64InstrInfo.td | ||
---|---|---|
1478 | GPR32 I think, for the sext input. And then the output can use the GPR32 directly. Like the SMADDLrrr pattern below. |
llvm/lib/Target/AArch64/AArch64InstrInfo.td | ||
---|---|---|
1481 | This pattern isn't needed I don't think. The other four you have here look good to me, but I would remove the whitespace and maybe reorder them. |
llvm/test/CodeGen/AArch64/aarch64-mull-masks.ll | ||
---|---|---|
25 | Can you add a commuted test too, where the and and the conv are used the other way around. Same for smull. They should match automatically I think, with the patterns that are already here. |
> This pattern isn't needed I don't think

Removed

> but I would remove the whitespace and maybe reorder them.

Done

> Can you add a commuted test too

Added, the patterns do indeed match
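To make the commuted case concrete, here is a minimal C sketch (hypothetical function names, not the contents of aarch64-mull-masks.ll) in which the masked operand and the converted operand swap sides of the multiply. Multiplication is commutative, so both forms compute the same 32x32->64 product.

```c
#include <stdint.h>

/* Masked operand on the left, zero-extended 32-bit operand on the right. */
uint64_t umull_and_lhs(uint64_t a, uint32_t b) {
    return (a & 0xffffffffu) * (uint64_t)b;
}

/* Commuted form: the same two operands with their positions swapped. */
uint64_t umull_and_rhs(uint64_t a, uint32_t b) {
    return (uint64_t)b * (a & 0xffffffffu);
}
```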
Nit: Can you line up the ('s for the input and the output
The other patterns below have output types too. It's probably worth adding those just for consistency.