- Lower AVGFloor(A, B) to:
  SRL(A, 1) + SRL(B, 1) + ((A & B) & 1).
- Lower AVGCeil(A, B) to:
  SRL(A, 1) + SRL(B, 1) + ((A | B) & 1).
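The two expansions can be sanity-checked exhaustively at a small bit width. The following is a minimal Python sketch, not code from the patch: it assumes SRL means a logical shift right by one bit, models 8-bit unsigned lanes, and uses hypothetical helper names (`avg_floor_u`, `avg_ceil_u`, `check_unsigned`) chosen only for illustration.

```python
BITS = 8                      # assumed lane width, for illustration only
MASK = (1 << BITS) - 1


def avg_floor_u(a, b):
    """Expansion for the floor average: SRL(A,1) + SRL(B,1) + ((A & B) & 1)."""
    return ((a >> 1) + (b >> 1) + (a & b & 1)) & MASK


def avg_ceil_u(a, b):
    """Expansion for the ceiling average: SRL(A,1) + SRL(B,1) + ((A | B) & 1)."""
    return ((a >> 1) + (b >> 1) + ((a | b) & 1)) & MASK


def check_unsigned():
    """Compare both expansions against the averages computed without overflow."""
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            if avg_floor_u(a, b) != (a + b) >> 1:
                return False
            if avg_ceil_u(a, b) != (a + b + 1) >> 1:
                return False
    return True
```

The expansions never form the full sum A + B, which is why they stay within the lane width even when both operands are at their maximum.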
Differential D143283
[AArch64][SVE]: custom lower AVGFloor/AVGCeil. Authored by hassnaa-arm on Feb 3 2023, 9:57 AM.
Hello. Sorry for the delay in looking at this, but I wasn't sure exactly what you were trying to do, and I've never been a huge fan of DAG combines that create the wrong node just to expand it later. It looks like for legal types this can lead to a nice decrease in instruction count, though. For smaller types, I'm not sure that checking for individual extension opcodes will work well; the operands could already be extending loads, for example. I've not thought about it too much yet, but as far as I understand from the original hadd (/avg) work, it would probably be better to check that the top bits are known 0 for the unsigned cases, and that there are >1 sign bits for the signed cases. In any case, it would be good to see alive proofs for the transforms you are making.
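One possible reading of the known-bits suggestion above: if the top bit of each unsigned operand is zero, or each signed operand has more than one sign bit, the add cannot overflow, so a plain shift of the sum already equals the floor average. The Python sketch below illustrates that reading at an assumed 8-bit width; it is not the check used in the patch, and all helper names are hypothetical.

```python
BITS = 8                      # assumed lane width, for illustration only
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(x):
    """Reinterpret the low BITS bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << BITS) if x & SIGN else x


def shift_avg_u(a, b):
    """lshr(add(a, b), 1), with the add wrapping at BITS bits."""
    return ((a + b) & MASK) >> 1


def shift_avg_s(a, b):
    """ashr(add(a, b), 1), with the add wrapping at BITS bits."""
    return to_signed(a + b) >> 1


def unsigned_ok():
    # Operands with the top bit known zero: 0 .. 2^(BITS-1) - 1.
    return all(shift_avg_u(a, b) == (a + b) // 2
               for a in range(SIGN) for b in range(SIGN))


def signed_ok():
    # Operands with more than one sign bit: -2^(BITS-2) .. 2^(BITS-2) - 1.
    lo, hi = -(SIGN >> 1), SIGN >> 1
    return all(shift_avg_s(a, b) == (a + b) // 2
               for a in range(lo, hi) for b in range(lo, hi))
```

Without the precondition the shift-of-sum form breaks: `shift_avg_u(255, 255)` wraps and no longer matches the true average of 255, and `shift_avg_s(127, 127)` comes out negative.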
Remove sve-avgfloor testing file.
Hi - I was just looking at the patch whilst you updated it! Please ignore any comments that don't apply any more.
Thanks. Here are some alive proofs for the transform: https://alive2.llvm.org/ce/z/N6hwQY and https://alive2.llvm.org/ce/z/u_GjYJ. Can you extend the testing to include both ashr and lshr versions? They should both be useful if we are custom legalizing the nodes. Otherwise I think this looks good.
I think it's worth adding tests for both the ashr and lshr versions, but otherwise I think this LGTM. Thanks
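To illustrate why the ashr and lshr forms are worth testing separately, the sketch below computes the floor average of the same 8-bit patterns under both interpretations. It is a hedged Python model, not the test file from the patch: the signed variant assumes the arithmetic-shift counterpart of the expansion, and the function names are made up for this example.

```python
BITS = 8                      # assumed lane width, for illustration only
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(x):
    """Reinterpret the low BITS bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << BITS) if x & SIGN else x


def avg_floor_lshr(a, b):
    """Unsigned floor average of two bit patterns (logical shifts)."""
    a, b = a & MASK, b & MASK
    return ((a >> 1) + (b >> 1) + (a & b & 1)) & MASK


def avg_floor_ashr(a, b):
    """Signed floor average of two bit patterns (arithmetic shifts),
    returned as a bit pattern. Python's >> on negative ints is arithmetic."""
    sa, sb = to_signed(a), to_signed(b)
    return ((sa >> 1) + (sb >> 1) + (sa & sb & 1)) & MASK


def signed_matches_reference():
    """Check the ashr form against floor((a + b) / 2) on signed values."""
    return all(avg_floor_ashr(a, b) == ((to_signed(a) + to_signed(b)) >> 1) & MASK
               for a in range(1 << BITS) for b in range(1 << BITS))
```

For the patterns `0xFF` and `0x01`, the unsigned form gives `0x80` (the average of 255 and 1), while the signed form gives `0x00` (the average of -1 and 1), so one variant's tests do not cover the other.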
Thanks for all the changes @hassnaa-arm, I've just left some final minor comments.
Thanks for the changes @hassnaa-arm. I'm satisfied with the patch now, so I'm removing my 'requesting changes'.