This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Select lower fsub,fabs pattern to fabd on AArch64
ClosedPublic

Authored by karthikthecool on Dec 28 2014, 11:48 PM.

Details

Summary

Hi,
Similar to http://reviews.llvm.org/D6781 we can select lower fsub, fabs pattern to fabd on AArch64.
Add pattern matching in .td file to handle the same.
For example for the below code -

float a[4],b[4],c[4];
void fabd_test() {
  a[0] = fabs(b[0]-c[0]);
  a[1] = fabs(b[1]-c[1]);
  a[2] = fabs(b[2]-c[2]);
  a[3] = fabs(b[3]-c[3]);
}

gcc produces a single

fabd	v0.4s, v1.4s, v0.4s

instead of

fsub	v0.4s, v0.4s, v1.4s
fabs	v0.4s, v0.4s

which was previously produced by clang. After this patch we are able to lower fsub fabs to fabd.
This is also valid for scalar operands in case of fabd.

Please let me know if this is good to commit.

Thanks and Regards
Karthik Bhat

Diff Detail

Repository
rL LLVM

Event Timeline

karthikthecool retitled this revision from to [AArch64] Select lower fsub,fabs pattern to fabd on AArch64.
karthikthecool updated this object.
karthikthecool edited the test plan for this revision. (Show Details)
karthikthecool set the repository for this revision to rL LLVM.
karthikthecool added a subscriber: Unknown Object (MLST).

Again, same comment as for other patch.

FABD Vd.<T>, Vn.<T>, Vm.<T>
Subtracts the elements of Vm from the corresponding
elements of Vn, and places the absolute values of the results in the elements of Vd

Rm, Rn should swap thier places i think.
I was also thinking if we should have a more generic name for the testcase to cover all vector arithmetic instructions involving abs.
Would like to know what James and Tim's thoughts are on this.
LGTM, otherwise.

Hi Jyoti,
Thanks for the input. Yes you are correct the registers in matched pattern for FABD needs to be swapped. I got confused by the manual wording. Updated the patch.
Please let me know if you have any other comments.
Thanks and Regards
Karthik Bhat

I would use a more generic name for the testcase as mentioned before. You could probably club the testcases of sadb, fabd into a single generic file something like arm64-neon-simd-vabs.ll. For the scalar versions you should find an appropriate place though.

We should probably check for complete machine instructions in the test output rather than just pneumonic, since we have added different patterns to handle various types of data.

Could you please modify to test these?

[PS] I would still get a go from either of reviewers before committing.

Hi Karthik,
To further clarify, reason for clubbing the testcases of sadb, fabd into a single generic file is so that we can accomodate more patterns revolving around
abs arithmetic, another reason being, adding more test files means running llc multiple times on each file, which in turn increases test time.
Hope this sounds reasonable.

Hi Jyoti,
Updated the test cases as per reveiw comments to check the exact instruction being generated.
Merged test case with D6781.
Please let me know if you have any other comments or if it is good to commit.
Thanks and Regards
Karthik Bhat

jmolloy accepted this revision.Jan 5 2015, 2:50 AM
jmolloy edited edge metadata.

Hi Karthik,

This looks fine to me. I'd have written the test slightly differently, taking function parameters rather than globals as it makes the resulting code smaller and easier to read, but it's fine as-is.

Cheers,

James

This revision is now accepted and ready to land.Jan 5 2015, 2:50 AM
karthikthecool closed this revision.Jan 5 2015, 6:00 AM

Thanks James. Comitted as r225169 after modifying test case as per comment.
Thanks!