Page MenuHomePhabricator

[PPC] support for arithmetic builtins in the FE
ClosedPublic

Authored by amehsan on Nov 11 2016, 5:48 AM.

Details

Summary

This includes various overloads of the following builtins:

  • vec_neg
  • vec_nabs
  • vec_adde
  • vec_addec
  • vec_sube
  • vec_subec
  • vec_subc

Note that for vec_sub builtins on 32 bit integers, the semantics is similar to what ISA describes for instructions working on quadwords: the first operand is added to the one's complement of the second operand. (As opposed to two's complement which I expected).

There is a small backend patch as well that I will post separately, but this is independent.

Diff Detail

Event Timeline

amehsan updated this revision to Diff 77608.Nov 11 2016, 5:48 AM
amehsan retitled this revision from to [PPC] support for arithmetic builtins in the FE.
amehsan updated this object.
amehsan added subscribers: cfe-commits, echristo.
nemanjai added inline comments.Nov 13 2016, 6:49 PM
lib/Headers/altivec.h
313

I don't understand why we're adding __mask to the sum of __a and __b. Shouldn't that be __carry?

321

Same comment as above.

348

Is it not a little cleaner and more readable to just mask out the __c parameter before the loop (similarly to the masking you've done with __mask above)?

349

I think it's a little more clear and obvious what is happening if you actually have just a single cast and mask - i.e.

unsigned long long __longa = ((unsigned long long) __a[i]) & 0xFFFFFFFF;
kbarton edited edge metadata.Nov 15 2016, 6:03 AM

There are several inline comments that need to be addressed.
I also think it's worthwhile putting a comment at the top of the review indicating the (assumed) semantics for the vec_sube and vec_subec instructions that are being implemented (i.e., the behaviour mimics the vec_subeuqm instructions and thus uses one's compliment plus the carry)

lib/Headers/altivec.h
306

please remove blank line

349

Is a mask actually needed? This seems to be what is done in the vec_addec function below, without the cast. I agree that is cleaner.

The other minor nit is to pick a single value for the mask (1, 0x01, 0x00000001) and use it consistently.

10515

Why do we mask the carries for sign/unsigned ints, but not __128 ints?

10523

Please reorder these to put the __128 below signed and unsigned.

10542

Is it possible to use vec_adde(a, ~b, __carry)?

kbarton requested changes to this revision.Nov 15 2016, 6:03 AM
kbarton edited edge metadata.
This revision now requires changes to proceed.Nov 15 2016, 6:03 AM
amehsan added inline comments.Nov 16 2016, 6:59 AM
lib/Headers/altivec.h
349

To come up with this code pattern I looked at the following pieces of codes:

unsigned long long f (int t) {
  return (unsigned long long)  t;
}

When compiled with optimization produces

define i64 @f(i32 signext %t) local_unnamed_addr #0 {
entry:
  %conv = sext i32 %t to i64
  ret i64 %conv
}

Which is incorrect. Also

unsigned long long f (int t) {
  return (unsigned long long)(unsigned)  t;
}
~

results in

define i64 @f(i32 signext %t) local_unnamed_addr #0 {
entry:
  %conv = zext i32 %t to i64
  ret i64 %conv
}

and

.Lfunc_begin0:
# BB#0:                                 # %entry
        clrldi   3, 3, 32
        blr

So I think the code here is optimal and correct and there is no need to change it.

10515

for quadword, hardware does the masking (implicitly, by only looking at the rightmost bit)

amehsan updated this revision to Diff 78376.Nov 17 2016, 9:53 AM
amehsan updated this object.
amehsan edited edge metadata.
kbarton accepted this revision.Nov 18 2016, 10:25 AM
kbarton edited edge metadata.

LGTM

This revision is now accepted and ready to land.Nov 18 2016, 10:25 AM