As per title. This is the very straightforward 2 complement subtraction transaform.
Diff Detail
- Build Status
Buildable 6636 Build 6636: arc lint + arc unit
Event Timeline
This kicks in for fold-pcmpeqd-2.ll . Looking at the assembly, things looks good, but I'm not really sure what this test is testing for, so if someone familiar could advice on what to do, that'd be great. @chandlerc , @dblaikie you worked on that, can you advice ?
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
2095 | Are you sure this sets the overflow bit correctly? |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
2095 | You made me doubt, so I put some test together, and it does indeed overflow correctly. |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
2095 | Are you really, 100% sure it sets the overflow bit correctly? "uaddo (a ^ -1), 1" sets the overflow bit if "a ^ 1 == -1", or "a == 0". "usubo 0, a" sets the overflow bit if "a !=0". Simple test program to show this: #include <stdio.h> __attribute((noinline)) int f(unsigned a, unsigned *ovf) { return __builtin_sub_overflow(0, a, ovf); } __attribute((noinline)) int f2(unsigned a, unsigned *ovf) { return __builtin_add_overflow(a ^ -1, 1, ovf); } int main() { unsigned sum, ovf; ovf = f(0, &sum); printf("%u %u\n", sum, ovf); ovf = f2(0, &sum); printf("%u %u\n", sum, ovf); ovf = f(1, &sum); printf("%u %u\n", sum, ovf); ovf = f2(1, &sum); printf("%u %u\n", sum, ovf); } |
I have no idea what clang is doing there. It seems like the intrinsic do not map directly to the uaddo/usubo. See for yourself the generated IR (in clang 3.8 that's what I have available ATM):
; Function Attrs: noinline norecurse nounwind uwtable define i32 @f(i32 %a, i32* nocapture %ovf) #0 { %1 = zext i32 %a to i33 %2 = sub nsw i33 0, %1 %3 = trunc i33 %2 to i32 %4 = and i33 %2, 4294967295 %5 = icmp ne i33 %4, %2 store i32 %3, i32* %ovf, align 4 %6 = zext i1 %5 to i32 ret i32 %6 }
I'm not working from C to begin with, so I'm not super familiar with these intrinsic and what they are supposed to do. i definitively want to get to the bottom of this and make sure this is correct, but we can't conclude from the result of the intrinsic that it is the case. Sadly, I'm not on my work computer right now, so I can't check out what DAG is generated and how it is combined ATM in this specific case.
Oh, oops, posted the wrong version. Corrected:
#include <stdio.h> __attribute((noinline)) int f(unsigned a, unsigned *ovf) { return __builtin_usub_overflow(0, a, ovf); } __attribute((noinline)) int f2(unsigned a, unsigned *ovf) { return __builtin_uadd_overflow(a ^ -1, 1, ovf); } int main() { unsigned sum, ovf; ovf = f(0, &sum); printf("%u %u\n", sum, ovf); ovf = f2(0, &sum); printf("%u %u\n", sum, ovf); ovf = f(1, &sum); printf("%u %u\n", sum, ovf); ovf = f2(1, &sum); printf("%u %u\n", sum, ovf); }
(I have no idea why clang generates such weird code for overloaded versions of the intrinsics.)
OK I was able to dig more. Something is screwed up with my test case. This is indeed not doing the right thing with the carry.
Are you sure this sets the overflow bit correctly?