This is an archive of the discontinued LLVM Phabricator instance.

[libc] Prevent overflow from intermediate results when adding UInt<N> values.
ClosedPublic

Authored by lntue on Aug 3 2022, 12:45 PM.

Details

Summary

Prevent overflow from intermediate results when adding UInt<N> values.

Diff Detail

Event Timeline

lntue created this revision.Aug 3 2022, 12:45 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 3 2022, 12:45 PM
lntue requested review of this revision.Aug 3 2022, 12:45 PM
orex accepted this revision.Aug 3 2022, 2:51 PM
This revision is now accepted and ready to land.Aug 3 2022, 2:51 PM
orex added inline comments.Aug 4 2022, 1:14 AM
libc/src/__support/CPP/UInt.h
74

Hi, Tue!

I was thinking about the implementation. I little bit worries about the performance. To many if here. I would like to propose you another solution. Of course it is up to you to accept it or not. The strongest point of the solution is slightly reduced number of arithmetic operations and only one "main" if. The second one triggered very rare.

  uint64_t carry = 0;
  for (size_t i = 0; i < WordCount; ++i) {
    // Will be wrapped if sum more than 2^(sizeof(x)) - 1
    val[i] += x.val[i];
    // If an overflow appears, the result is less than both of the initial
    // variables
    if (val[i] < x.val[i]) {
      // Add previous carry. Overflow is not possible.
      val[i] += carry;
      // Put 1 to the next digits.
      carry = 1;
    } else {
      val[i] += carry;
      // Likely no overflow.
      if (likely(val[i]) != 0)
        carry = 0;
      // else carry keeps value in case of carry = 0 it is simply 0 with
      // no overflow in case of 1 this made overflow and propagates next.
    }
  }
  return carry;
}
lntue added inline comments.Aug 4 2022, 6:27 AM
libc/src/__support/CPP/UInt.h
74

Hi Kirill, thanks for thinking about improving the performance! I should have added some more background to this patch.

The main reason I had to make it a bit complicated is that when using UInt<128> to replace __uint128_t (which we will need to do for targets without __uint128_t builtin supports), it failed some tests that check overflow flags, which are set by the intermediate computation such as val[i] += x.val[i] in your improvement or the previous implementation, while the overall __uint128_t addition is not overflowed.

Of course this change will make another issue pop up, that is now we don't set overflow flag/trap when the real sum in __uint128_t is overflowed. That's why I left some todo for later patches.

orex added inline comments.Aug 4 2022, 7:37 AM
libc/src/__support/CPP/UInt.h
74

Thank you for the reply. it was mistake, I should read the title more carefully. Your solution looks good and beautiful to solve the problem.
I don't know the full context, but for x86 I can offer an alternative, just clear CF the flag with test command (if you are not interesting in other flags):
https://en.wikipedia.org/wiki/TEST_(x86_instruction)
Another proposed solution can be, something like this:

  1. calculate all sums, but last with the loop I proposed or with your previous solution, which is also very elegant.
  2. clear the flag for some architectures. As I know, for example, RISC V do not have flags, so do not needed.
  3. Simply add vals in the end.

So you can get "true overflow" flag in the end or you can clear the flag in the end, so the behavior will be the same.
But anyhow your solution is very elegant.
P.S. To be honest I don't see why you previous solution overflows. Sorry for bothering you, but It is very interesting for me?

Did you consider using builtins with overflow checking:
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
It is platform independent and leaves some optimisations opportunities to the compiler.

orex added a comment.EditedAug 4 2022, 7:52 AM

Did you consider using builtins with overflow checking:
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
It is platform independent and leaves some optimisations opportunities to the compiler.

In general this is a good idea, but it can be two problems:

  1. It still can setup flags, which should be avoided.
  2. I'm worried is it already exists on this stage. I had a problem, that some builtins were not exists. See https://github.com/llvm/llvm-project/commit/27aca975b6b6e9d5c7516c091f954884b28650ae for example.
lntue added a comment.Aug 4 2022, 11:45 AM

Did you consider using builtins with overflow checking:
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
It is platform independent and leaves some optimisations opportunities to the compiler.

Thanks for your suggestion! We definitely want to use compiler builtins to improve the performance whenever they are available, and fall back to generic implementation otherwise.

As of now, this code is only used in a very limited way for arm32 platforms that do not have __uint128_t builtins, and we haven't finished setting up other requirements such as flags, environments or performance testings yet.

Once those are set, we will followup with performance enhancement using builtins, together with more comprehensive testings, including exception flags and performance, and making sure that the fallback works properly just as @orex mentioned.

lntue added inline comments.Aug 4 2022, 12:00 PM
libc/src/__support/CPP/UInt.h
74

Thanks! We can definitely have followups with performance enhancement per architecture targets once this class is used in more places, together with more comprehensive testings.

Re P.S.: Sorry I didn't dig in too much to see why it overflowed. It's possible that some compiler optimizations were too aggressive recognizing that the operations could be done in 32-bits and used such; or they recognized the pattern and reduced it to uint64_t adds with carry bit checks? Anyhow, I plan to use this safe implementation as our baseline and extend it to subtraction next. We definitely should have specialized versions with builtins / clear flags for x86_64 and aarch64 to improve the performance, as we are going to use more than 128-bit integers very soon.