[SDAG] Add a fallback multiplication expansion
AuditedrL270720

Description

[SDAG] Add a fallback multiplication expansion

LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.

Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.

This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).

Fixes PR19797.

Details

Auditors
chfast
Committed
hfinkelMay 25 2016, 9:50 AM
Parents
rL270719: [ThinLTO] Fix test check prefix so that intended prefix tested
Branches
Unknown
Tags
Unknown
chfast added a subscriber: chfast.May 25 2016, 12:03 PM

Thanks for adding that. Does it handle only 256-bit integers?

Thanks for adding that. Does it handle only 256-bit integers?

No, it should handle arbitrary sizes because of how it slots into the legalization process.

I tried something similar in https://reviews.llvm.org/D8770

The problem is it grows very fast with the type size. For i256 it uses 11 muls, for i512 43 muls. And it crashes for i1024 with

lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1085: llvm::SDValue llvm::SelectionDAG::getConstant(uint64_t, const llvm::SDLoc&, llvm::EVT, bool, bool): Assertion `(EltVT.getSizeInBits() >= 64 || (uint64_t)((int64_t)Val >> EltVT.getSizeInBits()) + 1 < 2) && "getConstant with a uint64_t value that doesn't fit in the type!"' failed.

I tried my version for i4096 back then and it generates megabytes of code... I think assert is better in the extreme case.

The problem is it grows very fast with the type size. For i256 it uses 11 muls, for i512 43 muls. And it crashes for i1024 with

Can you please file a bug report? While I agree that this won't scale, asymptotically, i1024 shouldn't be that limit.

This produces incorrect results for my set of tests of mul i256. Can you explain how you tested it? I guess there is no chance for adding runtime tests for that?

I used the attached program.

to test the code (there is a driver C code with a reference impl for i256, and then a .ll file which has the IR to make a function with LLVM's expanded i256 multiplication).

chfast raised a concern with this commit.Nov 15 2016, 5:12 AM

The algorithm is not correct. I fixed it in https://reviews.llvm.org/D26628.