Carefully match the pattern matched by ISel so that this produces shld / shrd (unless Subtarget->isSHLDSlow() is true).
Thanks to Craig Topper for providing the LLVM IR pattern that gets successfully matched.
Fixes PR37755.
Differential D49606
[ms] Add __shiftleft128 / __shiftright128 intrinsics thakis on Jul 20 2018, 9:32 AM. Authored by
Details
Carefully match the pattern matched by ISel so that this produces shld / shrd (unless Subtarget->isSHLDSlow() is true). Thanks to Craig Topper for providing the LLVM IR pattern that gets successfully matched. Fixes PR37755.
Diff Detail Event TimelineComment Actions Here are the IR patterns for this that work. Not sure if we can do this directly in C, we need a 128 bit type, but maybe we can emit it from CGBuiltin.cpp? define i64 @__shiftleft128(i64 %x, i64 %y, i8 %amt) { %a = zext i64 %x to i128 %b = zext i64 %y to i128 %c = shl i128 %b, 64 %d = or i128 %a, %c %amtmask = and i8 %amt, 63 %e = zext i8 %amtmask to i128 %f = shl i128 %d, %e %g = lshr i128 %f, 64 %h = trunc i128 %g to i64 ret i64 %h } define i64 @__shiftright128(i64 %x, i64 %y, i8 %amt) { %a = zext i64 %x to i128 %b = zext i64 %y to i128 %c = shl i128 %b, 64 %d = or i128 %a, %c %amtmask = and i8 %amt, 63 %e = zext i8 %amtmask to i128 %f = lshr i128 %d, %e %g = trunc i128 %f to i64 ret i64 %g } Comment Actions We have __int128. If you think hitting the pattern is preferable to inline asm, I can try to give that a try, either via C or via CGBuiltin.cpp. Comment Actions I'd prefer the pattern over inline assembly. It'll give us more flexibility in the backend if we should be using some other instruction on different targets. Comment Actions I’m not at my dev machine, but this is exactly the definition of funnel shift, no? Unless that got reverted, adding/modifying clang builtins was the next step in the plan for those intrinsics. We probably need some backend work to match the variable shift version, but shift-by constant should already work. Comment Actions @spatel, yes its exactly funnel shift. I wasn't sure if we were ready for clang to create it yet or not. Can we let this go as is and change it to funnel shift once we have the variable case fixed in the backend? Comment Actions Isn't implementing this in plain old C the nicest approach anyhow, even once funnel shift exists? Comment Actions The only weird thing that I can really think of with the C version is that the 'and' on the shift amount might get hoisted out of a loop and not get dropped during isel. Comment Actions LGTM. I'm inclined to let this go in now since we have a requested use for it. We can change it to funnel shift once we're confident in the backend. Comment Actions No, the primary reason for creating the intrinsic is that we can’t guarantee that we’ll recognize the pattern as ‘shld’ or ‘rotate’ in this C/IR form. That’s because the pattern can get split across basic blocks, so the backend can’t easily recognize it. Also, IIUC this patch only deals with the specific case of 64-bit values, so we still don’t have the functionality for other widths? I don’t object if this helps that one case, but we have made the investment in the IR intrinsics, so using them should be the general direction/goal. Comment Actions r337619, thanks! The hoisting point is a good one; will rewrite using funnelshift once that's possible :-) |