This is an archive of the discontinued LLVM Phabricator instance.

[llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64
ClosedPublic

Authored by mgehre-amd on Jul 19 2022, 4:34 AM.

Details

Summary

This adds the ExpandLargeDivRem [0] to the default pass pipeline.
The limit at which it expands div/rem instructions is configured
via a new TargetTransformInfo hook (default: no expansion)
X86, Arm and AArch64 backends implement this hook to expand div/rem
instructions with more than 128 bits.

[0] https://reviews.llvm.org/D126644

Diff Detail

Event Timeline

mgehre-amd created this revision.Jul 19 2022, 4:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2022, 4:34 AM
mgehre-amd requested review of this revision.Jul 19 2022, 4:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2022, 4:34 AM
Herald added a subscriber: wdng. · View Herald Transcript

Set limit to 64 bit for ARM and x86_32

Thank you for working on this @mgehre-amd!

Pinging for review; I'm hoping we can improve our _BitInt support with these changes (that I'm unqualified to review myself).

Adding some more reviewers for x86 who were helping with prior reviews in this area.

Thanks @aaron.ballman for helping out with the reviewers!

Is this to address @arsenm's comments at https://reviews.llvm.org/D126644#3581469?

Yes, this PR introduces the hooks into TargetTransformInfo so targets can decide when/if the lowering needs to occur.

The tests are too prolix to check the correctness and it's fragile to use so many registers. I suggest we remove those tests and just do run-time time in llvm-test-suite.

FreddyYe added a comment.EditedAug 29 2022, 6:49 AM

I created one single source demo test: https://reviews.llvm.org/D132850, which is BitInt(256) divide test. Unit tests should be able to cover the pass and easy to review, @mgehre-amd will it help, WDYT? I can help extend the tests if it's the right direction.

The tests are too prolix to check the correctness and it's fragile to use so many registers. I suggest we remove those tests and just do run-time time in llvm-test-suite.

Yes, you are right. I will restrict the IR tests to maybe just check for the absence of a call instructions? And at least to check that llc doesn't assert like it used to do.

I created one single source demo test: https://reviews.llvm.org/D132850, which is BitInt(256) divide test. Unit tests should be able to cover the pass and easy to review, @mgehre-amd will it help, WDYT? I can help extend the tests if it's the right direction.

Thank you, this looks like the right approach!

The tests are too prolix to check the correctness and it's fragile to use so many registers. I suggest we remove those tests and just do run-time time in llvm-test-suite.

Yes, you are right. I will restrict the IR tests to maybe just check for the absence of a call instructions? And at least to check that llc doesn't assert like it used to do.

Yes, checking no call instructions sounds good.

  • Simplified tests to only check for NOT: call
  • Rebased on main
  • Disable expand-large-div-rem when the second operand of div/rem is a power-of-two constant. For those, the backend has peephole optimizations, see llvm/test/CodeGen/X86/i128-sdiv.ll
  • Updated new test since rebasing:
    • llvm/test/CodeGen/X86/div-rem-pair-recomposition-signed/unsigned.ll, llvm/test/CodeGen/X86/i128-udiv/sdiv.ll: Those tests were checking that call __divti3 was created on X86 for 128 bit divisions, which makes no sense because __divti3 is neither implemented by libgcc nor by compiler-rt on X86 in 32-bit mode.
    • Removed test llvm/test/CodeGen/X86/libcall-sret.ll: It was checking that the backend doesn't crash when emitting a 128 bit libcall on 32-bit X86. Now that we don't emit calls to __divti3 on X86 anymore, there is no 128 bit libcall left to reproduce the issue.
    • llvm/test/CodeGen/X86/pr38539.ll: Removed X86 section from f(); the comment says that f() is targeted at 64-bit mode- In 32 bit mode, we would now get a urem expansion.

I just updated the runtime test(https://reviews.llvm.org/D132850) for some extension, it works correct in my local environment. But the pre-merge build seems to not work correct, and I'm afraid if there are some Endian issues in my code. Anyway, it can be refined later.

FreddyYe accepted this revision.Sep 6 2022, 1:35 AM

LGTM. Thanks @mgehre-amd for your great work to implementing this feature!

This revision is now accepted and ready to land.Sep 6 2022, 1:35 AM
This revision was landed with ongoing or failed builds.Sep 6 2022, 7:32 AM
This revision was automatically updated to reflect the committed changes.
melver added a subscriber: melver.Sep 6 2022, 7:49 AM
melver added inline comments.
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
294

unsigned?

https://lab.llvm.org/buildbot/#/builders/13/builds/25460

C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\cl.exe  /nologo /TP -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\buildbot\mlir-x64-windows-ninja\build\lib\CodeGen -IC:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\lib\CodeGen -IC:\buildbot\mlir-x64-windows-ninja\build\include -IC:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\include /DWIN32 /D_WINDOWS   /WX /Zc:inline /Zc:__cplusplus /Oi /bigobj /permissive- /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2  /EHs-c- /GR- -UNDEBUG -std:c++17 /showIncludes /Folib\CodeGen\CMakeFiles\LLVMCodeGen.dir\LLVMTargetMachine.cpp.obj /Fdlib\CodeGen\CMakeFiles\LLVMCodeGen.dir\LLVMCodeGen.pdb /FS -c C:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\lib\CodeGen\LLVMTargetMachine.cpp
C:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\include\llvm/Analysis/TargetTransformInfoImpl.h(295): error C2220: the following warning is treated as an error
C:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\include\llvm/Analysis/TargetTransformInfoImpl.h(295): warning C4305: 'return': truncation from 'llvm::IntegerType::<unnamed-enum-MIN_INT_BITS>' to 'bool'
mgehre-amd added inline comments.Sep 6 2022, 8:08 AM
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
294

Yes, sorry, I'm about to push a fix

Breaks https://lab.llvm.org/buildbot/#/builders/85/builds/10520

******************** TEST 'LLVM :: Transforms/ExpandLargeDivRem/udiv129.ll' FAILED ********************
Script:
--
: 'RUN: at line 2';   /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/opt -S -expand-large-div-rem < /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll | /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll
--
Exit Code: 1

Command Output (stderr):
--
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll:6:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: _udiv-special-cases:
              ^
<stdin>:5:19: note: scanning from here
define void @test(i129* %ptr, i129* %out) #0 {
                  ^
<stdin>:7:9: note: possible intended match here
 %res = udiv i129 %a, 3
        ^

Input file: <stdin>
Check file: /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: ; ModuleID = '<stdin>' 
          2: source_filename = "<stdin>" 
          3:  
          4: ; Function Attrs: nounwind 
          5: define void @test(i129* %ptr, i129* %out) #0 { 
next:6'0                       X~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
          6:  %a = load i129, i129* %ptr, align 4 
next:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          7:  %res = udiv i129 %a, 3 
next:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~
next:6'1             ?                possible intended match
          8:  store i129 %res, i129* %out, align 4 
next:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          9:  ret void 
next:6'0     ~~~~~~~~~~
         10: } 
next:6'0     ~~
         11:  
next:6'0     ~
         12: attributes #0 = { nounwind } 
next:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

Breaks https://lab.llvm.org/buildbot/#/builders/85/builds/10520

******************** TEST 'LLVM :: Transforms/ExpandLargeDivRem/udiv129.ll' FAILED ********************
Script:
--
: 'RUN: at line 2';   /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/opt -S -expand-large-div-rem < /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll | /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll
--
Exit Code: 1

Command Output (stderr):
--
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll:6:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: _udiv-special-cases:
              ^
<stdin>:5:19: note: scanning from here
define void @test(i129* %ptr, i129* %out) #0 {
                  ^
<stdin>:7:9: note: possible intended match here
 %res = udiv i129 %a, 3
        ^

Input file: <stdin>
Check file: /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: ; ModuleID = '<stdin>' 
          2: source_filename = "<stdin>" 
          3:  
          4: ; Function Attrs: nounwind 
          5: define void @test(i129* %ptr, i129* %out) #0 { 
next:6'0                       X~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
          6:  %a = load i129, i129* %ptr, align 4 
next:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          7:  %res = udiv i129 %a, 3 
next:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~
next:6'1             ?                possible intended match
          8:  store i129 %res, i129* %out, align 4 
next:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          9:  ret void 
next:6'0     ~~~~~~~~~~
         10: } 
next:6'0     ~~
         11:  
next:6'0     ~
         12: attributes #0 = { nounwind } 
next:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

Thanks for letting me know, I already pushed a fix.

LGTM. Thanks @mgehre-amd for your great work to implementing this feature!

Agreed, thank you for this work! Do you agree that we can now bump BITINT_MAXWIDTH in Clang above 128 now, or are we still missing support for that (floating-point conversions were also a problem IIRC)?

LGTM. Thanks @mgehre-amd for your great work to implementing this feature!

Agreed, thank you for this work! Do you agree that we can now bump BITINT_MAXWIDTH in Clang above 128 now, or are we still missing support for that (floating-point conversions were also a problem IIRC)?

Thanks back to everyone who helped by reviewing!
Unfortunately, float-to-big-int and and big-int-to-float conversions still crash in the backend.
I want to take a look at this next.

LGTM. Thanks @mgehre-amd for your great work to implementing this feature!

Agreed, thank you for this work! Do you agree that we can now bump BITINT_MAXWIDTH in Clang above 128 now, or are we still missing support for that (floating-point conversions were also a problem IIRC)?

Thanks back to everyone who helped by reviewing!
Unfortunately, float-to-big-int and and big-int-to-float conversions still crash in the backend.
I want to take a look at this next.

Thank you for confirming and the offer to look into fixing up that support as well. It's greatly appreciated!

arsenm added inline comments.Sep 9 2022, 6:13 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
690–691

TargetTransformInfo isn't really the appropriate place to put something for a lowering decision. TargetLowering would make more sense

arsenm added inline comments.Sep 9 2022, 6:21 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
690–691

The name here is also misleading. It's not the max legal width, just the maximum codegen supports. 128 is still not really legal in the normal use of the term

mgehre-amd marked 2 inline comments as done.Sep 12 2022, 1:27 AM
mgehre-amd added inline comments.
llvm/include/llvm/Analysis/TargetTransformInfo.h
690–691

Hi @arsenm, thanks for pointing this out!

I prepared a PR to move this over to TargetLowering under the name maxSupportedDivRemBitWidth.
This works with the old pass manager, but I cannot figure out to get a TargetLowering/TargetPassConfig with the new pass manager.
Do you have an idea?

arsenm added inline comments.Sep 12 2022, 5:51 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
690–691

CodeGen is only using the old pass manager so it's a bit of moot point right now

mgehre-amd marked 2 inline comments as done.Sep 12 2022, 6:30 AM
mgehre-amd added inline comments.
llvm/include/llvm/Analysis/TargetTransformInfo.h
690–691

I opened https://reviews.llvm.org/D133691 to address this.

llvm/test/CodeGen/X86/i128-udiv.ll