This adds the ExpandLargeDivRem [0] to the default pass pipeline.
The limit at which it expands div/rem instructions is configured
via a new TargetTransformInfo hook (default: no expansion)
X86, Arm and AArch64 backends implement this hook to expand div/rem
instructions with more than 128 bits.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thank you for working on this @mgehre-amd!
Pinging for review; I'm hoping we can improve our _BitInt support with these changes (that I'm unqualified to review myself).
Yes, this PR introduces the hooks into TargetTransformInfo so targets can decide when/if the lowering needs to occur.
The tests are too prolix to check the correctness and it's fragile to use so many registers. I suggest we remove those tests and just do run-time time in llvm-test-suite.
I created one single source demo test: https://reviews.llvm.org/D132850, which is BitInt(256) divide test. Unit tests should be able to cover the pass and easy to review, @mgehre-amd will it help, WDYT? I can help extend the tests if it's the right direction.
Yes, you are right. I will restrict the IR tests to maybe just check for the absence of a call instructions? And at least to check that llc doesn't assert like it used to do.
Thank you, this looks like the right approach!
- Simplified tests to only check for NOT: call
- Rebased on main
- Disable expand-large-div-rem when the second operand of div/rem is a power-of-two constant. For those, the backend has peephole optimizations, see llvm/test/CodeGen/X86/i128-sdiv.ll
- Updated new test since rebasing:
- llvm/test/CodeGen/X86/div-rem-pair-recomposition-signed/unsigned.ll, llvm/test/CodeGen/X86/i128-udiv/sdiv.ll: Those tests were checking that call __divti3 was created on X86 for 128 bit divisions, which makes no sense because __divti3 is neither implemented by libgcc nor by compiler-rt on X86 in 32-bit mode.
- Removed test llvm/test/CodeGen/X86/libcall-sret.ll: It was checking that the backend doesn't crash when emitting a 128 bit libcall on 32-bit X86. Now that we don't emit calls to __divti3 on X86 anymore, there is no 128 bit libcall left to reproduce the issue.
- llvm/test/CodeGen/X86/pr38539.ll: Removed X86 section from f(); the comment says that f() is targeted at 64-bit mode- In 32 bit mode, we would now get a urem expansion.
I just updated the runtime test(https://reviews.llvm.org/D132850) for some extension, it works correct in my local environment. But the pre-merge build seems to not work correct, and I'm afraid if there are some Endian issues in my code. Anyway, it can be refined later.
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
294 | unsigned? https://lab.llvm.org/buildbot/#/builders/13/builds/25460 C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\cl.exe /nologo /TP -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\buildbot\mlir-x64-windows-ninja\build\lib\CodeGen -IC:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\lib\CodeGen -IC:\buildbot\mlir-x64-windows-ninja\build\include -IC:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\include /DWIN32 /D_WINDOWS /WX /Zc:inline /Zc:__cplusplus /Oi /bigobj /permissive- /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 /EHs-c- /GR- -UNDEBUG -std:c++17 /showIncludes /Folib\CodeGen\CMakeFiles\LLVMCodeGen.dir\LLVMTargetMachine.cpp.obj /Fdlib\CodeGen\CMakeFiles\LLVMCodeGen.dir\LLVMCodeGen.pdb /FS -c C:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\lib\CodeGen\LLVMTargetMachine.cpp C:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\include\llvm/Analysis/TargetTransformInfoImpl.h(295): error C2220: the following warning is treated as an error C:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\include\llvm/Analysis/TargetTransformInfoImpl.h(295): warning C4305: 'return': truncation from 'llvm::IntegerType::<unnamed-enum-MIN_INT_BITS>' to 'bool' |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
294 | Yes, sorry, I'm about to push a fix |
Breaks https://lab.llvm.org/buildbot/#/builders/85/builds/10520
******************** TEST 'LLVM :: Transforms/ExpandLargeDivRem/udiv129.ll' FAILED ******************** Script: -- : 'RUN: at line 2'; /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/opt -S -expand-large-div-rem < /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll | /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll -- Exit Code: 1 Command Output (stderr): -- /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll:6:15: error: CHECK-NEXT: expected string not found in input ; CHECK-NEXT: _udiv-special-cases: ^ <stdin>:5:19: note: scanning from here define void @test(i129* %ptr, i129* %out) #0 { ^ <stdin>:7:9: note: possible intended match here %res = udiv i129 %a, 3 ^ Input file: <stdin> Check file: /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll -dump-input=help explains the following input dump. Input was: <<<<<< 1: ; ModuleID = '<stdin>' 2: source_filename = "<stdin>" 3: 4: ; Function Attrs: nounwind 5: define void @test(i129* %ptr, i129* %out) #0 { next:6'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found 6: %a = load i129, i129* %ptr, align 4 next:6'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 7: %res = udiv i129 %a, 3 next:6'0 ~~~~~~~~~~~~~~~~~~~~~~~~ next:6'1 ? possible intended match 8: store i129 %res, i129* %out, align 4 next:6'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 9: ret void next:6'0 ~~~~~~~~~~ 10: } next:6'0 ~~ 11: next:6'0 ~ 12: attributes #0 = { nounwind } next:6'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>
Agreed, thank you for this work! Do you agree that we can now bump BITINT_MAXWIDTH in Clang above 128 now, or are we still missing support for that (floating-point conversions were also a problem IIRC)?
Thanks back to everyone who helped by reviewing!
Unfortunately, float-to-big-int and and big-int-to-float conversions still crash in the backend.
I want to take a look at this next.
Thank you for confirming and the offer to look into fixing up that support as well. It's greatly appreciated!
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
690–691 | TargetTransformInfo isn't really the appropriate place to put something for a lowering decision. TargetLowering would make more sense |
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
690–691 | The name here is also misleading. It's not the max legal width, just the maximum codegen supports. 128 is still not really legal in the normal use of the term |
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
690–691 | Hi @arsenm, thanks for pointing this out! I prepared a PR to move this over to TargetLowering under the name maxSupportedDivRemBitWidth. |
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
690–691 | CodeGen is only using the old pass manager so it's a bit of moot point right now |
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
690–691 | I opened https://reviews.llvm.org/D133691 to address this. |
TargetTransformInfo isn't really the appropriate place to put something for a lowering decision. TargetLowering would make more sense