This is an archive of the discontinued LLVM Phabricator instance.

[SCCP] convert signed div/rem to unsigned for non-negative operands
ClosedPublic

Authored by spatel on Sep 2 2022, 5:08 AM.

Details

Summary

This extends the transform added with D81756 to handle div/rem opcodes. For example:
https://alive2.llvm.org/ce/z/cX6za6

This replicates part of what CVP already does, but the motivating example from issue #57472, demonstrates a phase ordering problem - we convert branches to select before CVP runs and miss the transform.

I didn't find an explanation for skipping a constant operand in the existing zext code (ie, sext i8 42 -> zext i8 42 is valid, but we skip it). So that clause is not included for this transform, but there is a test diff for that pattern in case it should be bypassed.

I'd split this into a refactoring NFC patch that adds the helper function, then add the code for sdiv/srem cases if approved. I can also add ashr->lshr as a follow-up.

Diff Detail

Event Timeline

spatel created this revision.Sep 2 2022, 5:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 2 2022, 5:08 AM
spatel requested review of this revision.Sep 2 2022, 5:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 2 2022, 5:08 AM
alex added a subscriber: alex.Sep 2 2022, 5:17 AM
BaoshanPang added inline comments.Sep 2 2022, 9:45 AM
llvm/test/Transforms/SCCP/divrem.ll
141

Should we add a 'TODO' here just like what we do for @sdiv_nonneg0_nonnegconst1 ?

spatel added inline comments.Sep 2 2022, 10:53 AM
llvm/test/Transforms/SCCP/divrem.ll
141

This one is not safe to transform - it needs 'nsw':
https://alive2.llvm.org/ce/z/VDuSrw

I can make that explicit in the comment.

BaoshanPang added inline comments.Sep 2 2022, 11:16 AM
llvm/test/Transforms/SCCP/divrem.ll
141

I see. Thanks for the example.
Then the patch looks good to me.

fhahn added a comment.Sep 2 2022, 1:40 PM

I didn't find an explanation for skipping a constant operand in the existing zext code (ie, sext i8 42 -> zext i8 42 is valid, but we skip it). So that clause is not included for this transform, but there is a test diff for that pattern in case it should be bypassed.

Do those operations have users left? If they can be simplified to a constant, I'd expect tryToReplaceWithConstant to take care of replacing all users of them.

I'd split this into a refactoring NFC patch that adds the helper function, then add the code for sdiv/srem cases if approved. I

Splitting off the helper first would make sense, overall this looks like a nice improvement.

spatel added a comment.Sep 3 2022, 7:20 AM

I didn't find an explanation for skipping a constant operand in the existing zext code (ie, sext i8 42 -> zext i8 42 is valid, but we skip it). So that clause is not included for this transform, but there is a test diff for that pattern in case it should be bypassed.

Do those operations have users left? If they can be simplified to a constant, I'd expect tryToReplaceWithConstant to take care of replacing all users of them.

Yes, the instruction still has a user. For example, when I step through this:

define i16 @srem_cmp_constants() {
  %sel = select i1 false, i16 0, i16 12704
  %r = srem i16 %sel, 0
  ret i16 %r
}

I see:

% opt -ipsccp sdiv.ll -S -debug
Args: opt -ipsccp sdiv.ll -S -debug 
Marking Block Executable: 

Popped off BBWL: 
  %sel = select i1 false, i16 0, i16 12704
  %r = srem i16 %sel, 0
  ret i16 %r

Merged constantrange<12704, 12705> into   %sel = select i1 false, i16 0, i16 12704 : constantrange<12704, 12705>

Popped off I-WL:   %sel = select i1 false, i16 0, i16 12704
RESOLVING UNDEFS
markOverdefined:   %r = srem i16 %sel, 0

Popped off OI-WL:   %r = srem i16 %sel, 0
Merged overdefined into define i16 @srem_cmp_constants() {
  %sel = select i1 false, i16 0, i16 12704
  %r = srem i16 %sel, 0
  ret i16 %r
}
 : overdefined

Popped off OI-WL: define i16 @srem_cmp_constants() {
  %sel = select i1 false, i16 0, i16 12704
  %r = srem i16 %sel, 0
  ret i16 %r
}

RESOLVING UNDEFS
  Constant: i16 12704 =   %sel = select i1 false, i16 0, i16 12704
; ModuleID = 'sdiv.ll'
source_filename = "sdiv.ll"

define i16 @srem_cmp_constants() {
  %r = srem i16 12704, 0
  ret i16 %r
}

So we are doing the RAUW, but the solver is not re-run on those uses to get the subsequent constant-folding?

spatel added a comment.Sep 3 2022, 7:28 AM

define i16 @srem_cmp_constants() {

%r = srem i16 12704, 0
ret i16 %r

}

So we are doing the RAUW, but the solver is not re-run on those uses to get the subsequent constant-folding?

Ah - just noticed that it all folds away except in this case with 0 divisor. That's treated as a special-case because it is immediate UB?

spatel updated this revision to Diff 457793.Sep 3 2022, 7:39 AM

Patch updated:
Rebased after refactoring (22e1f66f26b8).
Added explanatory comment to negative multiply test.

spatel updated this revision to Diff 457795.Sep 3 2022, 7:46 AM

Patch updated:
Missed adding a test comment in the previous revision.

Should we do this in Instcombine also? Like check Known.isNonNegative() for LHS and RHS in instcombine?
SCCP can work on the cases with dominated condition. And Instcombine with value tracking also help the case like abs or and a mask without sign bits?

Should we do this in Instcombine also? Like check Known.isNonNegative() for LHS and RHS in instcombine?
SCCP can work on the cases with dominated condition. And Instcombine with value tracking also help the case like abs or and a mask without sign bits?

It's already done:
https://github.com/llvm/llvm-project/blob/8534f514747d57cd42602403eb98c8230f5c7ff9/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp#L1218
https://github.com/llvm/llvm-project/blob/8534f514747d57cd42602403eb98c8230f5c7ff9/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp#L1607

We miss the transform in the motivating case because the dominating condition is converted to select. So we could also match select patterns in instcombine, but I like this as a more general solution.

spatel accepted this revision.Sep 6 2022, 5:27 AM

Based on previous "LGTM" comment, setting this to "accepted", so it won't complain on commit.

This revision is now accepted and ready to land.Sep 6 2022, 5:27 AM
fhahn added a comment.Sep 6 2022, 5:35 AM

Ah - just noticed that it all folds away except in this case with 0 divisor. That's treated as a special-case because it is immediate UB?

Yeah, probably not worth worrying about

fhahn added a comment.Sep 6 2022, 10:23 AM

It looks like this is causing a crash when building llvm-test-suite. Ive reverted the commit for now to unblock some bots.

To reproduce, run opt -passes=ipsccp on the IR below.

@g = internal global i32 256, align 4

define void @test() {
entry:
  %0 = load i32, ptr @g, align 4
  %div = sdiv i32 %0, undef
  ret void
}
fhahn added a comment.Sep 6 2022, 10:41 AM

I should have been clearer, this patch seems to cause the assertion below for the reproducer:

Assertion failed: (I != ValueState.end() && "V not found in ValueState nor Paramstate map!"), function getLatticeValueFor, file SCCPSolver.cpp, line 418.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: ..../bin/opt -ipsccp bugpoint-reduced-simplified.ll
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  opt                      0x0000000106285094 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  opt                      0x0000000106284094 llvm::sys::RunSignalHandlers() + 112
2  opt                      0x0000000106285720 SignalHandler(int) + 344
3  libsystem_platform.dylib 0x0000000197bc82a4 _sigtramp + 56
4  libsystem_pthread.dylib  0x0000000197b99cec pthread_kill + 288
5  libsystem_c.dylib        0x0000000197ad32c8 abort + 180
6  libsystem_c.dylib        0x0000000197ad2620 err + 0
7  opt                      0x00000001072a5cac llvm::SCCPInstVisitor::getLatticeValueFor(llvm::Value*) const (.cold.3) + 0
8  opt                      0x00000001063883ec llvm::SCCPSolver::getTrackedRetVals() + 0
9  opt                      0x000000010610dbb8 simplifyInstsInBlock(llvm::SCCPSolver&, llvm::BasicBlock&, llvm::SmallPtrSetImpl<llvm::Value*>&, llvm::TrackingStatistic&, llvm::TrackingStatistic&) + 964
10 opt                      0x000000010610c144 llvm::runIPSCCP(llvm::Module&, llvm::DataLayout const&, std::__1::function<llvm::TargetLibraryInfo const& (llvm::Function&)>, llvm::function_ref<llvm::AnalysisResultsForFn (llvm::Function&)>) + 1488
11 opt                      0x0000000105cac960 llvm::IPSCCPPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 200
12 opt                      0x0000000105a688e0 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 460
13 opt                      0x0000000104979b10 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::ArrayRef<llvm::PassPlugin>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool) + 12628
14 opt                      0x0000000104989eb4 main + 13084
15 dyld                     0x0000000197873e50 start + 2544
spatel added a comment.Sep 6 2022, 1:49 PM

I should have been clearer, this patch seems to cause the assertion below for the reproducer:

Assertion failed: (I != ValueState.end() && "V not found in ValueState nor Paramstate map!"), function getLatticeValueFor, file SCCPSolver.cpp, line 418.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: ..../bin/opt -ipsccp bugpoint-reduced-simplified.ll
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  opt                      0x0000000106285094 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  opt                      0x0000000106284094 llvm::sys::RunSignalHandlers() + 112
2  opt                      0x0000000106285720 SignalHandler(int) + 344
3  libsystem_platform.dylib 0x0000000197bc82a4 _sigtramp + 56
4  libsystem_pthread.dylib  0x0000000197b99cec pthread_kill + 288
5  libsystem_c.dylib        0x0000000197ad32c8 abort + 180
6  libsystem_c.dylib        0x0000000197ad2620 err + 0
7  opt                      0x00000001072a5cac llvm::SCCPInstVisitor::getLatticeValueFor(llvm::Value*) const (.cold.3) + 0
8  opt                      0x00000001063883ec llvm::SCCPSolver::getTrackedRetVals() + 0

Thanks for reverting. I didn't see any bot fail mails that correspond to this crash, but I can repro the minimal test crashing locally.
However, I'm still not sure what is going on in the internals of the solver. Strangely, there's no crash if I just insert the same test into the file with the other tests that were added for this patch. Something about the state of other tests prevents the crash.
Maybe this is why the existing code avoided transforming a constant-foldable instruction in the sext case? :)
If I add a clause like that here, it seems to avoid the trouble, but I'll try to step through some more and see if I can find the real cause of the problem.