This is an archive of the discontinued LLVM Phabricator instance.

[ValueTracking][ScalarEvolution] improving llvm.assume's support for the argument value without context & reducing the result range of ScalarEvolution::getRange using computeConstantRange
Needs RevisionPublic

Authored by CaprYang on Jul 15 2023, 10:43 PM.

Details

Summary

When optimizing the following code using the HardwareLoops pass, I found that due to the inability to determine the range of the 'step' variable, conservative behavior was applied and the hardware loop was not generated.

opt -passes='hardware-loops<force-hardware-loops>' test.ll -S -o test_opt.ll
define void @TestHWLoops(ptr %out, i32 %step) {
entry:
  %gt0 = icmp sgt i32 %step, 0
  %lt10 = icmp slt i32 %step, 10
  tail call void @llvm.assume(i1 %gt0)
  tail call void @llvm.assume(i1 %lt10)
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %i = phi i32 [ 0, %entry ], [ %i.next, %for.body ]
  %arrayidx = getelementptr inbounds i32, ptr %out, i32 %i
  store i32 0, ptr %arrayidx, align 4
  %i.next = add i32 %i, %step
  %cmp = icmp slt i32 %i.next, 10
  br i1 %cmp, label %for.body, label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.body
  ret void
}

declare void @llvm.assume(i1)

I tried to add an assume statement to 'step' to ensure its range is [1,10), but it still did not take effect.

I observed two reasons for this:

  1. When checking if it is safe to transform the loop, ScalarEvolution::isKnownPositive is used, which ultimately calls ScalarEvolution::getRangeRef. However, in the case of scUnknown, assume is not effectively utilized to obtain range information. Therefore, I used ValueTracking's computeConstantRange here.
  2. In computeConstantRange, Arguments cannot set CtxI to themselves like regular Instructions. Additionally, Arguments do not need to worry about being deleted like regular instructions.

Using the following changes can correctly generate hardware loops:

define void @TestHWLoops(ptr %out, i32 %step) {
entry:
  %gt0 = icmp sgt i32 %step, 0
  %lt10 = icmp slt i32 %step, 10
  tail call void @llvm.assume(i1 %gt0)
  tail call void @llvm.assume(i1 %lt10)
  %0 = udiv i32 9, %step
  %1 = add nuw nsw i32 %0, 1
  call void @llvm.set.loop.iterations.i32(i32 %1)
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %i = phi i32 [ 0, %entry ], [ %i.next, %for.body ]
  %arrayidx = getelementptr inbounds i32, ptr %out, i32 %i
  store i32 0, ptr %arrayidx, align 4
  %i.next = add i32 %i, %step
  %2 = call i1 @llvm.loop.decrement.i32(i32 1)
  br i1 %2, label %for.body, label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.body
  ret void
}

; Function Attrs: nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: readwrite)
declare void @llvm.assume(i1 noundef) #0

; Function Attrs: nocallback noduplicate nofree nosync nounwind willreturn
declare void @llvm.set.loop.iterations.i32(i32) #1

; Function Attrs: nocallback noduplicate nofree nosync nounwind willreturn
declare i1 @llvm.loop.decrement.i32(i32) #1

attributes #0 = { nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: readwrite) }
attributes #1 = { nocallback noduplicate nofree nosync nounwind willreturn }

Diff Detail

Event Timeline

CaprYang created this revision.Jul 15 2023, 10:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2023, 10:43 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
CaprYang requested review of this revision.Jul 15 2023, 10:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2023, 10:43 PM
CaprYang updated this revision to Diff 540784.Jul 16 2023, 4:38 AM
CaprYang updated this revision to Diff 540819.Jul 16 2023, 10:45 AM
CaprYang edited the summary of this revision. (Show Details)

I'm not sure if my current changes are reasonable. If possible, I will add with tests and support more functions.

CaprYang updated this revision to Diff 540877.Jul 17 2023, 12:21 AM
CaprYang updated this revision to Diff 541030.Jul 17 2023, 7:46 AM
arsenm added inline comments.Jul 20 2023, 5:45 AM
llvm/lib/Analysis/ScalarEvolution.cpp
6868–6873 ↗(On Diff #541030)

This is a separate change

llvm/lib/Analysis/ValueTracking.cpp
103

An empty block isn't well formed IR, does it really need to handle that

120

Instead of pretending arguments are at the first instruction, could we just directly attach ranges to Arguments (by an attribute)

CaprYang added inline comments.Jul 30 2023, 11:07 AM
llvm/lib/Analysis/ScalarEvolution.cpp
6868–6873 ↗(On Diff #541030)

okey, i will split it into a separate commit at later

llvm/lib/Analysis/ValueTracking.cpp
103

improved

120

Use the attribute to set MD_range? Because the value range constraint may be influenced by other variable values, I think "assume" is more appropriate. Also, I think "assume" should be effective for all values, and the Arguments should not be an exception.
CtxI is set as the first instruction, because if it is effective for the first instruction, it will definitely be effective for subsequent instructions? maybe? And what would be the downside of doing this?

CaprYang updated this revision to Diff 545456.Jul 30 2023, 11:08 AM
nikic added a subscriber: nikic.Jul 30 2023, 12:22 PM

Please see D93974, D97077, D97099 and D97092 for existing attempts to tackle this problem. In particular, you'll want to read the discussion on the last one of these.

arsenm requested changes to this revision.Aug 14 2023, 2:49 PM

Should consider previous attempts

This revision now requires changes to proceed.Aug 14 2023, 2:49 PM

Please see D93974, D97077, D97099 and D97092 for existing attempts to tackle this problem. In particular, you'll want to read the discussion on the last one of these.

IIUC the fundemental issue in D97092 is there is no way to guarantee the assume won't be used to influence the condition actually being assumed. But if the scope of the change is limited to arguments, which have no dependencies that can be modified, isn't that concern voided. I get that that makes it so that we can literally make this work for i1 arguments, but maybe there is a separate way forward that protects assume(condition(arg)). Probably too big a hammer, but something like the lines of @llvm.icmp.<pred>.assume(lhs, rhs) seems like it would allow for assumes on arguments.