This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
830–843	I think these two ifs are redundant with the `!isa<GlobalVariable>(PtrOpV)` check that's done below the while loop. It should be possible to just drop them.
853	This looks confused. The second element in the pair is not the IndexTypeSize, but already the Scale (as an APInt).
877	divission -> division
887–888	I think we can keep this non-optional. `Stride=1` is always valid as a fallback.

Thank you for the review!

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
830–843	Thanks for the good catch!
853	Sounds reasonable! (I thought this bit width casting is necessary to avoid mismatching on APInt ops, but it's not.)
887–888	Yeah, but inside the GCD calculation it seems better to leave Stride unset at first. I'll assign 1 where Stride is checked for "nullopt" afterwards!

Apply feedback.

Harbormaster completed remote builds in B230549: Diff 520257.May 7 2023, 11:59 PM

nikic added inline comments.May 8 2023, 3:59 AM

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
887–888	You are right that inside getStrideAndModOffsetOfGEP() we should start with `std::optional<APInt>`. My thinking is that we can use plain `APInt` outside it. What do you think about making `getStrideAndModOffsetOfGEP()` return `std::pair<APInt, APInt>` and then calling it like `auto [Stride, ConstOffset] = getStrideAndModOffsetOfGEP(PtrOp, DL)`? In that case, getStrideAndModOffsetOfGEP would return Stride=1, ConstOffset=0 for the case where it can't determine anything.
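For readers following along, the shape being proposed (optional inside the helper, plain values for the caller) can be sketched without LLVM. This is only an illustration under stated assumptions: `uint64_t` stands in for `APInt`, and `strideAndOffset` and its `Scales` parameter are hypothetical names, not part of the patch.

```cpp
#include <cstdint>
#include <numeric>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for getStrideAndModOffsetOfGEP(): Scales plays the
// role of the variable-index scales collected from the GEP chain, and
// uint64_t replaces APInt (an assumption made to keep the sketch standalone).
static std::pair<uint64_t, uint64_t>
strideAndOffset(const std::vector<uint64_t> &Scales, uint64_t ConstOffset) {
  std::optional<uint64_t> Stride; // unset until the first scale is seen
  for (uint64_t Scale : Scales)
    Stride = Stride ? std::gcd(*Stride, Scale) : Scale;

  // Nothing could be determined: fall back to Stride=1, ConstOffset=0.
  if (!Stride || *Stride == 0)
    return {1, 0};
  return {*Stride, ConstOffset % *Stride};
}
```

A caller can then write `auto [Stride, ConstOffset] = strideAndOffset(Scales, Offset);` and never has to deal with the optional.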

khei4 added inline comments.May 8 2023, 6:25 AM

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
887–888	Sounds good! I'll apply;)

khei4 added inline comments.May 8 2023, 6:30 AM

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
834–835	I noticed this failure can no longer be reported to the caller. I'm not sure in which cases GEPOperator::collectOffset returns false, but as we confirmed, we can always fall back to Stride 1 :)

nikic added inline comments.May 8 2023, 6:46 AM

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
834–835	I think you can replace it with `break;` so it falls through to the `!isa<GlobalVariable>(PtrOp)` handling. collectOffset() only fails for scalable vectors, so it's a very rare case...

Make Stride non-optional in the body of foldPatternedLoads (the caller function).

Logic looks good, some style nits. Can you please check the compile-time impact for this patch?

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
824–825
828	nit: Unnecessary empty line.
833
834
847–848	I don't think there's a need to create a separate APInt here?
855
880	I think you can move BW into the if below now, it's not used otherwise.
895	I think the `!Stride` isn't needed anymore.

khei4 added inline comments.May 8 2023, 8:06 AM

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
834–835	I see it now! That is shorter than what I did!

Thank you for the quick good catches!

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
847–848	Good catch! I believed this was necessary to match the bit width, but that was in another place...
880	The same goes for LoadTy! Thanks!

Apply feedback.

Harbormaster completed remote builds in B230635: Diff 520362.May 8 2023, 8:47 AM

Can you please check the compile-time impact for this patch?

https://llvm-compile-time-tracker.com/compare.php?from=4c457e81c4ed78e237b408fb480909a956432638&to=07a1587220d85d7cf7aff10518e7e3cce2784aca&stat=instructions:u

This seems to have no impact on compile time!

nikic accepted this revision.May 9 2023, 4:18 AM

This revision is now accepted and ready to land.May 9 2023, 4:18 AM

khei4 mentioned this in rG0574a4be879e: [AggressiveInstCombine] folding load for constant global patterened arrays and….May 9 2023, 7:22 AM

This is causing a compiler crash, which looks like a div by zero. I'll start working on a repro, but I'd like to revert to keep trunk green.

rupprecht mentioned this in rGe08c397a8807: Revert "[AggressiveInstCombine] folding load for constant global patterened….May 9 2023, 10:39 AM

In D146622#4330046, @rupprecht wrote:

This is causing a compiler crash, which looks like a div by zero. I'll start working on a repro, but I'd like to revert to keep trunk green.

Reverted in e08c397a88077c50dc25e71b39b9d5efbfc85a9a

The following crashes under opt -passes=aggressive-instcombine:

; ModuleID = 'repro.ll'
source_filename = "repro.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.snork = type { i128 }

@global = internal constant %struct.snork { i128 1 }

define i1 @snork() {
bb:
  %load = load i64, ptr @global, align 8
  %icmp = icmp sgt i64 %load, 0
  call void @llvm.assume(i1 %icmp)
  ret i1 false
}

; Function Attrs: nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: readwrite)
declare void @llvm.assume(i1 noundef) #0

attributes #0 = { nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: readwrite) }

Halide also found this: https://buildbot.halide-lang.org/master/#/builders/76/builds/64/steps/9/logs/stdio

@khei4 FYI, 0574a4be879e07b48ba9be8d63eebba49a04dfe8 somehow placed everything on one line:)

@rupprecht Thank you for handling the regression! Sorry for being late. And the repro seems to be a really common case...
The only place that uses division is L859 on https://reviews.llvm.org/rG0574a4be879e07b48ba9be8d63eebba49a04dfe8. I'll handle the zero-size case!

Edit: FYI, this is caused by missing handling of a 128-bit integer's type size. ~~no-GEPed load. -O2 or -O3 couldn't repro this, for the above case, normal ConstantFolding seems to eliminate this case.~~ -O2 or -O3 could repro this. I'll add this case to the AggressiveInstCombine test! Thank you for the report!
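For context, the division mentioned above is the signed remainder that reduces the accumulated constant offset modulo the stride. The sketch below shows that step in isolation, with `int64_t` in place of `APInt`. The helper name `normalizeModOffset` is hypothetical, and rejecting a non-positive stride is an assumption made for illustration; the final diff instead returns Stride=1, ConstOffset=0 when no stride was found.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: reduce a signed byte offset into the range [0, Stride).
// Returns std::nullopt for a non-positive stride so the caller never divides
// by zero (an assumption for this sketch, not the patch's exact behaviour).
static std::optional<int64_t> normalizeModOffset(int64_t Offset,
                                                 int64_t Stride) {
  if (Stride <= 0)
    return std::nullopt;
  int64_t Rem = Offset % Stride; // C++ remainder takes the sign of Offset
  if (Rem < 0)
    Rem += Stride; // shift a negative remainder back into [0, Stride)
  return Rem;
}
```

With a stride of 6, an offset of -4 is equivalent to an offset of 2, which is why the negative remainder is shifted by one stride.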

@MaskRay Maybe that's because I wrote it with just git commit -m. I'll rewrite it as well.

WIP notes:
This is clearly a div-by-0 error, and specifically something goes wrong with i128. I'm now investigating why special handling is needed for i128. Of the following test cases, only the last one crashes.

@g128 = internal constant i128 42
@g64 = internal constant i64 42
define i128 @no-gep-128-as128(i64 %idx){
  %1 = load i128, ptr @g128, align 4
  ret i128 %1
}
define i128 @no-gep-64-as128(i64 %idx){
  %1 = load i128, ptr @g64, align 4
  ret i128 %1
}
define i32 @no-gep-64-as32(i64 %idx){
  %1 = load i32, ptr @g64, align 4
  ret i32 %1
}
define i32 @no-gep-128-as32-crash(i64 %idx){
  %1 = load i32, ptr @g128, align 4
  ret i32 %1
}

Although I'm still wondering why the i128 load couldn't be folded by EarlyCSEPass:
https://llvm.godbolt.org/z/YbP4W3Y4x
I believe this will fix the div zero crashes.

add comment newlines

Harbormaster completed remote builds in B231079: Diff 520967.May 10 2023, 6:54 AM

The updated code looks good to me, but could you please also add the extra test case (that was previously crashing)?

And yes, the fact that the load from i128 constant does not fold is weird. I think it is due to this check: https://github.com/llvm/llvm-project/blob/0f800dfe036c12e1883586234bcae2be33d82920/llvm/lib/Analysis/ConstantFolding.cpp#L432 It would be nice to relax it so it can handle wider integers.

The updated code looks good to me, but could you please also add the extra test case (that was previously crashing)?

@nikic Thank you for reminding me! I added it. (Sorry, I had only mentioned it in the earlier comment.)

I think it is due to this check: https://github.com/llvm/llvm-project/blob/0f800dfe036c12e1883586234bcae2be33d82920/llvm/lib/Analysis/ConstantFolding.cpp#L432

Thanks a lot for finding that part! I'll try to patch it!

Harbormaster completed remote builds in B231491: Diff 521511.May 11 2023, 7:12 PM

nikic added inline comments.May 12 2023, 1:24 AM

llvm/test/Transforms/AggressiveInstCombine/patterned-load.ll
20	Could you please also rerun update_test_checks.py for the new test?

Update test. (It's embarrassing. Sorry...)

LGTM

Harbormaster completed remote builds in B231550: Diff 521582.May 12 2023, 2:27 AM

khei4 mentioned this in D150422: [ConstantFolding] fold integers whose bitwidth is greater than 64..May 12 2023, 3:03 AM

khei4 added a commit: rG39a0677784d1: [AggressiveInstCombine] folding load for constant global patterened arrays….May 13 2023, 2:52 AM

Sorry, I failed to link this revision with the following commit...
https://reviews.llvm.org/rG39a0677784d1b53f2d6e33af2a53e915f3f62c86

Revision Contents

Path	Size
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp	59 lines
llvm/test/Transforms/AggressiveInstCombine/patterned-load.ll	69 lines

Diff 521582

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp

@@ static bool foldConsecutiveLoads(Instruction &I, const DataLayout &DL,
   // shift if not zero.
   if (LOps.Shift)
     NewOp = Builder.CreateShl(NewOp, ConstantInt::get(I.getContext(), *LOps.Shift));
   I.replaceAllUsesWith(NewOp);

   return true;
 }

+// Calculate GEP Stride and accumulated const ModOffset. Return Stride and
+// ModOffset
+static std::pair<APInt, APInt>
+getStrideAndModOffsetOfGEP(Value *PtrOp, const DataLayout &DL) {
+  unsigned BW = DL.getIndexTypeSizeInBits(PtrOp->getType());
+  std::optional<APInt> Stride;
+  APInt ModOffset(BW, 0);
+  // Return a minimum gep stride, greatest common divisor of consective gep
+  // index scales(c.f. Bézout's identity).
+  while (auto *GEP = dyn_cast<GEPOperator>(PtrOp)) {
+    MapVector<Value *, APInt> VarOffsets;
+    if (!GEP->collectOffset(DL, BW, VarOffsets, ModOffset))
+      break;
+
+    for (auto [V, Scale] : VarOffsets) {
+      // Only keep a power of two factor for non-inbounds
+      if (!GEP->isInBounds())
+        Scale = APInt::getOneBitSet(Scale.getBitWidth(), Scale.countr_zero());
+
+      if (!Stride)
+        Stride = Scale;
+      else
+        Stride = APIntOps::GreatestCommonDivisor(*Stride, Scale);
+    }
+
+    PtrOp = GEP->getPointerOperand();
+  }
+
+  // Check whether pointer arrives back at Global Variable via at least one GEP.
+  // Even if it doesn't, we can check by alignment.
+  if (!isa<GlobalVariable>(PtrOp) || !Stride)
+    return {APInt(BW, 1), APInt(BW, 0)};
+
+  // In consideration of signed GEP indices, non-negligible offset become
+  // remainder of division by minimum GEP stride.
+  ModOffset = ModOffset.srem(*Stride);
+  if (ModOffset.isNegative())
+    ModOffset += *Stride;
+
+  return {*Stride, ModOffset};
+}
+
 /// If C is a constant patterned array and all valid loaded results for given
 /// alignment are same to a constant, return that constant.
 static bool foldPatternedLoads(Instruction &I, const DataLayout &DL) {
   auto *LI = dyn_cast<LoadInst>(&I);
   if (!LI || LI->isVolatile())
     return false;

   // We can only fold the load if it is from a constant global with definitive
   // initializer. Skip expensive logic if this is not the case.
   auto *PtrOp = LI->getPointerOperand();
   auto *GV = dyn_cast<GlobalVariable>(getUnderlyingObject(PtrOp));
   if (!GV || !GV->isConstant() || !GV->hasDefinitiveInitializer())
     return false;

-  Type *LoadTy = LI->getType();
-  Constant *C = GV->getInitializer();
-
   // Bail for large initializers in excess of 4K to avoid too many scans.
+  Constant *C = GV->getInitializer();
   uint64_t GVSize = DL.getTypeAllocSize(C->getType());
   if (!GVSize || 4096 < GVSize)
     return false;

-  // Check whether pointer arrives back at Global Variable.
-  // If PtrOp is neither GlobalVariable nor GEP, it might not arrive back at
-  // GlobalVariable.
-  // TODO: implement GEP handling
+  Type *LoadTy = LI->getType();
   unsigned BW = DL.getIndexTypeSizeInBits(PtrOp->getType());
-  // TODO: Determine stride based on GEPs.
-  APInt Stride(BW, 1);
-  APInt ConstOffset(BW, 0);
+  auto [Stride, ConstOffset] = getStrideAndModOffsetOfGEP(PtrOp, DL);

   // Any possible offset could be multiple of GEP stride. And any valid
   // offset is multiple of load alignment, so checking only multiples of bigger
   // one is sufficient to say results' equality.
   if (auto LA = LI->getAlign();
-      LA <= GV->getAlign().valueOrOne() && Stride.getZExtValue() < LA.value())
+      LA <= GV->getAlign().valueOrOne() && Stride.getZExtValue() < LA.value()) {
+    ConstOffset = APInt(BW, 0);
     Stride = APInt(BW, LA.value());
+  }

   Constant *Ca = ConstantFoldLoadFromConst(C, LoadTy, ConstOffset, DL);
   if (!Ca)
     return false;

   unsigned E = GVSize - DL.getTypeStoreSize(LoadTy);
   for (; ConstOffset.getZExtValue() <= E; ConstOffset += Stride)
     if (Ca != ConstantFoldLoadFromConst(C, LoadTy, ConstOffset, DL))

[78 further lines not shown]
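The scan at the end of foldPatternedLoads can be illustrated on raw bytes. The following is a minimal standalone sketch, not the pass itself: it assumes a fixed 4-byte little-endian load, and `loadLE32` and `foldPatterned` are hypothetical names that replace `ConstantFoldLoadFromConst` and the surrounding loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: read a 32-bit little-endian value at byte offset Off.
static uint32_t loadLE32(const std::vector<uint8_t> &Bytes, size_t Off) {
  uint32_t V = 0;
  for (size_t I = 0; I < 4; ++I)
    V |= static_cast<uint32_t>(Bytes[Off + I]) << (8 * I);
  return V;
}

// Returns the common value if every 4-byte load at Offset, Offset + Stride,
// ... that still fits inside Bytes yields the same result, and std::nullopt
// otherwise. This mirrors the idea of the loop only; sizes are assumptions.
static std::optional<uint32_t> foldPatterned(const std::vector<uint8_t> &Bytes,
                                             size_t Offset, size_t Stride) {
  if (Stride == 0 || Bytes.size() < 4 || Offset > Bytes.size() - 4)
    return std::nullopt;
  const uint32_t First = loadLE32(Bytes, Offset);
  for (size_t Off = Offset; Off <= Bytes.size() - 4; Off += Stride)
    if (loadLE32(Bytes, Off) != First)
      return std::nullopt;
  return First;
}
```

On the bytes of @constarray1 (01 00 repeated four times) with offset 0 and stride 2, every load reads 01 00 01 00, which is 65537 in little-endian order; that matches the LE value expected by the tests below.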

llvm/test/Transforms/AggressiveInstCombine/patterned-load.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -passes=aggressive-instcombine -S -data-layout="e" | FileCheck %s --check-prefixes=CHECK,LE		; RUN: opt < %s -passes=aggressive-instcombine -S -data-layout="e" | FileCheck %s --check-prefixes=CHECK,LE
; RUN: opt < %s -passes=aggressive-instcombine -S -data-layout="E" | FileCheck %s --check-prefixes=CHECK,BE		; RUN: opt < %s -passes=aggressive-instcombine -S -data-layout="E" | FileCheck %s --check-prefixes=CHECK,BE


@constarray1 = internal constant [8 x i8] c"\01\00\01\00\01\00\01\00", align 4		@constarray1 = internal constant [8 x i8] c"\01\00\01\00\01\00\01\00", align 4
@constarray2 = internal constant [8 x i8] c"\FF\FF\01\00\01\00\01\00", align 4		@constarray2 = internal constant [8 x i8] c"\FF\FF\01\00\01\00\01\00", align 4

@g = internal constant i32 42		@g = internal constant i32 42
@constptrarray = internal constant [4 x ptr] [ptr @g, ptr @g, ptr @g, ptr @g], align 4		@constptrarray = internal constant [4 x ptr] [ptr @g, ptr @g, ptr @g, ptr @g], align 4

@constpackedstruct = internal constant <{[8 x i8]}> <{[8 x i8] c"\01\00\01\00\01\00\01\00"}>, align 4		@constpackedstruct = internal constant <{[8 x i8]}> <{[8 x i8] c"\01\00\01\00\01\00\01\00"}>, align 4
@conststruct = internal constant {i16, [8 x i8]} {i16 1, [8 x i8] c"\01\00\01\00\01\00\01\00"}, align 4		@conststruct = internal constant {i16, [8 x i8]} {i16 1, [8 x i8] c"\01\00\01\00\01\00\01\00"}, align 4

		%struct = type { i128 }
		@global = internal constant %struct { i128 1 }
		; TODO: this should be folded, but currently i128 is not folded.
		define i32 @no-gep-128-struct(i64 %idx){
		; CHECK-LABEL: @no-gep-128-struct(
		; CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr @global, align 4
		; CHECK-NEXT: ret i32 [[TMP1]]
		;
		%1 = load i32, ptr @global, align 4
		ret i32 %1
		}

define i8 @inbounds_gep_load_i8_align2(i64 %idx){		define i8 @inbounds_gep_load_i8_align2(i64 %idx){
; CHECK-LABEL: @inbounds_gep_load_i8_align2(		; CHECK-LABEL: @inbounds_gep_load_i8_align2(
; CHECK-NEXT: ret i8 1		; CHECK-NEXT: ret i8 1
;		;
%1 = getelementptr inbounds i8, ptr @constarray1, i64 %idx		%1 = getelementptr inbounds i8, ptr @constarray1, i64 %idx
%2 = load i8, ptr %1, align 2		%2 = load i8, ptr %1, align 2
ret i8 %2		ret i8 %2
}		}
Show All 20 Lines	;
%1 = getelementptr inbounds i8, ptr @constarray1, i64 %idx		%1 = getelementptr inbounds i8, ptr @constarray1, i64 %idx
%2 = load volatile i8, ptr %1, align 2		%2 = load volatile i8, ptr %1, align 2
ret i8 %2		ret i8 %2
}		}

declare ptr @llvm.ptrmask.p0.i64(ptr , i64)		declare ptr @llvm.ptrmask.p0.i64(ptr , i64)

; can't be folded because ptrmask can change ptr, while preserving provenance		; can't be folded because ptrmask can change ptr, while preserving provenance
define i8 @inbounds_gep_load_i8_align2_ptrmasked(i64 %idx, i64 %mask){		; This invalidates GEP indices analysis
; CHECK-LABEL: @inbounds_gep_load_i8_align2_ptrmasked(		define i8 @inbounds_gep_load_i16_align1_ptrmasked(i64 %idx, i64 %mask){
; CHECK-NEXT: ret i8 1		; CHECK-LABEL: @inbounds_gep_load_i16_align1_ptrmasked(
		; CHECK-NEXT: [[TMP1:%.*]] = call ptr @llvm.ptrmask.p0.i64(ptr @constarray1, i64 [[MASK:%.*]])
		; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i16, ptr [[TMP1]], i64 [[IDX:%.*]]
		; CHECK-NEXT: [[TMP3:%.*]] = load i8, ptr [[TMP2]], align 1
		; CHECK-NEXT: ret i8 [[TMP3]]
;		;
%1 = call ptr @llvm.ptrmask.p0.i64(ptr @constarray1, i64 %mask)		%1 = call ptr @llvm.ptrmask.p0.i64(ptr @constarray1, i64 %mask)
%2 = getelementptr inbounds i8, ptr %1, i64 %idx		%2 = getelementptr inbounds i16, ptr %1, i64 %idx
%3 = load i8, ptr %2, align 2		%3 = load i8, ptr %2, align 1
ret i8 %3		ret i8 %3
		khei4: Tweaked this case not to be folded by alignment.
}		}

; TODO: this will be ret i32 65537(LE), 16777472(BE)
define i32 @inbounds_gep_i16_load_i32_align1(i64 %idx){		define i32 @inbounds_gep_i16_load_i32_align1(i64 %idx){
; CHECK-LABEL: @inbounds_gep_i16_load_i32_align1(		; LE-LABEL: @inbounds_gep_i16_load_i32_align1(
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i16, ptr @constarray1, i64 [[IDX:%.*]]		; LE-NEXT: ret i32 65537
; CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[TMP1]], align 1		;
; CHECK-NEXT: ret i32 [[TMP2]]		; BE-LABEL: @inbounds_gep_i16_load_i32_align1(
		; BE-NEXT: ret i32 16777472
;		;
%1 = getelementptr inbounds i16, ptr @constarray1, i64 %idx		%1 = getelementptr inbounds i16, ptr @constarray1, i64 %idx
%2 = load i32, ptr %1, align 1		%2 = load i32, ptr %1, align 1
ret i32 %2		ret i32 %2
}		}

; TODO: this will be ret i32 65537(LE), 16777472(BE)
define i32 @inbounds_gep_i32_load_i32_align8(i64 %idx){		define i32 @inbounds_gep_i32_load_i32_align8(i64 %idx){
; CHECK-LABEL: @inbounds_gep_i32_load_i32_align8(		; LE-LABEL: @inbounds_gep_i32_load_i32_align8(
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr @constarray1, i64 [[IDX:%.*]]		; LE-NEXT: ret i32 65537
; CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[TMP1]], align 8		;
; CHECK-NEXT: ret i32 [[TMP2]]		; BE-LABEL: @inbounds_gep_i32_load_i32_align8(
		; BE-NEXT: ret i32 16777472
;		;
%1 = getelementptr inbounds i32, ptr @constarray1, i64 %idx		%1 = getelementptr inbounds i32, ptr @constarray1, i64 %idx
%2 = load i32, ptr %1, align 8		%2 = load i32, ptr %1, align 8
ret i32 %2		ret i32 %2
}		}

; TODO: this will be ret i32 65547(LE), 16777472(BE)
define i32 @inbounds_gep_i32_load_i32_const_offset(i64 %idx){		define i32 @inbounds_gep_i32_load_i32_const_offset(i64 %idx){
; CHECK-LABEL: @inbounds_gep_i32_load_i32_const_offset(		; LE-LABEL: @inbounds_gep_i32_load_i32_const_offset(
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i16, ptr @constarray2, i64 1		; LE-NEXT: ret i32 65537
; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i64 [[IDX:%.*]]		;
; CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[TMP2]], align 4		; BE-LABEL: @inbounds_gep_i32_load_i32_const_offset(
; CHECK-NEXT: ret i32 [[TMP3]]		; BE-NEXT: ret i32 16777472
;		;
%1 = getelementptr inbounds i16, ptr @constarray2, i64 1		%1 = getelementptr inbounds i16, ptr @constarray2, i64 1
%2 = getelementptr inbounds i32, ptr %1, i64 %idx		%2 = getelementptr inbounds i32, ptr %1, i64 %idx
%3 = load i32, ptr %2, align 4		%3 = load i32, ptr %2, align 4
ret i32 %3		ret i32 %3
}		}

define i32 @gep_load_i32_align2_const_offset(i64 %idx){		define i32 @gep_load_i32_align2_const_offset(i64 %idx){
Show All 20 Lines
; CHECK-NEXT: ret i32 [[TMP3]]		; CHECK-NEXT: ret i32 [[TMP3]]
;		;
%1 = getelementptr i16, ptr @constarray2, i64 -2		%1 = getelementptr i16, ptr @constarray2, i64 -2
%2 = getelementptr [3 x i16], ptr %1, i64 %idx		%2 = getelementptr [3 x i16], ptr %1, i64 %idx
%3 = load i32, ptr %2, align 2		%3 = load i32, ptr %2, align 2
ret i32 %3		ret i32 %3
}		}

; TODO: this will be ret i32 42
define i32 @inbounds_gep_i32_load_i32_const_ptr_array(i64 %idx){		define i32 @inbounds_gep_i32_load_i32_const_ptr_array(i64 %idx){
; CHECK-LABEL: @inbounds_gep_i32_load_i32_const_ptr_array(		; CHECK-LABEL: @inbounds_gep_i32_load_i32_const_ptr_array(
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds ptr, ptr @constptrarray, i64 [[IDX:%.*]]		; CHECK-NEXT: ret i32 42
; CHECK-NEXT: [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 4
; CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[TMP2]], align 4
; CHECK-NEXT: ret i32 [[TMP3]]
;		;
%1 = getelementptr inbounds ptr, ptr @constptrarray, i64 %idx		%1 = getelementptr inbounds ptr, ptr @constptrarray, i64 %idx
%2 = load ptr, ptr %1, align 4		%2 = load ptr, ptr %1, align 4
%3 = load i32, ptr %2, align 4		%3 = load i32, ptr %2, align 4
ret i32 %3		ret i32 %3
}		}

define i32 @inbounds_gep_i32_load_i32_align4_packedstruct(i64 %idx){		define i32 @inbounds_gep_i32_load_i32_align4_packedstruct(i64 %idx){
Show All 15 Lines
; CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[TMP1]], align 1		; CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[TMP1]], align 1
; CHECK-NEXT: ret i32 [[TMP2]]		; CHECK-NEXT: ret i32 [[TMP2]]
;		;
%1 = getelementptr inbounds i8, ptr @constpackedstruct, i64 %idx		%1 = getelementptr inbounds i8, ptr @constpackedstruct, i64 %idx
%2 = load i32, ptr %1, align 1		%2 = load i32, ptr %1, align 1
ret i32 %2		ret i32 %2
}		}

; TODO: this coould be folded into 65537(LE), 16777472(BE)
define i32 @inbounds_gep_i32_load_i32_align4_struct_with_const_offset(i64 %idx){		define i32 @inbounds_gep_i32_load_i32_align4_struct_with_const_offset(i64 %idx){
; LE-LABEL: @inbounds_gep_i32_load_i32_align4_struct_with_const_offset(		; LE-LABEL: @inbounds_gep_i32_load_i32_align4_struct_with_const_offset(
; LE-NEXT: ret i32 65537		; LE-NEXT: ret i32 65537
;		;
; BE-LABEL: @inbounds_gep_i32_load_i32_align4_struct_with_const_offset(		; BE-LABEL: @inbounds_gep_i32_load_i32_align4_struct_with_const_offset(
; BE-NEXT: [[TMP1:%.*]] = getelementptr inbounds i16, ptr @conststruct, i64 1		; BE-NEXT: ret i32 16777472
; BE-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i64 [[IDX:%.*]]
; BE-NEXT: [[TMP3:%.*]] = load i32, ptr [[TMP2]], align 4
; BE-NEXT: ret i32 [[TMP3]]
;		;
%1 = getelementptr inbounds i16, ptr @conststruct, i64 1		%1 = getelementptr inbounds i16, ptr @conststruct, i64 1
%2 = getelementptr inbounds i32, ptr %1, i64 %idx		%2 = getelementptr inbounds i32, ptr %1, i64 %idx
%3 = load i32, ptr %2, align 4		%3 = load i32, ptr %2, align 4
ret i32 %3		ret i32 %3
}		}

[AggressiveInstCombine] folding load for constant global patterened arrays and structs by GEP indices
Closed, Public