Page MenuHomePhabricator

Refactor commoning of target specific load/store intrinsics in EarlyCSE
Needs ReviewPublic

Authored by ssijaric on Mar 13 2015, 2:45 AM.

Details

Summary

Introduce LoadSite and StoreSite to enable us to deal with target specific load/store intrinsics with characteristics similar to those of regular load and store instructions. Remove ParseMemoryInst from EarlyCSE, and replace it with LoadSite and StoreSite.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
reames added inline comments.Mar 30 2015, 5:22 PM
include/llvm/Analysis/TargetTransformInfo.h
471

The naming of these functions is inconsistent. I'm fine with either scheme, but please be consistent.

lib/Transforms/Scalar/EarlyCSE.cpp
469

Volatile != Simple. Please use appropriate naming to match underling instruction.

482

Er, is this a separate bugfix? I don't understand where this is coming from w.r.t. the requested refactor.

510

It really feels like the functionality for whether an intrinsic reads from memory should live inside mayReadFromMemory.

545

Given this is testing whether a location matches not whether the instruction matches, I would like to see something which made that explicit.

Hi Philip,

Thanks for reviewing these changes! I'll upload a new patch with context.

include/llvm/Analysis/TargetTransformInfo.h
471

The inconsistency in naming is to indicate that isTargetIntrinsicLikeLoad may be invoked on any intrinsic instruction, whereas isTargetMemoryIntrinsic..... may be invoked on intrinsics that are either like loads or like stores. The asserts in TargetTransformInfo.cpp check for these properties.

I renamed these from isTargetMemoryIntrinsic* to isTargetIntrinsic* in the new patch. I kept the asserts for now. Maybe they should be renamed to something like isTargetLoadOrStoreIntrinsicAtomic instead?

Alternatively, we could do away with different query functions, and just have something like

MemoryIntrinsicInfo getTargetMemoryIntrinsic(IntrinsicInst*);

with overloaded operator bool(), which would return true if the MemoryIntrinsicInfo contains a supported intrinsic. The overhead would be that we would probably need to keep one MemoryIntrinsicInfo struct in LoadStoreSiteBase. This could perhaps avoid any inconsistencies due to having different query functions.

include/llvm/IR/LoadStoreSite.h
1 ↗(On Diff #21905)

Yes, let's deal with renaming once other issues are addressed.

33 ↗(On Diff #21905)

As per your previous suggestion, let's deal with renaming once other issues are addressed.

69 ↗(On Diff #21905)

Done in the updated patch.

76 ↗(On Diff #21905)

Commoned in the updated patch.

99 ↗(On Diff #21905)

On AArch64 and ARM at least, store intrinsics are not marked as readnone by the front-end, and so mayReadFromMemory would always return true. For regular store instructions, mayReadFromMemory always returns false for non-atomic stores. This change allows the back-end to treat store intrinsics like regular stores.

I don't think it's legal to mark store intrinsics with readnone. When readnone is used on a function, then the function "does not write through any pointer arguments", according to the LLVM Language Reference Manual.

110 ↗(On Diff #21905)

I put this here for completeness, in case some other optimization makes use of this API. For example, it would return false when checking "store i32 %val, i32* %ptr" and "load %val2 = load i16, i16* %ptr2".

This mirrors the check for

%0 = bitcast i32* %a to i8*
call void @llvm.aarch64.neon.st2.v4i32.p0i8(<4 x i32> %3, <4 x i32> %4, i8* %0)

and

%5 = bitcast i32* %a to i8*
%vld3 = call { <4 x i32>, <4 x i32>, <4 x i32> } @llvm.aarch64.neon.ld3.v4i32.p0i8(i8* %5)

While the check for non-intrinsic loads and stores is not necessary when using this API in EaryCSE, I don't think I should just return "true" unconditionally for regular loads and stores.

146 ↗(On Diff #21905)

Changed in the updated patch.

lib/Analysis/TargetTransformInfo.cpp
269

This is supposed to check for memory ordering, similar to isAtomic() when invoked on regular loads and stores. The new name, isTargetIntrinsicAtomic(), is probably not any better. Is the comment in TargetTransformInfo.h enough?

lib/Target/AArch64/AArch64TargetTransformInfo.cpp
539

There are, but only a few intrinsics are supported for now (the ones in isTargetIntrinsicLikeLoad(...) and isTargetIntrinsicLikeStore(...)). The assert makes sure that we are dealing with supported load/store intrinsics.

Perhaps, a better way would be to just have one query function for load/store intrinsics. Something like

MemoryIntrinsicInfo getTargetMemoryIntrinsic(IntrinsicInst *Inst);

which would return all the needed information inside the MemoryIntrinsicInfo struct? Then we could do away with separate APIs, which may be confusing. The overhead would be the inclusion of this struct in LoadStoreSiteBase.

lib/Transforms/Scalar/EarlyCSE.cpp
469

Originally, the check was

// Ignore volatile loads.
if (!LI->isSimple())

The check "if (MemInst.isVolatile())" does the same check underneath, at least for regular loads and stores. The name of MemInst.isVolatile() is wrong - it should have been MemInst.isSimple()).

The new check

if (!LS.isSimple()) restores the original check that was in place.

The comment , " Ignore volatile loads." should probably be " Ignore volatile and atomic loads".

482

Yes, this bug was pointed out by Hal, and could affect Power intrinsics from what I understand. I thought I'd include it here. If you prefer, I can do this under a seperate bug fix once the refactor is sorted out.

510

AArch64 store intrinsics are not marked as readnone, and mayReadFromMemory will always return true (unlike for regular non-atomic stores). So, we wouldn't get a chance to common intrinsic stores. It doesn't look to be legal to mark store intrinsics as readnone, according to the LLVM Language Reference Manual.

545

The previous change, isMatchingMemLoc, has a misleading name. It checks for both the memory operand and the matching id. This is what LoadStoreSiteBase::Match does as well. I renamed LoadStoreSiteBase::Match to LoadStoreSiteBase::MatchInstructionAndLocation in the new patch.

567

MemInst.isVolatile() is not correctly named - it checks for !isSimple() for regular loads and stores.

Previous to MemInst change, the check was

if (SI->isSimple())

The new change, if (SS.isSimple()), restores the original check.

ssijaric updated this revision to Diff 23125.Apr 2 2015, 1:39 AM
ssijaric edited edge metadata.

A few changes to address Philip's comments for LoadSite, StoreSite and the target query functions. Uploading with full context.

hfinkel added inline comments.Apr 16 2015, 11:33 AM
include/llvm/Analysis/TargetTransformInfo.h
501

We have the IntrReadArgMem intrinsic property, how does this differ?

532

II->mayReadFromMemory()?

ssijaric added inline comments.Apr 16 2015, 5:23 PM
include/llvm/Analysis/TargetTransformInfo.h
501

From what I understand, IntrReadArgMem results in an intrinsic being annotated with the readonly attribute. However, the readonly attribute doesn't imply that the intrinsic has only one pointer argument.

532

I think I need to provide a better comment here. For regular, non-atomic stores, Instruction::mayReadFromMemory() returns false. This is not the case for calls unless they are marked with readnone, and store intrinsics are not marked with readnone.

A few standalone comments before I go through the entire patch again.

lib/Target/AArch64/AArch64TargetTransformInfo.cpp
539

I believe you've changed the name since the last version. Make sure the comments indicate that this only applies to target load and store intrinsics. I'm then fine with that.

lib/Transforms/Scalar/EarlyCSE.cpp
469

Update the comment please.

482

I would strongly prefer this be done separately.

510

You misunderstood my comment. I was suggesting that the *function* mayReadFromMemory should check whether the intrinsic is a target specific store and return false. I'm fine with this being a separate refactor, but this type of special case doesn't belong in a particular transform pass when the answer is the same for anyone who asks.

Ok, I'm running out of comments to make. The EarlyCSE code is much improved here; thank you for taking the time to do this.

I'm not entirely thrilled with how complex the target interface is, but that can be addressed incrementally. What you proposing appears to be correct and is now clearly documented enough to not be confusing.

If you do last one last update with all the naming and small things addressed, I'll do one more run through for a final LGTM. I considered giving a conditional LGTM now, but there a few too many issues that were deferred. (In particular, separation of unrelated bug fixes and naming.) The next round should be fast though.

include/llvm/IR/LoadStoreSite.h
41 ↗(On Diff #23125)

Leaving the member state unchanged after a call to setInstruction seems bug prone at best. Please either assert or clear the underlying data. (Looking through the calling code, I don't believe this is currently an issue. This is purely future proofing.)

lib/Target/AArch64/AArch64TargetTransformInfo.cpp
544

I would prefer you not make the variable const rather than need a const_cast here.

ssijaric updated this revision to Diff 32037.Aug 13 2015, 2:15 AM
ssijaric updated this object.

Sorry for taking so long to get back to this. This update addresses Philip's comments.

Sorry for taking so long to get back to this. The updated patch is revised to address previous comments. Also moved LoadStoreSite.h to include/llvm/Analysis, since it makes use of TargetTransformInfo.

include/llvm/IR/LoadStoreSite.h
41 ↗(On Diff #23125)

Addressed in the updated patch - the state is always updated.

lib/Target/AArch64/AArch64TargetTransformInfo.cpp
544

Done in the updated patch.

588

I changed this a little, so that mayTargetIntrinsicReadFromMemory and mayTargetIntrinsicWriteToMemory apply to all intrinsics that are supported by the target (if they choose to do so).

The default implementation in TargetTransformInfoImplBase is:

bool mayTargetIntrinsicReadFromMemory(const IntrinsicInst *II) {

return !cast<CallInst>(II)->doesNotAccessMemory();

}

I also introduced bool mayReadFromMemory(const Instruction *I) and bool mayWriteToMemory(const Instruction *I) to TTI that may invoke mayTargetIntrinsicReadFromMemory and mayTargetIntrinsicWriteToMemory, respectively. This lets us do away with the kludgy check in EarlyCSE.

lib/Transforms/Scalar/EarlyCSE.cpp
511–513

Updated in the new patch.

524

Will be done in the updated patch - the bug will be addressed separately. The bug can only affect non-AArch64 users, of which there are none as far I can see.

550

Thanks for clarifying. Reworked in the updated patch - the checks are now done using TTI.mayReadFromMemory(Inst).

ssijaric updated this revision to Diff 32169.Aug 14 2015, 11:40 AM

Fix a couple of function signatures.

ssijaric updated this revision to Diff 32266.Aug 16 2015, 10:29 PM
ssijaric edited edge metadata.

Upload with full context this time.

One final round of comments. This looks really really close to ready to check in. Please make the requested changes (and nothing else) and I'll give it a LGTM once the patch is updated. Any further enhancements should go in separate follow on patches.

include/llvm/Analysis/LoadStoreSite.h
48

auto

100

This API feels really awkward. For now, I'll ask that you pull this function out of the helper class and make it a static function in EarlyCSE. That at least localizes the awkwardness to the place it's used.

123

Same as previous.

include/llvm/Analysis/TargetTransformInfo.h
501

It looks like you may have merged two patches by accident here?

504

Important detail that needs to be in this comment: not all load like target instructions will return true. This is correct because the optimizer must treat intrinsics calls conservatively. i.e. this is an optimization hook the target can hook into, not a correctness hook.

551

Why not just mark the intrinsics with readonly or readnone? Unless you have a very good reason, please do that.

561

This is the wrong place for this API. It should probably be on LoadSiteBase. It doesn't need to be part of TTI.

568

Same as above.

include/llvm/Analysis/TargetTransformInfoImpl.h
342

A slightly clearer way to phrase these assertions would be that the intrinsic passes either the LikeLoad or LikeStore predicate. This allows a target to override those two and not have to update the others to get a conservatively correct answer.

lib/Target/AArch64/AArch64TargetTransformInfo.cpp
518

You should assert that the intrinsic is in either the load or store list. Otherwise, this could return an incorrect result for an unknown intrinsic.

Alternatively, it could be:
if intrinsic in [known list] return false;
else return true;

523

Same.

534

Submit separately please.

596

If you rephrase the assertions above, you can just call the existing check functions and remove all these comments. Please do so.

lib/Transforms/Scalar/EarlyCSE.cpp
312–315

Separable change. Please submit separately.

549

Ah, given where this is used, ignore my comment about putting this on LoadStoreBase. Making it a static function in EarlyCSE is better for the moment. Still not on TTI, but LSB isn't a good place either.

592

Use the assignment idiom as above.

I agree with Philip's comments (except, perhaps, for the one on which I've commented below).

include/llvm/Analysis/LoadStoreSite.h
100

Philip, why do you feel this is awkward? It seems that if the point of the abstraction is to hide the differentiation between regular load/stores and related target intrinsics. As such, it seems that this is an important part of the abstraction.

126

MatchInstructionAndLocation -> matchInstructionAndLocation

reames added inline comments.Sep 22 2015, 3:59 PM
include/llvm/Analysis/LoadStoreSite.h
100

Er, I've forgotten it's been so long since I looked at this. This was a minor comment and can be safely ignored. This can be addressed post commit if needed; I really don't want to block submission of this given how long it's been outstanding.

ssijaric added inline comments.Oct 6 2015, 12:33 PM
include/llvm/Analysis/LoadStoreSite.h
48

Will change to auto throughout in the updated patch.

100

I'll keep isCompatible in LoadStoreSite for now. I'll move matchInstructionAndLocation to EarlyCSE.

123

Will move it to EarlyCSE.

126

Will rename and move to EarlyCSE.

include/llvm/Analysis/TargetTransformInfo.h
501

Git format-patch added a bit of additional context - removing areInlineCompatible, and then adding it back in. The function areInlineCompatible is part of TTI already.

504

Will update in the new patch.

551

I did this mainly for consistency. I need mayTargetIntrinsicReadFromMemory to return false for store-like intrinsics (which can't have the readnone attribute set). So, I thought I'd introduce mayTargetIntrinsicWriteToMemory.

I'll remove mayTargetIntrinsicWriteToMemory in the updated patch, as load intrinsics coming from clang are already marked with the "readonly" attribute. (At least AArch64 ones are).

561

I put this function (and mayWriteToMemory) here so that we can use a single check in EarlyCSE, as in:

if (TTI.mayReadFromMemory(Inst))

...

Otherwise, if it's moved to LoadStoreBase, we have to check along the lines:

StoreSite S(Inst, TTI);
if (Inst->mayReadFromMemory() && !(StoreSite && !StoreSite.mayReadFromMemory())

I can move mayReadFromMemory(cons Instruction *I) to EarlyCSE?

568

Will move it to EarlyCSE.

include/llvm/Analysis/TargetTransformInfoImpl.h
342

I will remove the assert for isTargetIntrinsicAtomic, and call II->isAtomic() for the default implementation. For isTargetIntrinsicVolatile, I'll add the asserts if an intrinsic isn't like a load or a store, and return true otherwise to err on the conservative side.

I will keep asserts for getTargetIntrinsicPointerOperand, getOrCreateResultFromTargetIntrinsic and getTargetIntrinsicMatchingId, as these must be implemented by the target if they are called.

lib/Target/AArch64/AArch64TargetTransformInfo.cpp
518

I will add a check for load/store intrinsic for isTargetIntrinsicVolatile. For isTargetIntrinsicAtomic, I'll return Inst->isAtomic() in the base TTI implementation.

523

Will change this in the updated patch.

534

Will change in the updated patch.

596

I'll add a check for volatility and atomicity in mayTargetIntrinsicReadFromMemory in the updated patch, and remove the comments. I'll remove mayTargetIntrinsicWriteToMemory altogether as load intrinsics are marked as readonly.

lib/Transforms/Scalar/EarlyCSE.cpp
312–315

Will do in the updated patch.

549

Will make it a static function in the updated patch.

592

Will fix in the updated patch.

ssijaric updated this revision to Diff 36655.Oct 6 2015, 1:45 PM
ssijaric added a reviewer: aadg.
ssijaric updated this revision to Diff 36831.Oct 8 2015, 2:04 AM

Update to take into account Arnaud's bug fix.

hfinkel added inline comments.Feb 2 2016, 3:38 PM
include/llvm/Analysis/TargetTransformInfo.h
501

From what I understand, IntrReadArgMem results in an intrinsic being annotated with the readonly attribute. However, the readonly attribute doesn't imply that the intrinsic has only one pointer argument.

Not exactly. Such an intrinsic is read-only, but BasicAA also (automatically) learns the "arg mem" part of the property so that AA->getModRefBehavior can return FMRB_OnlyReadsArgumentPointees. Also, argmemonly is now a generic IR attribute that can be applied to any function.

518

If this is important, why not add function attributes for these?

reames resigned from this revision.Feb 25 2020, 9:02 AM

Resigning from a stale review (2016). Feel free to re-add if thread ever revived.