Page MenuHomePhabricator

Commoning of target specific load/store intrinsics in Early CSE
ClosedPublic

Authored by ssijaric on Jan 22 2015, 12:48 AM.

Details

Reviewers
pete
hfinkel
Summary

Common target specific load/store intrinsics in Early CSE. Currently, only a few AArch64 load/store intrinsics are supported (ld2, st2, etc). Updated patch based on Pete and Hal's reviews. Uploading to Phabricator this time.

Thanks,
Sanjin

Diff Detail

Event Timeline

ssijaric updated this revision to Diff 18588.Jan 22 2015, 12:48 AM
ssijaric retitled this revision from to Commoning of target specific load/store intrinsics in Early CSE.
ssijaric updated this object.
ssijaric edited the test plan for this revision. (Show Details)
ssijaric added reviewers: pete, hfinkel.
ssijaric added a subscriber: Unknown Object (MLST).
pete edited edge metadata.Jan 22 2015, 10:07 AM

Thanks for fixing all my prior comments.

The patch LGTM with just the few additional comments I made. I'm no expert on this part of the compiler though, so it would be best if Hal or someone else can give the final LGTM.

Thanks,
Pete

include/llvm/Analysis/TargetTransformInfo.h
39

Please use \brief here instead of 'MemIntrinsicInfo -'

43

You need to initialize Vol and NumMemRefs here

463

I think this comment should say it returns a value not an instruction. I think its possible for your code to return a Value here if the st2 intrinsic was storing constants for example.

lib/Transforms/Scalar/EarlyCSE.cpp
408

I think this can also be getPointerOperand() as you've done on the store

ssijaric updated this revision to Diff 18645.Jan 22 2015, 5:34 PM
ssijaric edited edge metadata.

Diff contains a few fixes to address Pete's latest review.

Thanks for the review, Pete. New patch uploaded.

include/llvm/Analysis/TargetTransformInfo.h
39

Changed in the updated patch.

43

Thanks for catching this. Fixed in the updated patch.

463

Yes, changed description in the updated patch.

lib/Transforms/Scalar/EarlyCSE.cpp
408

Yes, changed in the updated patch.

ssijaric updated this revision to Diff 18646.Jan 22 2015, 6:11 PM

Fix initialization list order.

hfinkel accepted this revision.Jan 24 2015, 6:50 AM
hfinkel edited edge metadata.

LGTM too.

lib/Transforms/Scalar/EarlyCSE.cpp
478

Don't add an extra blank line here.

This revision is now accepted and ready to land.Jan 24 2015, 6:50 AM
ssijaric updated this revision to Diff 18727.Jan 25 2015, 1:24 AM
ssijaric edited edge metadata.

Remove a blank line and rebase the change.

Thanks for the review, Hal.

lib/Transforms/Scalar/EarlyCSE.cpp
478

Removed in the latest update.

Okay (there was no need to repost this for just the whitespace change); feel free to commit.

Hi Hal,

I don't have commit rights. Can you or Chad please commit?

Thanks,
Sanjin

mcrosier closed this revision.Jan 26 2015, 2:54 PM
mcrosier added a subscriber: mcrosier.

Committed r227149.

reames added a subscriber: reames.Feb 13 2015, 3:54 PM

Sorry to reopen a long dead review thread, but I had a reason to look at EarlyCSE today and just noticed the changes.

I feel the introduction of the ParseMemoryInst class has badly obscured the logic of this transform. I get that you wanted to abstract over the existence of both normal loads and stores and target specific loads and stores, but the extra layer of abstraction makes this code substantially harder to follow.

Particular areas of concern include:

  • isValid checks as opposed to dyn_cast, cast style tests (i.e. less error checking!)
  • MatchingId - what does this even do? It's not documented at all.
  • isVolatile is confused with isSimple in problematic ways. These are related but distinct concepts.
  • The usage of fields in the object appear unnecessary. Simply dispatching by the contained instruction type would be far more clear and less error prone.

A potentially clearer design would have been to introduce a helper class for each type of operation: load, and store. Like CallSite, each helper class could simply proxy to the underlying instruction/information based on the type of the instruction so contained.

Sorry to reopen a long dead review thread, but I had a reason to look at EarlyCSE today and just noticed the changes.

I feel the introduction of the ParseMemoryInst class has badly obscured the logic of this transform. I get that you wanted to abstract over the existence of both normal loads and stores and target specific loads and stores, but the extra layer of abstraction makes this code substantially harder to follow.

Particular areas of concern include:

  • isValid checks as opposed to dyn_cast, cast style tests (i.e. less error checking!)

Can you please provide an example of what you mean?

  • MatchingId - what does this even do? It's not documented at all.

There is a comment in the TTI header explaining what this does:

// Same Id is set by the target for corresponding load/store intrinsics.
unsigned short MatchingId;

and there is a comment in the implementation too:

// For regular (non-intrinsic) loads/stores, this is set to -1. For
// intrinsic loads/stores, the id is retrieved from the corresponding
// field in the MemIntrinsicInfo structure.  That field contains
// non-negative values only.
int MatchingId;
  • isVolatile is confused with isSimple in problematic ways. These are related but distinct concepts.

I assume you're talking about this:

Vol = !LI->isSimple();

FWIW, I don't believe there was a behavioral change here. We can certainly re-name these variables.

  • The usage of fields in the object appear unnecessary. Simply dispatching by the contained instruction type would be far more clear and less error prone.

    A potentially clearer design would have been to introduce a helper class for each type of operation: load, and store. Like CallSite, each helper class could simply proxy to the underlying instruction/information based on the type of the instruction so contained.

I'm not sure. There is common behavior between load intrinsics and loads, store intrinsics and stores, etc.

Sorry to reopen a long dead review thread, but I had a reason to look at EarlyCSE today and just noticed the changes.

I feel the introduction of the ParseMemoryInst class has badly obscured the logic of this transform. I get that you wanted to abstract over the existence of both normal loads and stores and target specific loads and stores, but the extra layer of abstraction makes this code substantially harder to follow.

Particular areas of concern include:

  • isValid checks as opposed to dyn_cast, cast style tests (i.e. less error checking!)

Can you please provide an example of what you mean?

If you write
if (LoadInst *LI = dyn_cast<LoadInst>(I))
or even:
if (isa<LoadInst>(I)) { LI = cast<LoadInst>(I); ... }

You are pretty much guaranteed that LI is in fact a load.

With ParseMemoryInst(I).isLoad() there's no self check. You could see a path being added that set Load=true, but forgot to actually save the instruction, or similar things.

To be clear, I am *not* stating the current impl is buggy, just that its more error prone going forward.

  • MatchingId - what does this even do? It's not documented at all.

There is a comment in the TTI header explaining what this does:

// Same Id is set by the target for corresponding load/store intrinsics.
unsigned short MatchingId;

and there is a comment in the implementation too:

Neither of these comments describe the *semantics*. What is *meaning* or *usage* of the matching id? Does it effect must aliasing rules? (Serious question here, I can't tell what it actually means.)

// For regular (non-intrinsic) loads/stores, this is set to -1. For
// intrinsic loads/stores, the id is retrieved from the corresponding
// field in the MemIntrinsicInfo structure.  That field contains
// non-negative values only.
int MatchingId;
  • isVolatile is confused with isSimple in problematic ways. These are related but distinct concepts.

I assume you're talking about this:

Vol = !LI->isSimple();

FWIW, I don't believe there was a behavioral change here. We can certainly re-name these variables.

Thanks. A rename is definitely appropriate.

  • The usage of fields in the object appear unnecessary. Simply dispatching by the contained instruction type would be far more clear and less error prone.

    A potentially clearer design would have been to introduce a helper class for each type of operation: load, and store. Like CallSite, each helper class could simply proxy to the underlying instruction/information based on the type of the instruction so contained.

I'm not sure. There is common behavior between load intrinsics and loads, store intrinsics and stores, etc.

I think you're misreading my suggestion. Here's some psuedo code:
class LoadSite {

Instruction *Inst;
LoadSite(Instruction* I);

operator bool() const;

bool isSimple() {
  if(LoadInst *LI = dyn_cast<LoadInst>) return LI->isSimple();
  assert is target intrinsic
  return TTI->getTarget...().isVolatile;

}
}

if (LoadSite LS = I) {
...
}
class StoreSite { ... };

if (StoreSite SS = I)

Having the abstraction is fine, it should just be minimal. In particular, the abstraction itself shouldn't be stateful.

Given there's little shared between loads (of any type) and stores (of any type), I think these should be separate.

Hi Philip,

Thanks for the review. Just to briefly add onto Hal's comment.

  • isVolatile is confused with isSimple in problematic ways. These are related but distinct concepts.

Yes, this should be renamed to avoid confusion. There is no behavioral change, as Hal noted.

  • MatchingId - what does this even do? It's not documented at all.

The reason for MatchingId is to identify matching loads and stores for target specific intrinsics. For example, the pointer type on different target specific intrinsic calls may be i8*, and MatchingId is there to make sure that only matching intrinsics get commoned (e.g. test case test_nocse3). This comment could be clearer.

  • The usage of fields in the object appear unnecessary. Simply dispatching by the contained instruction type would be far more clear and less error prone.

    A potentially clearer design would have been to introduce a helper class for each type of operation: load, and store. Like CallSite, each helper class could simply proxy to the underlying instruction/information based on the type of the instruction so contained.

I thought about introducing new classes to deal with both regular loads and target specific intrinsic loads (same for stores and target specific intrinsic stores). This was only needed for EarlyCSE, so I kept it localized.

Looking back, it may have been cleaner to introduce these new classes, as they can be reused elsewhere (GVN?). If everyone agrees, I can come up with a new patch to do away with ParseMemInst.

Thanks,
Sanjin

Hi Philip,

Thanks for the review. Just to briefly add onto Hal's comment.

  • isVolatile is confused with isSimple in problematic ways. These are related but distinct concepts.

Yes, this should be renamed to avoid confusion. There is no behavioral change, as Hal noted.

Okay, please do so.

  • MatchingId - what does this even do? It's not documented at all.

The reason for MatchingId is to identify matching loads and stores for target specific intrinsics. For example, the pointer type on different target specific intrinsic calls may be i8*, and MatchingId is there to make sure that only matching intrinsics get commoned (e.g. test case test_nocse3). This comment could be clearer.

I think the way to state this is that the action of a target-specific memory intrinsic is not uniquely identified by the pointer type. We need to make sure that magic permuting load of type 1 is only matched with magic permuting store of type 1, magic permuting load of type 1 is only matched with magic permuting store of type 2, etc.

  • The usage of fields in the object appear unnecessary. Simply dispatching by the contained instruction type would be far more clear and less error prone.

    A potentially clearer design would have been to introduce a helper class for each type of operation: load, and store. Like CallSite, each helper class could simply proxy to the underlying instruction/information based on the type of the instruction so contained.

I thought about introducing new classes to deal with both regular loads and target specific intrinsic loads (same for stores and target specific intrinsic stores). This was only needed for EarlyCSE, so I kept it localized.

Looking back, it may have been cleaner to introduce these new classes, as they can be reused elsewhere (GVN?). If everyone agrees, I can come up with a new patch to do away with ParseMemInst.

I think this is worth trying.

Thanks,
Sanjin