This is an archive of the discontinued LLVM Phabricator instance.

Create instruction classes for identifying any atomicity of memory intrinsic. (NFC)
ClosedPublic

Authored by dneilson on Sep 29 2017, 12:24 PM.

Details

Summary

For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html

This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:

IntrinsicInst

-> MemIntrinsicBase (methods-only class)
  -> MemIntrinsic (non-atomic intrinsics)
    -> MemSetInst
    -> MemTransferInst
      -> MemCpyInst
      -> MemMoveInst
  -> AtomicMemIntrinsic (atomic intrinsics)
    -> AtomicMemSetInst
    -> AtomicMemTransferInst
      -> AtomicMemCpyInst
      -> AtomicMemMoveInst
  -> AnyMemIntrinsic (both atomicities)
    -> AnyMemSetInst
    -> AnyMemTransferInst
      -> AnyMemCpyInst
      -> AnyMemMoveInst

This involves some class renaming:

ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst

A script for doing this renaming in downstream trees is included below.

An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).


Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"

REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"

FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )

SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"

for f in $FILES; do

echo "Processing: $f"
sed  -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f

done

Diff Detail

Repository
rL LLVM

Event Timeline

dneilson created this revision.Sep 29 2017, 12:24 PM
dneilson abandoned this revision.Sep 29 2017, 12:29 PM

Pulling this, for now. I'm going to make sure that the joint part of the hierarchy (i.e. the AnyMem* classes) will work before trying to land this part.

Pulling this, for now. I'm going to make sure that the joint part of the hierarchy (i.e. the AnyMem* classes) will work before trying to land this part.

FWIW, I don't really like the PainMem* names, although the AtomicMem* looks like an improvement. Can't you just leave the plain ones alone?

dneilson added a comment.EditedSep 29 2017, 1:02 PM

Pulling this, for now. I'm going to make sure that the joint part of the hierarchy (i.e. the AnyMem* classes) will work before trying to land this part.

FWIW, I don't really like the PainMem* names, although the AtomicMem* looks like an improvement. Can't you just leave the plain ones alone?

I have no particular objection to leaving the plain ones as is. It's what I suggested: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116716.html

A counter suggestion, that wasn't challenged, was made to change the plain ones as well: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116722.html

Pulling this, for now. I'm going to make sure that the joint part of the hierarchy (i.e. the AnyMem* classes) will work before trying to land this part.

FWIW, I don't really like the PainMem* names, although the AtomicMem* looks like an improvement. Can't you just leave the plain ones alone?

I have no particular objection to leaving the plain ones as is. It's what I suggested: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116716.html

A counter suggestion, that wasn't challenged, was made to change the plain ones as well: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116722.html

Fair enough. I challenge :-) -- Sanjoy's comment about silent failures downstream is legitimate, but given how new the atomic versions of these are, I don't think that I'm concerned (and it will fail in the right way: by not recognizing the atomic forms, which is what any current code is probably intended to do (because it was written before the atomic forms existed)).

Pulling this, for now. I'm going to make sure that the joint part of the hierarchy (i.e. the AnyMem* classes) will work before trying to land this part.

FWIW, I don't really like the PainMem* names, although the AtomicMem* looks like an improvement. Can't you just leave the plain ones alone?

I have no particular objection to leaving the plain ones as is. It's what I suggested: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116716.html

A counter suggestion, that wasn't challenged, was made to change the plain ones as well: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116722.html

Fair enough. I challenge :-) -- Sanjoy's comment about silent failures downstream is legitimate, but given how new the atomic versions of these are, I don't think that I'm concerned (and it will fail in the right way: by not recognizing the atomic forms, which is what any current code is probably intended to do (because it was written before the atomic forms existed)).

Playing a little devil's advocate... A benefit I can see, off the top of my head, in favour of changing to PlainMem* is cognitive. Having all of PlainMem* & AtomicMem* & AnyMem* makes the programmer have to think, briefly, about whether they're dealing with atomic or non-atomic variants. Leaving the non-atomic variants as is makes it easier to not even consider that the atomic variants exist.

Pulling this, for now. I'm going to make sure that the joint part of the hierarchy (i.e. the AnyMem* classes) will work before trying to land this part.

FWIW, I don't really like the PainMem* names, although the AtomicMem* looks like an improvement. Can't you just leave the plain ones alone?

I have no particular objection to leaving the plain ones as is. It's what I suggested: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116716.html

A counter suggestion, that wasn't challenged, was made to change the plain ones as well: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116722.html

Fair enough. I challenge :-) -- Sanjoy's comment about silent failures downstream is legitimate, but given how new the atomic versions of these are, I don't think that I'm concerned (and it will fail in the right way: by not recognizing the atomic forms, which is what any current code is probably intended to do (because it was written before the atomic forms existed)).

Playing a little devil's advocate... A benefit I can see, off the top of my head, in favour of changing to PlainMem* is cognitive. Having all of PlainMem* & AtomicMem* & AnyMem* makes the programmer have to think, briefly, about whether they're dealing with atomic or non-atomic variants. Leaving the non-atomic variants as is makes it easier to not even consider that the atomic variants exist.

I know. Of course, trying to force people to think in this fashion doesn't always have the desired effect (doing the text substitution and moving on is more likely). It's also true that most people will never see these and we're not really giving people clear guidance on how to update their code. I think we can give relatively-clear guidance, something like, "If your code is looking at Mem* to reason about its effect on memory, then it can probably use AnyMem*, if you're transforming the Mem*, then you probably need to switch to using PlainMem* (or leave it alone, as the case may be). Frankly, I think we'll just need to be diligent in code review for a while regardless of what we do.

I'm also, in general, just not a fan of "Plain" as a prefix. Our memcpy intrinsics (and friends) are already extensions of the underlying C functions (we can specify alignments, volatileness, etc.). If we really want to rename in this way, I think that using NonAtomic as the prefix is better.

dneilson updated this revision to Diff 117548.Oct 3 2017, 11:10 AM

Flush out the hierarchy, and do not rename the classes for the non-atomic memory intrinsics.

dneilson retitled this revision from Rename instruction classes for memory intrinsics. (NFC) to Create instruction classes for identifying any atomicity of memory intrinsic..Oct 3 2017, 11:11 AM
dneilson edited the summary of this revision. (Show Details)

ping - Could someone please take a look at this? Thank you.

ping - Review still requested/needed. It may look daunting, but the change is pretty simple.

dneilson updated this revision to Diff 120302.Oct 25 2017, 1:26 PM
dneilson edited the summary of this revision. (Show Details)
  • Rebasing.
  • Removing change to AliasSetTracker; this makes this change purely NFC.
dneilson retitled this revision from Create instruction classes for identifying any atomicity of memory intrinsic. to Create instruction classes for identifying any atomicity of memory intrinsic. (NFC).Oct 25 2017, 1:27 PM
dneilson edited the summary of this revision. (Show Details)
sanjoy edited edge metadata.Oct 25 2017, 2:23 PM

Hi Daniel,

I'll take a look at this today; but you might want to reply to the llvm-dev RFC thread with a link to this patch to get some more eyeballs on it.

Just to be clear, at the isa<>, dyn_cast<> level, this is the hierarchy you're shooting for right:

digraph G {
	AnyMemI -> MemI
	AnyMemI -> AtomicMemI
	AnyMemI -> AnyMemSetI
	AnyMemI -> AnyMemTransferI

	AnyMemSetI -> MemSetI
	AnyMemSetI -> AtomicMemSetI

	AnyMemTransferI -> AtomicMemTransferI
	AnyMemTransferI -> MemTransferI
	AnyMemTransferI -> AnyMemCpyI
	AnyMemTransferI -> AnyMemMoveI

	AtomicMemTransferI -> AtomicMemcpyI
	AtomicMemTransferI -> AtomicMemMoveI

	MemTransferI -> MemcpyI
	MemTransferI -> MemMoveI

	AnyMemCpyI -> MemcpyI
	AnyMemCpyI -> AtomicMemcpyI

	AnyMemMoveI -> MemMoveI
	AnyMemMoveI -> AtomicMemMoveI

	MemI -> MemTransferI
	MemI -> MemMoveI

	AtomicMemI -> AtomicMemTransferI
	AtomicMemI -> AtomicMemMoveI
}
include/llvm/IR/IntrinsicInst.h
218 ↗(On Diff #120302)

This will let users up-cast AtomicMemInstrinsic and MemIntrinsic objects to MemIntrinsicBase objects -- is that okay? If it isn't, we can use CRTP to avoid giving AtomicMemInstrinsic and MemIntrinsic the same base class, as in:

template<typename Derived>
class MemIntrinsicBase : public IntrinsicInst { ... /* No real use of Derived anywhere */ };

class AtomicMemIntrinsic : public MemIntrinsicBase<AtomicMemIntrinsic> { ... };
class MemIntrinsic : public MemIntrinsicBase<MemIntrinsic> { ... };

// Now MemIntrinsic and AtomicMemIntrinsic do not have a common base class (at the C++ level).
236 ↗(On Diff #120302)

Maybe explicitly note that this also strips addrspacecast?

320 ↗(On Diff #120302)

The inline attribute is not necessary here I think -- member defined within the class are automatically inline.

401 ↗(On Diff #120302)

Should we implement this for the atomic intrinsics as well?

523 ↗(On Diff #120302)

Isn't the the common base for atomic *and non-atomic* mem(set|move|cpy)?

531 ↗(On Diff #120302)

Did you consider not providing the isVolatile() accessor on AtomicMemIntrinsic? Design-wise I find that (not having the acccessor) slightly more preferable, but if that would require a lot of extra code then it isn't worth it.

630 ↗(On Diff #120302)

These comments need to be updated.

dneilson marked 3 inline comments as done.Oct 26 2017, 9:36 AM

Just to be clear, at the isa<>, dyn_cast<> level, this is the hierarchy you're shooting for right:

digraph G {
	AnyMemI -> MemI
	AnyMemI -> AtomicMemI
	AnyMemI -> AnyMemSetI
	AnyMemI -> AnyMemTransferI

	AnyMemSetI -> MemSetI
	AnyMemSetI -> AtomicMemSetI

	AnyMemTransferI -> AtomicMemTransferI
	AnyMemTransferI -> MemTransferI
	AnyMemTransferI -> AnyMemCpyI
	AnyMemTransferI -> AnyMemMoveI

	AtomicMemTransferI -> AtomicMemcpyI
	AtomicMemTransferI -> AtomicMemMoveI

	MemTransferI -> MemcpyI
	MemTransferI -> MemMoveI

	AnyMemCpyI -> MemcpyI
	AnyMemCpyI -> AtomicMemcpyI

	AnyMemMoveI -> MemMoveI
	AnyMemMoveI -> AtomicMemMoveI

	MemI -> MemTransferI
	MemI -> MemMoveI

	AtomicMemI -> AtomicMemTransferI
	AtomicMemI -> AtomicMemMoveI
}

The current isa<> hierarchy is this, minus the transitive edges:

digraph Existing {
  MemCpy -> MemTransfer
  MemMove -> MemTransfer
  MemTransfer -> MemI
  MemSet -> MemI
}

So, all of the following are true: isa<MemTransfer>(MemCpy), isa<MemTransfer>(MemMove), isa<MemI>(MemTransfer), and isa<MemI>(MemSet). Plus, the transitive relations: isa<MemI>(MemCpy), isa<MemI>(MemMove)

The new hierarchy is this exact same thing within each of of the atomicity-types (non-atomic, Atomic, and Any). In addition we have as true: isa<AnyX>(X) , isa<AnyX>(AtomicX), and all of the transitive edges like isa<AnyMemTransfer>(MemCpy) and isa<AnyMemI>(AtomicMemMove).

Stating the obvious... this is not a bidirectional relation. For example, it is not the case that isa<X>(AnyX) or isa<AtomicX>(AnyX).

include/llvm/IR/IntrinsicInst.h
218 ↗(On Diff #120302)

I don't know that it actually matters whether they have a common C++ base class or not; I can't really see the base class ever being used outside of these class definitions.

CRTP seems to be more the norm in the LLVM codebase, so I could change this to use CRTP if only to be more consistent with other parts of LLVM.

I've no strong preference; I could go either way.

236 ↗(On Diff #120302)

No harm in that.

401 ↗(On Diff #120302)

The alignment stuff is trickier for the atomic memory intrinsics. Alignment is a property of the source & dest arguments, rather than an argument of its own. For the atomic intrinsics we'd need to implement getSourceAlignmentCst() and getDestAlignmentCst().

So, I was erring on the side of leaving it unimplemented for now and adding it in once we see whether it's useful and how it would be used.

Also, I'm hoping to get to revisiting the old patch that removes the alignment parameter from the non-atomic intrinsics and makes it an argument property. Once that's been done, then we could unify the alignment interfaces between atomic & non-atomic intrinsics.

531 ↗(On Diff #120302)

Definitely considered it. I went for symmetry of implementation for no other reason than aesthetics.

Design-wise, there's definitely benefit to not having isVolatile() in the atomic hierarchy. Minimally, it's less confusing to implementers if it's not there -- since its always false for the atomic intrinsics.

If someone wants to add a volatile version of the atomic memory intrinsics down the line, for some reason, then they could always add isVolatile as part of that work.

dneilson updated this revision to Diff 120438.Oct 26 2017, 9:38 AM
dneilson marked an inline comment as done.
  • Rebase.
  • Fixing cut & paste errors in comments.
  • Remove isVolatile() method from the atomic memory intrinsic hierarchy.
  • Change MemInstrinsicBase to use CRTP.
  • Remove inline keyword from classof() method declarations.
sanjoy accepted this revision.Oct 26 2017, 11:16 AM

lgtm

I'd suggest waiting 1-2 days to give time for someone to chime in (after reading the llvm-dev ping you sent out) and then checking this in.

include/llvm/IR/IntrinsicInst.h
217 ↗(On Diff #120438)

Please also add a comment on why we have CRTP here.

218 ↗(On Diff #120302)

I don't know that it actually matters whether they have a common C++ base class or not; I can't really see the base class ever being used outside of these class definitions.

Another way of stating what I meant to say is that the fact that there is a MemIntrinsicBase class that is the base of all these classes is an implementation detail (I assumed this, maybe you have a different picture in mind); and for this reason we should help users not rely on this property by not actually providing a common base class.

401 ↗(On Diff #120302)

SGTM

531 ↗(On Diff #120302)

Minor nit: you could just use one dyn_cast here:

if (auto *MI = dyn_cast<MemIntrinsic>(...))
  return MI->isVolatile();
lib/IR/Verifier.cpp
4034 ↗(On Diff #120438)

Mind removing this extra semi-colon as well?

This revision is now accepted and ready to land.Oct 26 2017, 11:16 AM
dneilson updated this revision to Diff 120462.Oct 26 2017, 11:42 AM
  • Address minor nits.
  • Clang-formatting change.
dneilson updated this revision to Diff 120464.Oct 26 2017, 11:46 AM
  • Missed a comment regarding why we use CRTP for the MemIntrinsicBase class.
This revision was automatically updated to reflect the committed changes.