This is an archive of the discontinued LLVM Phabricator instance.


Differential D102348

[Instructions]: Calls marked with inaccessiblememonly attribute should be considered to not read/write memory
Needs Revision · Public

Authored by etiotto on May 12 2021, 11:50 AM.

Details

Reviewers

Meinersbur
bmahjour
Whitney
nikic

Summary

The documentation states that functions with the inaccessiblememonly attribute only access memory that is not accessible by the module being compiled:

inaccessiblememonly
This attribute indicates that the function may only access memory that is not accessible by the module being compiled. This is a weaker form of readnone. If the function reads or writes other memory, the behavior is undefined.

Therefore Instruction::mayReadFromMemory() and Instruction::mayWriteToMemory() should report that such functions neither read nor write memory in the module being compiled.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

etiotto created this revision. May 12 2021, 11:50 AM

Herald added subscribers: dexonsmith, hiraditya. May 12 2021, 11:50 AM

etiotto requested review of this revision. May 12 2021, 11:50 AM

Herald added a subscriber: llvm-commits. May 12 2021, 11:50 AM

I'm not exactly sure what it means to "access memory that is not accessible by the module being compiled". My guess is that it's for things like intrinsics that take a global string as an argument (e.g. the file name), for compile-time mapping, but don't get lowered to any instructions that actually access that memory in the final generated assembly. Is that correct, or are there other examples to consider?

This looks wrong to me. To give one example: Accessing inaccessible memory is the typical way to model side-effects. mayHaveSideEffects() is defined in terms of mayWriteToMemory(). This change makes inaccessiblememonly side-effect free. Oops.

I do think that "only writes accessible memory" is a useful predicate, and many -- but not all -- current callers of mayWriteToMemory() can be switched to it. I would recommend to introduce mayWriteAccessibleMemory() / mayReadAccessibleMemory() and then migrate call-sites after review of the guarantees they actually need.
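A minimal sketch of the concern and the suggested split, using a toy model rather than LLVM's classes. The struct `ToyCall`, its fields, and the `...Patched` helpers are hypothetical names introduced for illustration. The `MayThrow` and `WillReturn` terms in `mayHaveSideEffects()` are an assumption about how that query is composed; the thread only states that it is defined in terms of `mayWriteToMemory()`.

```cpp
#include <cassert>

// Toy stand-in for a call site's attributes; this is not LLVM's CallBase.
struct ToyCall {
  bool ReadNone = false;            // models `readnone`
  bool ReadOnly = false;            // models `readonly`
  bool InaccessibleMemOnly = false; // models `inaccessiblememonly`
  bool MayThrow = false;            // assumed extra input to side effects
  bool WillReturn = true;           // assumed extra input to side effects

  bool onlyReadsMemory() const { return ReadNone || ReadOnly; }

  // Pre-patch behaviour: writes to inaccessible memory still count as writes.
  bool mayWriteToMemory() const { return !onlyReadsMemory(); }

  // Behaviour proposed by the patch under review.
  bool mayWriteToMemoryPatched() const {
    return !onlyReadsMemory() && !InaccessibleMemOnly;
  }

  // Hypothetical narrower predicate in the spirit of the review suggestion.
  bool mayWriteAccessibleMemory() const {
    return mayWriteToMemory() && !InaccessibleMemOnly;
  }

  bool mayHaveSideEffects() const {
    return mayWriteToMemory() || MayThrow || !WillReturn;
  }
  bool mayHaveSideEffectsPatched() const {
    return mayWriteToMemoryPatched() || MayThrow || !WillReturn;
  }
};

// A call whose only effect is modelled through inaccessible memory.
inline ToyCall makeInaccessibleOnlyCall() {
  ToyCall C;
  C.InaccessibleMemOnly = true;
  return C;
}

// Usage: the patched query silently turns the call side-effect free,
// while a separate predicate leaves mayHaveSideEffects() untouched.
inline void checkSideEffectModel() {
  ToyCall C = makeInaccessibleOnlyCall();
  assert(C.mayHaveSideEffects());
  assert(!C.mayHaveSideEffectsPatched());
  assert(!C.mayWriteAccessibleMemory());
}
```

In this model, callers that only care about memory visible to the module can ask the narrower question, and callers that rely on side-effect modelling keep the existing answer.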

This revision now requires changes to proceed. May 12 2021, 1:10 PM

In D102348#2754951, @bmahjour wrote:

I'm not exactly sure what it means to "access memory that is not accessible by the module being compiled". My guess is that it's for things like intrinsics that take a global string as an argument (e.g. the file name), for compile-time mapping, but don't get lowered to any instructions that actually access that memory in the final generated assembly. Is that correct, or are there other examples to consider?

I copied this from the LLVM manual definition. From this thread in the LLVM mailing list (https://lists.llvm.org/pipermail/llvm-dev/2019-April/131893.html):

> So what kind of function can be annotated with this attribute,
> functions which malloc heap memory and free it within the same scope?

Short answer: Yes, we probably could but do not do it right now. Longer
answer below. Also consider the use case of (known) function defined in
another translation unit (=module). They could then even work on some
global state, e.g., a global variable, if it never "leaks" out of the
other scope.

In D102348#2755223, @nikic wrote:

This looks wrong to me. To give one example: Accessing inaccessible memory is the typical way to model side-effects. mayHaveSideEffects() is defined in terms of mayWriteToMemory(). This change makes inaccessiblememonly side-effect free. Oops.

I do think that "only writes accessible memory" is a useful predicate, and many -- but not all -- current callers of mayWriteToMemory() can be switched to it. I would recommend to introduce mayWriteAccessibleMemory() / mayReadAccessibleMemory() and then migrate call-sites after review of the guarantees they actually need.

OK. So the semantics of the inaccessiblememonly attribute allow the function to read or write memory pointed to by an argument. This PR was motivated by wanting llvm.annotation to be considered side-effect free. That intrinsic is defined as:

def int_annotation : DefaultAttrsIntrinsic<
    [llvm_anyint_ty],
    [LLVMMatchType<0>, llvm_ptr_ty, llvm_ptr_ty, llvm_i32_ty],
    [IntrInaccessibleMemOnly], "llvm.annotation">;

and documented as:

The first argument is an integer value (result of some expression), the second is a pointer to a global string, the third is a pointer to a global string which is the source file name, and the last argument is the line number. It returns the value of the first argument.

So the intrinsic can read the arguments but it should not be able to modify their values. Nor does it need to modify global memory. So would marking that intrinsic with the readonly attribute be acceptable?

In D102348#2755385, @etiotto wrote:

In D102348#2755223, @nikic wrote:

This looks wrong to me. To give one example: Accessing inaccessible memory is the typical way to model side-effects. mayHaveSideEffects() is defined in terms of mayWriteToMemory(). This change makes inaccessiblememonly side-effect free. Oops.

I do think that "only writes accessible memory" is a useful predicate, and many -- but not all -- current callers of mayWriteToMemory() can be switched to it. I would recommend to introduce mayWriteAccessibleMemory() / mayReadAccessibleMemory() and then migrate call-sites after review of the guarantees they actually need.

OK. So the semantics of the inaccessiblememonly attribute allow the function to read or write memory pointed to by an argument.

No, it does not allow reading/writing arguments. That would be inaccessiblemem_or_argmemonly.
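The distinction can be sketched as sets of permitted memory locations. This is a toy model, assuming the three attributes restrict which kinds of memory a function may touch; the enum, constants, and `isPermitted` helper are hypothetical names, not LLVM API. LangRef spells the combined attribute `inaccessiblemem_or_argmemonly`.

```cpp
#include <cassert>

// Toy classification of the memory a call might touch (illustrative only).
enum Location : unsigned {
  ArgMem = 1u << 0,          // memory reachable through pointer arguments
  InaccessibleMem = 1u << 1, // memory the module being compiled cannot address
  OtherMem = 1u << 2         // everything else, e.g. globals the module can see
};

// Locations each attribute permits a function to access in this model.
constexpr unsigned ArgMemOnly = ArgMem;
constexpr unsigned InaccessibleMemOnly = InaccessibleMem;
constexpr unsigned InaccessibleOrArgMemOnly = InaccessibleMem | ArgMem;

// True if every location in Touched is permitted by Allowed.
constexpr bool isPermitted(unsigned Allowed, unsigned Touched) {
  return (Touched & ~Allowed) == 0;
}

// Usage: reading a string through a pointer argument is outside what
// inaccessiblememonly permits, but inside the combined attribute.
inline void checkLocationModel() {
  assert(!isPermitted(InaccessibleMemOnly, ArgMem));
  assert(isPermitted(InaccessibleOrArgMemOnly, ArgMem));
  assert(!isPermitted(InaccessibleOrArgMemOnly, OtherMem));
}
```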

This PR was motivated by wanting llvm.annotation to be considered side-effect free. That intrinsic is defined as:

def int_annotation : DefaultAttrsIntrinsic<
    [llvm_anyint_ty],
    [LLVMMatchType<0>, llvm_ptr_ty, llvm_ptr_ty, llvm_i32_ty],
    [IntrInaccessibleMemOnly], "llvm.annotation">;

and documented as:

The first argument is an integer value (result of some expression), the second is a pointer to a global string, the third is a pointer to a global string which is the source file name, and the last argument is the line number. It returns the value of the first argument.

So the intrinsic can read the arguments but it should not be able to modify their values. Nor does it need to modify global memory. So would marking that intrinsic with the readonly attribute be acceptable?

That depends on the precise semantics @llvm.annotation is supposed to have, with which I'm not familiar, and which LangRef does not specify particularly clearly. Is it okay to drop an @llvm.annotation() whose return value is not used? If "no", then marking it readonly is not possible, it will be dropped as dead code. Is it okay to hoist @llvm.annotation() inside an unconditional loop header out of the loop? If "no", then dropping inaccessiblememonly is not possible either (being non-speculatable would prevent hoisting out of a conditional loop header, but not an unconditional one).
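The two questions above can be sketched with a deliberately simplified model of dead-call removal and loop hoisting. Everything here is an assumption for illustration: `CallSummary`, `isTriviallyDead`, `canHoistOutOfLoop`, and `unusedCall` are hypothetical names, throwing and non-returning calls are ignored, and "dropping inaccessiblememonly" is modelled as a call that touches no memory at all.

```cpp
#include <cassert>

// Toy summary of a call site; field names are illustrative, not LLVM API.
struct CallSummary {
  bool ResultUsed = false;
  bool MayReadMemory = false;
  bool MayWriteMemory = false; // counts writes to inaccessible memory too
  bool Speculatable = false;
};

// Simplified dead-code rule: an unused call that cannot write memory
// may be erased.
inline bool isTriviallyDead(const CallSummary &C) {
  return !C.ResultUsed && !C.MayWriteMemory;
}

// Simplified hoisting rule for a call whose operands are loop-invariant:
// it must not touch memory, and must be either speculatable or guaranteed
// to execute whenever the loop is entered (an unconditional loop header).
inline bool canHoistOutOfLoop(const CallSummary &C, bool GuaranteedToExecute) {
  bool TouchesMemory = C.MayReadMemory || C.MayWriteMemory;
  return !TouchesMemory && (C.Speculatable || GuaranteedToExecute);
}

// Builds a non-speculatable call whose result is unused.
inline CallSummary unusedCall(bool Reads, bool Writes) {
  CallSummary C;
  C.MayReadMemory = Reads;
  C.MayWriteMemory = Writes;
  return C;
}

inline void checkAnnotationModel() {
  // inaccessiblememonly: treated as reading and writing, so it is kept.
  assert(!isTriviallyDead(unusedCall(true, true)));
  // readonly: an unused call is dropped as dead code.
  assert(isTriviallyDead(unusedCall(true, false)));
  // No memory access: hoistable from an unconditional header only.
  assert(canHoistOutOfLoop(unusedCall(false, false), true));
  assert(!canHoistOutOfLoop(unusedCall(false, false), false));
}
```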

Harbormaster completed remote builds in B104099: Diff 344891. May 12 2021, 6:11 PM

That depends on the precise semantics @llvm.annotation is supposed to have, with which I'm not familiar, and which LangRef does not specify particularly clearly. Is it okay to drop an @llvm.annotation() whose return value is not used? If "no", then marking it readonly is not possible, it will be dropped as dead code. Is it okay to hoist @llvm.annotation() inside an unconditional loop header out of the loop? If "no", then dropping inaccessiblememonly is not possible either (being non-speculatable would prevent hoisting out of a conditional loop header, but not an unconditional one).

Indeed. The llvm.annotation documentation states:

This intrinsic allows annotations to be put on arbitrary expressions with arbitrary strings. This can be useful for special purpose optimizations that want to look for these annotations. These have no other defined use; they are ignored by code generation and optimization.

The sentence "they are ignored by code generation and optimization" makes me think the intrinsic should not have side effects: if it did, optimizations could not ignore it.

If the motivation is to make only llvm.annotation be considered side-effect free, why not make onlyReadsMemory/doesNotReadMemory return the expected value?
(The semantics are weird. Does onlyReadsMemory imply that it does read from memory?)
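A minimal sketch of how these queries relate, assuming onlyReadsMemory() holds for both readnone and readonly calls and doesNotReadMemory() holds for both readnone and writeonly calls. The `MemAttr` enum and the free functions are hypothetical stand-ins, not LLVM's API. Under that assumption, onlyReadsMemory() means "never writes" and does not imply that a read actually happens.

```cpp
#include <cassert>

// Toy encoding of the memory attribute on a call site (illustrative only).
enum class MemAttr { Unknown, ReadNone, ReadOnly, WriteOnly };

inline bool doesNotAccessMemory(MemAttr A) { return A == MemAttr::ReadNone; }

// "Only reads" means "never writes": true for readnone as well as readonly.
inline bool onlyReadsMemory(MemAttr A) {
  return doesNotAccessMemory(A) || A == MemAttr::ReadOnly;
}

// "Does not read": true for readnone as well as writeonly.
inline bool doesNotReadMemory(MemAttr A) {
  return doesNotAccessMemory(A) || A == MemAttr::WriteOnly;
}

// The pre-patch call-site logic, as shown on the left-hand side of the diff.
inline bool mayReadFromMemory(MemAttr A) { return !doesNotReadMemory(A); }
inline bool mayWriteToMemory(MemAttr A) { return !onlyReadsMemory(A); }

inline void checkQueryModel() {
  // A readnone call "only reads memory" even though it reads nothing.
  assert(onlyReadsMemory(MemAttr::ReadNone));
  assert(!mayReadFromMemory(MemAttr::ReadNone));
  // A call with no memory attribute may both read and write.
  assert(mayReadFromMemory(MemAttr::Unknown));
  assert(mayWriteToMemory(MemAttr::Unknown));
}
```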

Revision Contents

Path: llvm/lib/IR/Instruction.cpp
Size: 12 lines

Diff 344891

llvm/lib/IR/Instruction.cpp

@@ ... @@ bool Instruction::mayReadFromMemory() const {
   case Instruction::Fence: // FIXME: refine definition of mayReadFromMemory
   case Instruction::AtomicCmpXchg:
   case Instruction::AtomicRMW:
   case Instruction::CatchPad:
   case Instruction::CatchRet:
     return true;
   case Instruction::Call:
   case Instruction::Invoke:
-  case Instruction::CallBr:
-    return !cast<CallBase>(this)->doesNotReadMemory();
+  case Instruction::CallBr: {
+    const CallBase &CB = *cast<CallBase>(this);
+    return !CB.doesNotReadMemory() && !CB.onlyAccessesInaccessibleMemory();
+  }
   case Instruction::Store:
     return !cast<StoreInst>(this)->isUnordered();
   }
 }
 
 bool Instruction::mayWriteToMemory() const {
   switch (getOpcode()) {
   default: return false;
   case Instruction::Fence: // FIXME: refine definition of mayWriteToMemory
   case Instruction::Store:
   case Instruction::VAArg:
   case Instruction::AtomicCmpXchg:
   case Instruction::AtomicRMW:
   case Instruction::CatchPad:
   case Instruction::CatchRet:
     return true;
   case Instruction::Call:
   case Instruction::Invoke:
-  case Instruction::CallBr:
-    return !cast<CallBase>(this)->onlyReadsMemory();
+  case Instruction::CallBr: {
+    const CallBase &CB = *cast<CallBase>(this);
+    return !CB.onlyReadsMemory() && !CB.onlyAccessesInaccessibleMemory();
+  }
   case Instruction::Load:
     return !cast<LoadInst>(this)->isUnordered();
   }
 }
 
 bool Instruction::isAtomic() const {
   switch (getOpcode()) {
   default: