This is an archive of the discontinued LLVM Phabricator instance.

[Instructions]: Calls marked with inaccessiblememonly attribute should be considered to not read/write memory
Needs RevisionPublic

Authored by etiotto on May 12 2021, 11:50 AM.

Details

Summary

The documentations states that functions with the inaccessiblememonly only access memory that is not accessible by the module bering compiled:

inaccessiblememonly
This attribute indicates that the function may only access memory that is not accessible by the module being compiled. This is a weaker form of readnone. If the function reads or writes other memory, the behavior is undefined.

Therefore the Instruction::mayReadFromMemory() and Instruction::mayWriteToMemory() should report that such functions do not read not write memory in the module being compiled.

Diff Detail

Event Timeline

etiotto created this revision.May 12 2021, 11:50 AM
etiotto requested review of this revision.May 12 2021, 11:50 AM

I'm not exactly sure what it means to "access memory that is not accessible by the module being compiled.". My guess is that it's for things like intrinsics that take a global string as argument (eg. the file name), for compile-time mapping, but don't get lowered to any instructions that actually access that memory in the final generated assembly. Is that correct, or are there other examples to consider?

nikic requested changes to this revision.May 12 2021, 1:10 PM
nikic added a subscriber: nikic.

This looks wrong to me. To give one example: Accessing inaccessible memory is the typical way to model side-effects. mayHaveSideEffects() is defined in terms of mayWriteToMemory(). This change makes inaccessiblememonly side-effect free. Oops.

I do think that "only writes accessible memory" is a useful predicate, and many -- but not all -- current callers of mayWriteToMemory() can be switched to it. I would recommend to introduce mayWriteAccessibleMemory() / mayReadAccessibleMemory() and then migrate call-sites after review of the guarantees they actually need.

This revision now requires changes to proceed.May 12 2021, 1:10 PM

I'm not exactly sure what it means to "access memory that is not accessible by the module being compiled.". My guess is that it's for things like intrinsics that take a global string as argument (eg. the file name), for compile-time mapping, but don't get lowered to any instructions that actually access that memory in the final generated assembly. Is that correct, or are there other examples to consider?

I copied this from the LLVM manual definition. From this thread in the LLVM mailing list (https://lists.llvm.org/pipermail/llvm-dev/2019-April/131893.html):

> So what kind of function can be annotated with this attribute,
> functions which malloc heap memory and free it within the same scope?

Short answer: Yes, we probably could but do not do it right now. Longer
answer below. Also consider the use case of (known) function defined in
another translation unit (=module). They could then even work on some
global state, e.g., a global variable, if it never "leaks" out of the
other scope.

This looks wrong to me. To give one example: Accessing inaccessible memory is the typical way to model side-effects. mayHaveSideEffects() is defined in terms of mayWriteToMemory(). This change makes inaccessiblememonly side-effect free. Oops.

I do think that "only writes accessible memory" is a useful predicate, and many -- but not all -- current callers of mayWriteToMemory() can be switched to it. I would recommend to introduce mayWriteAccessibleMemory() / mayReadAccessibleMemory() and then migrate call-sites after review of the guarantees they actually need.

OK. So the semantics the inaccessiblememonly attribute allows the function to read or write memory pointed to by an argument. This PR was motivated by wanting to make llvm.annotation be consider side-effects free. That intrinsic is defined as:

def int_annotation : DefaultAttrsIntrinsic<
    [llvm_anyint_ty],
    [LLVMMatchType<0>, llvm_ptr_ty, llvm_ptr_ty, llvm_i32_ty],
    [IntrInaccessibleMemOnly], "llvm.annotation">;

and documented as:

The first argument is an integer value (result of some expression), the second is a pointer to a global string, the third is a pointer to a global string which is the source file name, and the last argument is the line number. It returns the value of the first argument.

So the intrinsic can read the arguments but it should not be able to modify their values. Nor does it need to modify global memory. So would marking that intrinsic with the readonly attribute be acceptable ?

nikic added a comment.May 12 2021, 2:10 PM

This looks wrong to me. To give one example: Accessing inaccessible memory is the typical way to model side-effects. mayHaveSideEffects() is defined in terms of mayWriteToMemory(). This change makes inaccessiblememonly side-effect free. Oops.

I do think that "only writes accessible memory" is a useful predicate, and many -- but not all -- current callers of mayWriteToMemory() can be switched to it. I would recommend to introduce mayWriteAccessibleMemory() / mayReadAccessibleMemory() and then migrate call-sites after review of the guarantees they actually need.

OK. So the semantics the inaccessiblememonly attribute allows the function to read or write memory pointed to by an argument.

No, it does not allow reading/writing arguments. That would be inaccessibleorargmemonly.

This PR was motivated by wanting to make llvm.annotation be consider side-effects free. That intrinsic is defined as:

def int_annotation : DefaultAttrsIntrinsic<
    [llvm_anyint_ty],
    [LLVMMatchType<0>, llvm_ptr_ty, llvm_ptr_ty, llvm_i32_ty],
    [IntrInaccessibleMemOnly], "llvm.annotation">;

and documented as:

The first argument is an integer value (result of some expression), the second is a pointer to a global string, the third is a pointer to a global string which is the source file name, and the last argument is the line number. It returns the value of the first argument.

So the intrinsic can read the arguments but it should not be able to modify their values. Nor does it need to modify global memory. So would marking that intrinsic with the readonly attribute be acceptable ?

That depends on the precise semantics @llvm.annotation is supposed to have, with which I'm not familiar, and which LangRef does not specify particularly clearly. Is it okay to drop an @llvm.annotation() whose return value is not used? If "no", then marking it readonly is not possible, it will be dropped as dead code. Is it okay to hoist @llvm.annotation() inside an unconditional loop header out of the loop? If "no", then dropping inaccessiblememonly is not possible either (being non-speculatable would prevent hoisting out of a conditional loop header, but not an unconditional one).

That depends on the precise semantics @llvm.annotation is supposed to have, with which I'm not familiar, and which LangRef does not specify particularly clearly. Is it okay to drop an @llvm.annotation() whose return value is not used? If "no", then marking it readonly is not possible, it will be dropped as dead code. Is it okay to hoist @llvm.annotation() inside an unconditional loop header out of the loop? If "no", then dropping inaccessiblememonly is not possible either (being non-speculatable would prevent hoisting out of a conditional loop header, but not an unconditional one).

Indeed. The llvm.annotation documents states:

This intrinsic allows annotations to be put on arbitrary expressions with arbitrary strings. This can be useful for special purpose optimizations that want to look for these annotations. These have no other defined use; they are ignored by code generation and optimization.

The sentence "they are ignored by code generation and optimization" makes me think the intrinsics should not have side-effects because if it had then it could not be ignored by optimizations.

If the motivation is to make only llvm.annotation to be considered side-effect free, why not make onlyReadsMemory/doesNotReadMemory return the expected value?
(semantics are weird. Dpes onlyReadsMemory impy that it does read from memory)?