This is an archive of the discontinued LLVM Phabricator instance.

PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz.
ClosedPublic

Authored by rsmith on May 4 2020, 5:04 PM.

Details

Summary

This transformation is correct for a builtin call to 'free(p)', but not
for 'operator delete(p)'. There is no guarantee that a user replacement
'operator delete' has no effect when called on a null pointer.

However, the principle behind the transformation *is* correct, and can
be applied more broadly: a 'delete p' expression is permitted to
unconditionally call 'operator delete(p)'. So do that in Clang under
-Oz where possible. We do this whether or not 'p' has trivial
destruction, since the destruction might turn out to be trivial after
inlining, and even for a class-specific (but non-virtual,
non-destroying, non-array) 'operator delete'.

Diff Detail

Event Timeline

rsmith created this revision.May 4 2020, 5:04 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 4 2020, 5:04 PM
Herald added a subscriber: hiraditya. · View Herald Transcript

Is it reasonable to figure out ahead of time that we can skip the null check completely? It'd be kindof nice to also take advantage of this at -O0 whenever we don't have real work to do.

I believe we can avoid creating some blocks for latter removing them, no?

clang/lib/CodeGen/CGExprCXX.cpp
2042–2049

Unless I missed something, isn't it better to just avoid emitting this check and basic blocks all together if we are optimizing for size and when we know that Ptr is never null?
I would consider in doing something alike:

const bool emitNullCheck = CGM.getCodeGenOpts().OptimizeSize <= 1;
llvm::BasicBlock *DeleteNotNull;
llvm::BasicBlock *DeleteEnd;
if (emitNullCheck){
  // Null check the pointer.
  DeleteNotNull = createBasicBlock("delete.notnull");
  DeleteEnd = createBasicBlock("delete.end");

  llvm::Value *IsNull = Builder.CreateIsNull(Ptr.getPointer(), "isnull");

  Builder.CreateCondBr(IsNull, DeleteEnd, DeleteNotNull);
  EmitBlock(DeleteNotNull);
}

and we use the same emitNullCheck to avoid EmitBlocks below.

2090

If avoiding the emission above it would not require changing EmitObjectDelete at all.

rsmith marked an inline comment as done.May 10 2020, 2:44 PM

Is it reasonable to figure out ahead of time that we can skip the null check completely? It'd be kindof nice to also take advantage of this at -O0 whenever we don't have real work to do.

In simple cases, we could get a win at -O0 (deleting an object or array with a trivial (and therefore non-virtual) destructor + non-destroying operator delete + no array cookie). Perhaps in those cases we could sidestep the whole process and just generate a direct call to operator delete. That's a different optimization from the one I'm suggesting here (which in turn is trying to be the non-miscompiling subset of the original LLVM-side optimization), though there's lots of overlap. If we don't care about the -Oz cases where the destructor is non-trivial before optimization but can be removed by the optimizer, then we could do that instead of this change. If we decide we care about both situations (removing the branch at -O0 for trivial deletes and removing the branch at -Oz for deletes that optimize into being trivial deletes), then I think we need both something like this *and* that check.

What do you think? I'm somewhat inclined to prefer special-casing deletes that the frontend can see are trivial, rather than making the delete unconditional under -Oz. That means we might see a code size regression in some cases for -Oz, but I suspect that's mostly a theoretical risk.

If we do special-case trivial deletes, should we still insert the branch for them at -O1 and above? In order for that to be profitable, I think the branch would need to be both well-predicted and usually null, which seems like it could happen but is unlikely to be the common case. I suspect the best answer may be to do this regardless of optimization level.

clang/lib/CodeGen/CGExprCXX.cpp
2042–2049

I don't think we can reasonably do this. There are a lot of different ways that delete emission can be performed, and many of them (for example, calling a virtual deleting destructor, destroying operator delete, or array delete with cookie) require the null check to be performed in advance for correctness. It would be brittle to duplicate all of those checks here.

We *could* sink the null checks into the various paths through EmitArrayDelete and EmitObjectDelete, but I think that makes the code significantly more poorly factored.

From my point it does LGTM.

clang/lib/CodeGen/CGExprCXX.cpp
2042–2049

Fair enough.

Is it reasonable to figure out ahead of time that we can skip the null check completely? It'd be kindof nice to also take advantage of this at -O0 whenever we don't have real work to do.

In simple cases, we could get a win at -O0 (deleting an object or array with a trivial (and therefore non-virtual) destructor + non-destroying operator delete + no array cookie). Perhaps in those cases we could sidestep the whole process and just generate a direct call to operator delete. That's a different optimization from the one I'm suggesting here (which in turn is trying to be the non-miscompiling subset of the original LLVM-side optimization), though there's lots of overlap. If we don't care about the -Oz cases where the destructor is non-trivial before optimization but can be removed by the optimizer, then we could do that instead of this change. If we decide we care about both situations (removing the branch at -O0 for trivial deletes and removing the branch at -Oz for deletes that optimize into being trivial deletes), then I think we need both something like this *and* that check.

What do you think? I'm somewhat inclined to prefer special-casing deletes that the frontend can see are trivial, rather than making the delete unconditional under -Oz. That means we might see a code size regression in some cases for -Oz, but I suspect that's mostly a theoretical risk.

If we do special-case trivial deletes, should we still insert the branch for them at -O1 and above? In order for that to be profitable, I think the branch would need to be both well-predicted and usually null, which seems like it could happen but is unlikely to be the common case. I suspect the best answer may be to do this regardless of optimization level.

It doesn't have to be *usually* null, just null often enough that it pays for itself. I think that isn't that unlikely given that a lot of deletes are from abstracted locations like destructors.

On balance, I think I'm okay with your current approach, as long as we document our assumption that skipping the check is not usually a performance win. I'm not too worried about the -O0 cost.

dnsampaio accepted this revision.May 14 2020, 5:01 AM

LGTM, as far @rjmccall 's concern about documentation is addressed.

This revision is now accepted and ready to land.May 14 2020, 5:01 AM

Hi @rsmith,
are you still looking into this?
cheers

rsmith added a comment.Jun 4 2020, 6:04 PM

Hi @rsmith,
are you still looking into this?
cheers

Sorry for the delay, I'll be getting back to this soon.

This revision was automatically updated to reflect the committed changes.

Hi,

I have a question for the delete function call sinking in -Oz.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf.
According to 3.7.4.2/3
The value of the first argument supplied to a deallocation function may be a null pointer value; if so, and if the deallocation function is one supplied in the standard library, the call has no effect. Otherwise, the behavior is undefined

It seems the null checking can be skipped only when the delete is not from the standard library(have been overloaded by the user), isn't it?

Herald added a project: Restricted Project. · View Herald TranscriptFeb 2 2023, 8:41 PM

Hi,

I have a question for the delete function call sinking in -Oz.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf.
According to 3.7.4.2/3
The value of the first argument supplied to a deallocation function may be a null pointer value; if so, and if the deallocation function is one supplied in the standard library, the call has no effect. Otherwise, the behavior is undefined

It seems the null checking can be skipped only when the delete is not from the standard library(have been overloaded by the user), isn't it?

The paragraph you just quoted says that the deallocation functions provided by the standard library are required have no effect on null pointers. That means it's fine to call them with a null pointer even if the source wouldn't normally execute that call. We could just insert these calls into random other code if we wanted to.

As discussed in the original review, removing null checks around calls to operator delete is rarely going to be a good code-speed optimization, but it is a good code-size optimization, and -Oz in Clang is a request to minimize code size even if it makes the code slower.

Hi,

I have a question for the delete function call sinking in -Oz.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf.
According to 3.7.4.2/3
The value of the first argument supplied to a deallocation function may be a null pointer value; if so, and if the deallocation function is one supplied in the standard library, the call has no effect. Otherwise, the behavior is undefined

It seems the null checking can be skipped only when the delete is not from the standard library(have been overloaded by the user), isn't it?

The paragraph you just quoted says that the deallocation functions provided by the standard library are required have no effect on null pointers. That means it's fine to call them with a null pointer even if the source wouldn't normally execute that call. We could just insert these calls into random other code if we wanted to.

Is a null operand of delete in the source code no effect because there will be null pointer checking generated? Or the delete(null) in assembly code also valid?

As discussed in the original review, removing null checks around calls to operator delete is rarely going to be a good code-speed optimization, but it is a good code-size optimization, and -Oz in Clang is a request to minimize code size even if it makes the code slower.

Hi,

I have a question for the delete function call sinking in -Oz.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf.
According to 3.7.4.2/3
The value of the first argument supplied to a deallocation function may be a null pointer value; if so, and if the deallocation function is one supplied in the standard library, the call has no effect. Otherwise, the behavior is undefined

It seems the null checking can be skipped only when the delete is not from the standard library(have been overloaded by the user), isn't it?

The paragraph you just quoted says that the deallocation functions provided by the standard library are required have no effect on null pointers. That means it's fine to call them with a null pointer even if the source wouldn't normally execute that call. We could just insert these calls into random other code if we wanted to.

Is a null operand of delete in the source code no effect because there will be null pointer checking generated? Or the delete(null) in assembly code also valid?

In [expr.delete], the standard specifies that using the delete operator on a null pointer is required to have no effect. It does not specify how that should be implemented. In [basic.std.dynamic.deallocation], the standard also specifies that the library-provided operator delete functions must have no effect when given a null pointer. This requirement applies no matter how the function is called, so yes, a call from assembly that passed a null pointer would also be required to have no effect. More officially, you can simply take the address of operator delete, yielding an rvalue of void (*)(void *) (or similar, depending on which operator delete you use), which can then be called later in a context that doesn't understand it's calling a deallocation function; such a call is still required to have no effect when passing a null pointer, because the requirement is laid on the function, separate from the circumstances of it being called as part of the delete operator.

Note also that calling operator delete unconditionally when implementing the delete operator is explicitly blessed by [expr.delete]p7:

If the value of the operand of the delete-expression is not a null pointer value, then:

...

Otherwise, it is unspecified whether the deallocation function will be called.

Hi,

I have a question for the delete function call sinking in -Oz.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf.
According to 3.7.4.2/3
The value of the first argument supplied to a deallocation function may be a null pointer value; if so, and if the deallocation function is one supplied in the standard library, the call has no effect. Otherwise, the behavior is undefined

It seems the null checking can be skipped only when the delete is not from the standard library(have been overloaded by the user), isn't it?

The paragraph you just quoted says that the deallocation functions provided by the standard library are required have no effect on null pointers. That means it's fine to call them with a null pointer even if the source wouldn't normally execute that call. We could just insert these calls into random other code if we wanted to.

Is a null operand of delete in the source code no effect because there will be null pointer checking generated? Or the delete(null) in assembly code also valid?

In [expr.delete], the standard specifies that using the delete operator on a null pointer is required to have no effect. It does not specify how that should be implemented. In [basic.std.dynamic.deallocation], the standard also specifies that the library-provided operator delete functions must have no effect when given a null pointer. This requirement applies no matter how the function is called, so yes, a call from assembly that passed a null pointer would also be required to have no effect. More officially, you can simply take the address of operator delete, yielding an rvalue of void (*)(void *) (or similar, depending on which operator delete you use), which can then be called later in a context that doesn't understand it's calling a deallocation function; such a call is still required to have no effect when passing a null pointer, because the requirement is laid on the function, separate from the circumstances of it being called as part of the delete operator.

Note also that calling operator delete unconditionally when implementing the delete operator is explicitly blessed by [expr.delete]p7:

If the value of the operand of the delete-expression is not a null pointer value, then:

...

Otherwise, it is unspecified whether the deallocation function will be called.

Hi @rjmccall,

That make sense to me.
Thanks for sharing your insight of C++ standard and taking time to answer the questions.
Thank you ^^