This is an archive of the discontinued LLVM Phabricator instance.

[WIP] Implement RFC: Decomposing deref(N) into deref(N) + nofree
AbandonedPublic

Authored by reames on Mar 22 2021, 11:59 AM.

Details

Summary

This implements the semantic change to dereferenceability described in the llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree".

At the moment, it shows the (widespread) optimization impact of simply toggling the behavior. My plan is to examine each test change to see if a) we can generalize the transform slightly to not depend on global deref, or b) what appropriate test changes (e.g. adding attributes) make sense without destroying the intend of the test. My plan is to tackle each transform in it's own review, and rebase this one incrementally as we go.

In addition to the updated tests, there are currently 4 failing tests. These simply happen to be difficult to show updates in easily due to limitations of our auto-update tests. They will be included before final review of this patch.

LLVM :: Analysis/BasicAA/dereferenceable.ll
LLVM :: Analysis/ValueTracking/memory-dereferenceable.ll
LLVM :: Transforms/VectorCombine/X86/load-inseltpoison.ll
LLVM :: Transforms/VectorCombine/X86/load.ll

Diff Detail

Event Timeline

reames created this revision.Mar 22 2021, 11:59 AM
reames requested review of this revision.Mar 22 2021, 11:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2021, 11:59 AM
reames updated this revision to Diff 358754.Jul 14 2021, 2:29 PM

Rebase over previous changes and now stablized tests.

This is nearing the point of being ready for real review. When that happens, I'm going to open a new review with a much revised description. We've ended up moving in a direction which doesn't align well with the original framing on the review.

The last major piece before I put this up for review is some performance validation to make sure the impact of this isn't "too bad". Some of that can happen after the review is posted, but I want to at least have a sanity check first.

nikic added a subscriber: nikic.Jul 14 2021, 2:46 PM

From a quick look at canBeFreed(), it seems like the case of a nofree argument isn't handled yet.

From a quick look at canBeFreed(), it seems like the case of a nofree argument isn't handled yet.

That's because there was disagreement as to what the semantics of such an argument were. I got frustrated in trying to drive that towards any useful conclusion, and don't plan to return to the topic.

On the general topic, https://reviews.llvm.org/D101701 is still open, but that doesn't seem to directly involve the parameter attribute. I can't find what I'm thinking of, so maybe I just misremembered that particular sub-piece being controversial?

nikic added a comment.Jul 14 2021, 3:18 PM

From a quick look at canBeFreed(), it seems like the case of a nofree argument isn't handled yet.

That's because there was disagreement as to what the semantics of such an argument were. I got frustrated in trying to drive that towards any useful conclusion, and don't plan to return to the topic.

On the general topic, https://reviews.llvm.org/D101701 is still open, but that doesn't seem to directly involve the parameter attribute. I can't find what I'm thinking of, so maybe I just misremembered that particular sub-piece being controversial?

I think the controversy was about the function attribute only. It's pretty important to me that using dereferenceable plus nofree on an argument works (and does so without any additional nosync requirements), because that means you can get the current dereferenceable semantics back simply by emitting dereferenceable nofree in your frontend wherever you used plain dereferenceable before. That gives a clear migration path for which no regressions should be expected (is that right?)

From a quick look at canBeFreed(), it seems like the case of a nofree argument isn't handled yet.

That's because there was disagreement as to what the semantics of such an argument were. I got frustrated in trying to drive that towards any useful conclusion, and don't plan to return to the topic.

On the general topic, https://reviews.llvm.org/D101701 is still open, but that doesn't seem to directly involve the parameter attribute. I can't find what I'm thinking of, so maybe I just misremembered that particular sub-piece being controversial?

I think the controversy was about the function attribute only. It's pretty important to me that using dereferenceable plus nofree on an argument works (and does so without any additional nosync requirements), because that means you can get the current dereferenceable semantics back simply by emitting dereferenceable nofree in your frontend wherever you used plain dereferenceable before. That gives a clear migration path for which no regressions should be expected (is that right?)

To be clear, there is no current migration plan for which no regressions are expected. I tried to come up with one, it failed miserably. I'm no longer working towards that goal.

Your point about wanting nofree on a parameter to not require nosync runs exactly into the discussion I linked.

If you want to drive the nofree parameter case, feel free. If you want to collect some numbers and tell me whether this a real or hypothetical regression, that would be super useful.

Lest I seem too dismissive, let me expand on something I was planning to include in the description of the real patch after we had some preliminary numbers. We will infer nofree+nosync on the function if possible. In practice, this covers a lot of the concerning cases I saw. The biggest hesitation I have is that inlining small functions effectively destroys our ability to infer nofree/nosync regions. We don't really have an answer for context nofree reasoning after heavy inlining. If we're going to have a major regression, I'm expecting it to come from that interaction.

@nikic I remember the situation with the parameter attribute. The issue is that the attribute describes actions taken through that particular copy of the pointer, not all copies of the pointer. This makes it easier to infer, but makes it very challenging to use for optimization. As an example, consider the following:
void foo(bool c, char * deref(1) nofree a, char * b) {

loop {
  if (c) { free(b); break; }
  v = *a;
}

}
foo(true, p, p);

In this example, we'd want to hoist the load from a, but since b can point to the same object, we can't.

This was the motivation for the use of noalias in the original proposal. Review discussion made it quickly clear that there was no consensus as to what noalias actually meant, and I dropped that approach. There might be room to drive that forward, but it'll intersect with all the aliasing work involving the same attribute in complicated ways.

Extending this pointer property to an object property was the notion behind the "nofreeobj" idea. I still think that would work if pursued.