This is an archive of the discontinued LLVM Phabricator instance.

[ArgPromotion] Copy !range metadata for loads.
AbandonedPublic

Authored by wecing on Dec 29 2020, 5:06 PM.

Details

Summary

Current ArgPromotion implementation does not copy it: https://godbolt.org/z/zzTKof

Diff Detail

Event Timeline

wecing created this revision.Dec 29 2020, 5:06 PM
wecing requested review of this revision.Dec 29 2020, 5:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 29 2020, 5:06 PM
fhahn added a comment.Jan 7 2021, 6:12 AM

This looks reasonable to me. The load gets only moved, exactly the same value should get loaded. What about other metadata on loads?

wecing updated this revision to Diff 315275.Jan 7 2021, 4:38 PM

Copy all metadata.

ping -- @fhahn , does the new change look reasonable to you?

Copy all metadata.

Sorry for the delay. I am not sure if it is safe to copy all metadata, it would probably be better to add them one-by-one. I was mostly asking to check which other kinds would be safe. For example, debug metadata is function specific, so it can't simply be copied I think. Could you add a test with !dbg metadata?

wecing updated this revision to Diff 329845.Mar 10 2021, 10:02 PM

Only copy seleted metadata.

Thank you @fhahn for the response. I went through the metadata types mentioned in https://llvm.org/docs/LangRef.html#load-instruction, and those listed in https://llvm.org/docs/LangRef.html#metadata that mentiones could be used on 'load', and hand picked those I think should be safe to copy.

Since I do not have write access to the repo, if this change looks good to you, would you want to commit it on my behalf?

Only copy seleted metadata.

Thank you @fhahn for the response. I went through the metadata types mentioned in https://llvm.org/docs/LangRef.html#load-instruction, and those listed in https://llvm.org/docs/LangRef.html#metadata that mentiones could be used on 'load', and hand picked those I think should be safe to copy.

The attributes in question look like a good start to me, thanks.

llvm/test/Transforms/ArgumentPromotion/metadata.ll
6

If you are checking the full IR, you may want to use llvm/utils/update_test_checks.py to auto-generate the checks. The auto-generated checks use regular expressions + variables for value names, so they are less prone to break when value names change due to unrelated changes.

11

Would a simple pointer type also work, or do we need a struct?

wecing updated this revision to Diff 330086.Mar 11 2021, 3:35 PM

update test cases:

  • remove irrelevant constructions in test cases
  • use update_test_checks.py to generate regex-based checks
wecing marked 2 inline comments as done.Mar 11 2021, 3:37 PM
wecing added inline comments.
llvm/test/Transforms/ArgumentPromotion/metadata.ll
11

Thank you. Yes, a simple pointer is enough. The getelementptr is also unnecessary, so I also removed that.

fhahn accepted this revision.Mar 12 2021, 2:16 AM

LGTM, thanks! Just a few small remaining edits for the test. Please also let me know which name + email to use when I commit the change after the updates.

This revision is now accepted and ready to land.Mar 12 2021, 2:16 AM
fhahn added inline comments.Mar 12 2021, 2:18 AM
llvm/test/Transforms/ArgumentPromotion/metadata.ll
7

can you add a manual check line at the end of the file to check that the !0 metadata matches the expected range?

32

please also add a test for dereferenceable

wecing updated this revision to Diff 330390.Mar 12 2021, 3:45 PM
wecing marked an inline comment as done.
  • add test for !dereferenceable
  • check metadata objects after optimization
wecing marked 2 inline comments as done.Mar 12 2021, 3:46 PM

Please use "Chenguang Wang" and w3cing[at]gmail.com. Thank you @fhahn!

This revision was automatically updated to reflect the committed changes.

This is causing a runtime failure in one of our tests. Specifically due to the nonnull metadata. It appears that this should not be cloned because the circumstances under which it is valid and applied to the load in the callee may not exist in the caller. I'm not sure about whether this is an issue for some of the other metadata. I'm going to try to create a smaller test case. But perhaps this change should be reverted.

This is causing a runtime failure in one of our tests. Specifically due to the nonnull metadata. It appears that this should not be cloned because the circumstances under which it is valid and applied to the load in the callee may not exist in the caller. I'm not sure about whether this is an issue for some of the other metadata. I'm going to try to create a smaller test case. But perhaps this change should be reverted.

Here is a small test case. Note I have to split the invocation into 2 to get the problem to reproduce, due to phase ordering between InstCombine Calls that creates the !nonnull metadata and ArgPromotion that propagates it during promotion. We are hitting this in a ThinLTO build, where the !nonnull metadata is added during the initial compile, and after some additional inlining in the ThinLTO post link pipeline it then gets the argument promotion.

I am not sure if the other attributes being propagated could also hit issues like this. Can you revert while it is being investigated?

$ cat c.cc
struct Bar {
  int i;
};
struct Foo final {
  Bar *b;
  bool x;
};

__attribute__((noinline)) static int func(const Foo &f) {
  // This initially looked like:
  // __builtin_assume(!f.x || f.b != nullptr);
  // and in the original test case jump threading transformed it into the form
  // below with the assume split. Since I couldn't get jump threading to do this
  // for my small test case I have manually made the transformation here.
  if (!f.x)
    __builtin_assume(true);
  else
    __builtin_assume(f.b != nullptr);
  if (f.x)
    // Because this is essentially under the same condition as the
    // __builtin_assume(f.b != nullptr), the load of f.b gets the
    // !nonnull metadata.
    return f.b->i;
  return 0;
}

int caller(const Foo &f) {
  return func(f);
}

$ clang++ -O1 c.cc -flto=thin -S
# c.s has !nonnull on dereference in func()
$ opt < c.s -argpromotion -S
# !nonnull has been propagated to dereference now in caller(). However, this load is no longer guarded by the check of "if (f.x)"

Sorry about the inconveniences, @tejohnson. I am fine with rolling this back, but since I do not have write access to the repo, @fhahn, would you like to roll it back?


I found a much shorter program that could reproduce this issue:

%struct.Foo = type { %struct.Bar*, i8 }
%struct.Bar = type { i32 }

; Function Attrs: nounwind uwtable
define i32 @caller(%struct.Foo* dereferenceable(16) %0) local_unnamed_addr #0 {
  %2 = call i32 @func(%struct.Foo* dereferenceable(16) %0)
  ret i32 %2
}

; Function Attrs: nounwind uwtable
define internal i32 @func(%struct.Foo* dereferenceable(16) %0) unnamed_addr #0 {
  %2 = getelementptr inbounds %struct.Foo, %struct.Foo* %0, i64 0, i32 1
  %3 = load i8, i8* %2, align 8
  %4 = icmp eq i8 %3, 0
  br i1 %4, label %10, label %5

5:                                                ; preds = %1
  %6 = getelementptr inbounds %struct.Foo, %struct.Foo* %0, i64 0, i32 0
  %7 = load %struct.Bar*, %struct.Bar** %6, align 8, !nonnull !0
  %8 = getelementptr inbounds %struct.Bar, %struct.Bar* %7, i64 0, i32 0
  %9 = load i32, i32* %8, align 4
  br label %10

10:                                               ; preds = %1, %5
  %11 = phi i32 [ %9, %5 ], [ 0, %1 ]
  ret i32 %11
}

attributes #0 = { nounwind uwtable }

!0 = !{}

Running opt < arg_promo.ll -argpromotion -S > arg_promo.opt2.ll produces:

; ModuleID = '<stdin>'
source_filename = "<stdin>"

%struct.Foo = type { %struct.Bar*, i8 }
%struct.Bar = type { i32 }

; Function Attrs: nounwind uwtable
define i32 @caller(%struct.Foo* dereferenceable(16) %0) local_unnamed_addr #0 {
  %.idx = getelementptr %struct.Foo, %struct.Foo* %0, i64 0, i32 0
  %.idx.val = load %struct.Bar*, %struct.Bar** %.idx, align 8, !nonnull !0
  %.idx1 = getelementptr %struct.Foo, %struct.Foo* %0, i64 0, i32 1
  %.idx1.val = load i8, i8* %.idx1, align 8
  %2 = call i32 @func(%struct.Bar* %.idx.val, i8 %.idx1.val)
  ret i32 %2
}

; Function Attrs: nounwind uwtable
define internal i32 @func(%struct.Bar* %.0.0.val, i8 %.0.1.val) unnamed_addr #0 {
  %1 = icmp eq i8 %.0.1.val, 0
  br i1 %1, label %5, label %2

2:                                                ; preds = %0
  %3 = getelementptr inbounds %struct.Bar, %struct.Bar* %.0.0.val, i64 0, i32 0
  %4 = load i32, i32* %3, align 4
  br label %5

5:                                                ; preds = %2, %0
  %6 = phi i32 [ %4, %2 ], [ 0, %0 ]
  ret i32 %6
}

attributes #0 = { nounwind uwtable }

!0 = !{}

Copying !nonnull isn't the issue here; the root problem is, the second load (%7 from block %5) should not be promoted, since block %5 will not always be executed.

I can try to fix this once I get a chance, but I cannot guarantee an ETA.

nikic reopened this revision.Mar 26 2021, 2:04 PM
nikic added a subscriber: nikic.

Here is a further reduction:

define i32* @caller(i1 %c, i32** dereferenceable(8) %arg) {
  %p = call i32* @func(i1 %c, i32** dereferenceable(8) %arg)
  ret i32* %p
} 

define internal i32* @func(i1 %c, i32** dereferenceable(8) %arg) {
bb:
  br i1 %c, label %if, label %else
  
if:
  %p = load i32*, i32** %arg, align 8, !nonnull !{}
  ret i32* %p 

else:
  ret i32* null                                    
}

It's okay to promote the load, because the pointer is dereferenceable, but it's not okay to transfer the metadata. There are two pre-existing bugs in this respect: We were already transfering the AA metadata, which is also incorrect. And we were transferring the load alignment (this probably shouldn't be transferred at all but rather a precondition). We can only transfer metadata if the load is guaranteed-to-execute from the entry of the function.

This revision is now accepted and ready to land.Mar 26 2021, 2:04 PM

We were already transfering the AA metadata, which is also incorrect.

IIUC, lack of AA metadata wouldn't result in incorrect code, so that SGTM. But for this:

And we were transferring the load alignment (this probably shouldn't be transferred at all but rather a precondition)

I have an counterexample:

int func(int *p) {
    if (p % 4 != 0) {
        return load(p, align=1);
    }
    return 0;
}

From LLVM lang ref:

an omitted align argument means that the operation has the ABI alignment for the target... Overestimating the alignment results in undefined behavior... An alignment of 1 is always safe.

Not copying the align info should cause LLVM to fall back to load align 4, but since p % 4 != 0, it would produce undefined behavior.

So my guess is we should always copy align and !align; for the rest metadata items, as you suggested, only copy if they are guaranteed to be executed.

wecing updated this revision to Diff 334030.Mar 29 2021, 6:30 PM

Conditionally copy most metadata info.

Updated this patch according to my previous comment. Would you mind taking another look, @fhahn and @nikic ? Thanks!

ping... or are there any other owners of this pass?

nikic added inline comments.May 10 2021, 11:11 AM
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
319

Without taking a closer look at the overall patch, this check is insufficient, because not all instructions in the entry block are guaranteed to execute (e.g. due to unwinding). There is isGuaranteedToTransferExecutionToSuccessor to check this.

wecing updated this revision to Diff 344275.May 10 2021, 8:00 PM

Check OrigLoad reachability with isGuaranteedToTransferExecutionToSuccessor().

wecing marked an inline comment as done.May 10 2021, 8:00 PM
MatzeB added a subscriber: MatzeB.Jul 29 2021, 2:49 PM

Hey, I was just about to submit something like your first patch version here independently (and with the same bugs).

Anyway the latest version of the diff makes sense to me. I added some nitpicks but this is essentially good to go.

llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
320–327

avoid check for isOrigLoadGuaranteedToExecute in every loop iteration.

330–333

This already happens in line 312.

335–339

How about using llvm::copyMetadataForLoad from Transforms/Utils/Local.h at a first glance it seems to work fine even for loads in different functions.

wenlei added a subscriber: wenlei.Jul 29 2021, 3:05 PM
MatzeB added inline comments.Jul 29 2021, 3:41 PM
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
330–333

Please ignore my comment, just realized 312 is about alignment not AAInfo.

wecing marked 3 inline comments as done.Aug 1 2021, 10:10 AM
wecing added inline comments.
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
335–339

Quoting @fhahn's comment above:

I am not sure if it is safe to copy all metadata, it would probably be better to add them one-by-one. I was mostly asking to check which other kinds would be safe. For example, debug metadata is function specific, so it can't simply be copied I think.

llvm::copyMetadataForLoad also copies debug metadata, so I think it's not safe to use here. It might be better to just hand pick a few ones that are safe to copy.

wecing marked 2 inline comments as done.Aug 1 2021, 10:20 AM
wecing added inline comments.
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
320–327

We need to check the result of llvm::isGuaranteedToTransferExecutionToSuccessor() in every iteration anyways. The difference between my code and yours is minimal. :)

wecing marked an inline comment as done.Aug 1 2021, 10:21 AM
wecing updated this revision to Diff 363342.Aug 1 2021, 10:23 AM

Fix clang-tidy warning on variable naming style.

MatzeB accepted this revision.Aug 2 2021, 10:05 AM
MatzeB added inline comments.
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
320–327

Yep the difference is minuscule. Though as long as the code doesn't get more complicated, why not... Anyway no strong opinion if you want to keep as-is.

335–339

Maintaining a custom list of metadata to copy is not great for future extension. Whenever someone adds a new kind of metadata he has to be aware of this list, understand the ArgumentPromotion pass and then add the new metadata to the list or the new metadata will get lost by the optimization.

I wonder what makes debug info function-dependent. At a first glance it doesn't look like it is (given that the DILocations has a scope pointing to a DISubprogram which specifies the relevant function). Anyway I'm not too experienced here, so I may miss something.

Well I'm gonna accept the changes as-is, but I think it's not ideal to have a custom list.

nikic added inline comments.Aug 2 2021, 11:52 AM
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
314–320

Why is it okay to copy align metadata unconditionally? I would have expected it to get the same treatment as other metadata.

wecing added inline comments.Aug 4 2021, 9:07 PM
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
314–320

Not an expert on this but I would expect alignments for the same type should never change? The logic has been there for 11 years: https://github.com/llvm/llvm-project/commit/81c03447fc6358197258903e3fc6ee65b16ee768

nikic added inline comments.Sep 2 2021, 2:16 PM
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
314–320

These are two separate things: The align attribute specifies the alignment of the load operand. The align metadata specifies the alignment of the load result (for pointer results).

wecing abandoned this revision.Jan 13 2023, 9:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 13 2023, 9:52 AM
Herald added a subscriber: StephenFan. · View Herald Transcript