This is an archive of the discontinued LLVM Phabricator instance.

[llvm-reduce] Reducing attributes
ClosedPublic

Authored by lebedev.ri on Jul 7 2020, 3:26 PM.

Details

Summary

This handles all three places where attributes could currently be - GlobalVariable, Function and CallBase.
For last two, it correctly handles all three possible attribute locations (return value, arguments and function itself)

There was a previous attempt at it D73853,
which was committed in rGfc62b36a000681c01e993242b583c5ec4ab48a3c,
but then reverted all the way back in rGb12176d2aafa0ccb2585aa218fc3b454ba84f2a9
due to some (osx?) test failures.

Diff Detail

Event Timeline

lebedev.ri created this revision.Jul 7 2020, 3:26 PM
arsenm added a subscriber: arsenm.Jul 7 2020, 3:33 PM

One of the problems with bugpoint's attribute handling is it burns a lot of time trying to reduce attributes on intrinsics, which cannot actually be removed since getDeclaration will just put them back on

llvm/tools/llvm-reduce/DeltaManager.h
36

Doing this last is an improvement over bugpoint's attempt to do this first.

I don't think removing attributes is actually a great reduction strategy. For most of the hard to reduce testcases I debug, removing attributes is entirely pointless (and adding them is more helpful). I think this needs a flag to disable it.

I think it would be good in the description+commit message to reference https://reviews.llvm.org/D73853 (and that it was reverted).

Does llvm-reduce still have "interestingness tests" in llvm/test/Reduce/Inputs/? If so, shouldn't this change add such a test?

Thanks for the patch!

llvm/tools/llvm-reduce/DeltaManager.h
36

Counterpoint, I find removing attributes very helpful in reducing the amount of noise in reduced test cases and have had bugs when I needed to figure out which attribute was the source of differences in codegen.

I don't mind a flag (I don't think it's necessary, but doesn't hurt); but I'd prefer it to be default on so you can opt-out if you don't want to reduce attributes.

llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
147–149

does zip actually simplify this sequence? Looks kind of complicated.

178

I wish all these members of AttributeRemapper were private and these three loops maybe hidden in a method of AttributeRemapper.

lebedev.ri added a comment.EditedJul 7 2020, 4:01 PM

I think it would be good in the description+commit message to reference https://reviews.llvm.org/D73853 (and that it was reverted).

Does llvm-reduce still have "interestingness tests" in llvm/test/Reduce/Inputs/? If so, shouldn't this change add such a test?

*cough* llvm/test/Reduce/*.ll
There are (no more new!) llvm/test/Reduce/Inputs/* files because they are pointless.
As you can see we can just as easily use FileCheck to drive interestingness tests.

Thanks for the patch!

lebedev.ri edited the summary of this revision. (Show Details)Jul 7 2020, 4:03 PM
lebedev.ri edited the summary of this revision. (Show Details)

I think it would be good in the description+commit message to reference https://reviews.llvm.org/D73853 (and that it was reverted).

Does llvm-reduce still have "interestingness tests" in llvm/test/Reduce/Inputs/? If so, shouldn't this change add such a test?

*cough* llvm/test/Reduce/*.ll
There are (no more new!) llvm/test/Reduce/Inputs/* files because they are pointless.
As you can see we can just as easily use FileCheck to drive interestingness tests.

Might be worth proving it if you're bored and doing other janitorial work in the area. Having homogeneous tests is nice, since we'd rather future travelers use FileCheck for BOTH interesting-ness and functional testing of their reduction pass itself.

MaskRay added a subscriber: MaskRay.Jul 7 2020, 4:38 PM
MaskRay added inline comments.
llvm/test/Reduce/remove-call-site-attributes.ll
4

rm -f %t is sufficient.

21

Add -SAME: if applicable

llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
156

If it is non-deterministic, use stable_sort

dblaikie added inline comments.Jul 7 2020, 5:19 PM
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
61–62

range-based-for loop, probably?

108–109

drop this since it's implicit/default?

147–149

+1 to this.

std::vector<std::pair<unsigned, AttributeSet>> Res;
Res.reserve(AttributeSets.size());
for (const auto &V : AttributeSets)
  Res.push_back({V.first, convertAttributeRefToAttributeSet(C, V.second)});

Seems simpler.

153–155

Could roll this into the previous for loop, making the push_back conditional:

std::vector<std::pair<unsigned, AttributeSet>> Res;
Res.reserve(AttributeSets.size());
for (const auto &V : AttributeSets) {
  AttributeSet S = convertAttributeRefToAttributeSet(C, V.second);
  if (S.hasAttributes())
    Res.push_back({V.first, S}); // maybe std::move(S)? Not sure if it's sufficiently heavy to benefit from moving
}
170

why std::for_each rather than a range-based for loop?

lebedev.ri updated this revision to Diff 276363.Jul 8 2020, 3:38 AM
lebedev.ri marked 18 inline comments as done.

Thanks for taking a look!
Applied some refactoring, addressed some comments.

I will split some of the changes into separate commits when actually landing this.

llvm/tools/llvm-reduce/DeltaManager.h
36

We currently have no such options.
How about we deal with that afterwards, by just consistently adding one for each?

llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
61–62

Hm, can use either. Why not for_each ?

147–149

After some refactoring...

156

It is.

170

why range-based for loop rather than a std::for_each?

These cases i'll prefer to keep as for_each, because it really doesn't matter in which order we iterate here,
while in range-based for loop is more for traditional direct forward iteration.

178

The delta pass consists of three stages:

  1. just counting all the features
  2. enumerating each feature (in a stable order) and recording whether or not it is to be kept
  3. actually applying the rewrite from the previous step

Right now, i think each step is neatly separated into AttributeCounter,
AttributeRemapper and extractAttributesFromModule.
I think, sinking implementation detail of the last step into middle step will convolute things.

llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
61–62

I find the range-for more concise, FWIW.

69–71
if (AS.getNumAttributes())
  visitAttributeSet(AS, GlobalVariablesToRefine[&GV]);
lebedev.ri marked an inline comment as done.

Addressing review nit.

nickdesaulniers accepted this revision.Jul 8 2020, 10:36 AM

I'm not a fan of the inconsistent use of range-for and for-each; I would prefer range-for everywhere since it's more concise.

But I don't feel strongly enough to block the patch based on that. Maybe LLVM's style guide should provide clarity and guidance on the difference of opinion?

It might be polite to see if other reviewers have additional and timely feedback.

This revision is now accepted and ready to land.Jul 8 2020, 10:36 AM

I'm not a fan of the inconsistent use of range-for and for-each; I would prefer range-for everywhere since it's more concise.

My (consistently inconsistent) headcanon as to which to use when is that for_each should be used
when in principle we don't care in which order each item will be processed.

But I don't feel strongly enough to block the patch based on that. Maybe LLVM's style guide should provide clarity and guidance on the difference of opinion?

It might be polite to see if other reviewers have additional and timely feedback.

Sure, let's see if @dblaikie / etc have any other comments.

I'm not a fan of the inconsistent use of range-for and for-each; I would prefer range-for everywhere since it's more concise.

My (consistently inconsistent) headcanon as to which to use when is that for_each should be used
when in principle we don't care in which order each item will be processed.

I don't think that's how the rest of LLVM is written, nor probably a great model. std::for_each is guaranteed to visit the elements in order, so it doesn't have a different contract to a range-based-for loop & adds some extra syntax (the lambda introducers, etc), complications to error messages, etc.

But I don't feel strongly enough to block the patch based on that. Maybe LLVM's style guide should provide clarity and guidance on the difference of opinion?

I think this is one I'm willing to say LLVM convention's pretty clear - there's 30 calls to std::for_each across LLVM and subprojects - and about 25,000 range-based-for loops... - so please change these to range based for loops.

There's also llvm::for_each() with 40 uses.
Can you please quote specific part of the whatever documentation you believe dictates this?
If there isn't one, i'd like to see ProgrammersManual patch.

There's also llvm::for_each() with 40 uses.

Still a very tiny fraction of all iterations. I expect most (or at least an order of magnitude or two more than the <100 for_each()s) of those iterations don't depend on the order of iteration - so I don't think it provides the signal you're thinking of, at least not to enough other developers to be useful - making the code quirky/different in a way that I think is likely to be confusing to other readers ("Why isn't this a range-based for loop? is there some subtle difference in behavior that the author intended that I'm not understanding?" & there isn't)

Can you please quote specific part of the whatever documentation you believe dictates this?
If there isn't one, i'd like to see ProgrammersManual patch.

Perhaps this is sufficient: https://llvm.org/docs/CodingStandards.html#use-range-based-for-loops-wherever-possible

There's also llvm::for_each() with 40 uses.

Still a very tiny fraction of all iterations. I expect most (or at least an order of magnitude or two more than the <100 for_each()s) of those iterations don't depend on the order of iteration - so I don't think it provides the signal you're thinking of, at least not to enough other developers to be useful - making the code quirky/different in a way that I think is likely to be confusing to other readers ("Why isn't this a range-based for loop? is there some subtle difference in behavior that the author intended that I'm not understanding?" & there isn't)

Can you please quote specific part of the whatever documentation you believe dictates this?
If there isn't one, i'd like to see ProgrammersManual patch.

Perhaps this is sufficient: https://llvm.org/docs/CodingStandards.html#use-range-based-for-loops-wherever-possible

I'm afraid it is not. It only speaks about range-based for loop vs. old-style for loops.
If for_each is so bad, we really should proactively document it, so i'm still waiting on the link/patch.
Thanks.

This should skip trying to remove attributes from intrinsics

There's also llvm::for_each() with 40 uses.

Still a very tiny fraction of all iterations. I expect most (or at least an order of magnitude or two more than the <100 for_each()s) of those iterations don't depend on the order of iteration - so I don't think it provides the signal you're thinking of, at least not to enough other developers to be useful - making the code quirky/different in a way that I think is likely to be confusing to other readers ("Why isn't this a range-based for loop? is there some subtle difference in behavior that the author intended that I'm not understanding?" & there isn't)

Can you please quote specific part of the whatever documentation you believe dictates this?
If there isn't one, i'd like to see ProgrammersManual patch.

Perhaps this is sufficient: https://llvm.org/docs/CodingStandards.html#use-range-based-for-loops-wherever-possible

I'm afraid it is not. It only speaks about range-based for loop vs. old-style for loops.
If for_each is so bad, we really should proactively document it, so i'm still waiting on the link/patch.
Thanks.

I don't think this is an LLVM-specific C++ stylistic issue & there's lots of ways that we could write less than ideal code that we don't document in the style guide. (general guidance like "don't use dynamic allocation when static allocation will do" for instance) & I think this fits under that kind of category.

The existing uses of llvm::for_each look like mostly cases where an existing lambda is used in a few different places and it's slightly shorter/easier to use std::for_each than a range-based-for. I think those are OK/wouldn't object to them. But in these cases where a lambda expression is being passed immediately to for_each, it doesn't seem beneficial to me.

nickdesaulniers requested changes to this revision.Jul 8 2020, 12:41 PM

so i'm still waiting on the link/patch.

Treat your fellow contributors with more respect, please.  I know style disagreements aren't exciting, but we're all on the same team.  I much prefer LLVM's community to the Linux kernel's for a reason, and I think it's worthwhile to speak up in defense of it, lest it decay.  I'm not the best at maintaining this myself, so if you see me break my own standards, please feel empowered to call me out.

Another benefit of range for here is we don't need the braces, so these can be 2 lines instead of 3. Please change them to be consistent.

This revision now requires changes to proceed.Jul 8 2020, 12:41 PM
lebedev.ri updated this revision to Diff 276552.Jul 8 2020, 2:18 PM
lebedev.ri added a reviewer: arsenm.

Addressing nits:

  • It turns out, we can't add attributes to true intrinsics, so indeed, there is no point in trying to reduce them. Originally, i thought that wasn't the case and didn't want to deal with all that (with having to understand which attribute will just reappear if it is deleted) right away. So just ignore true intrinsics.
  • for_each changes

I will split this up into several commits when landing.

so i'm still waiting on the link/patch.

Treat your fellow contributors with more respect, please.  I know style disagreements aren't exciting, but we're all on the same team.

Please remember that any [self-respecting] community is diverse, and almost by definition
that includes people with different 'base' languages. American speak of sugar-coating
very message is neither universally-used nor is required,
and not always following it does not mean ill-intent.

I much prefer LLVM's community to the Linux kernel's for a reason, and I think it's worthwhile to speak up in defense of it, lest it decay.  I'm not the best at maintaining this myself, so if you see me break my own standards, please feel empowered to call me out.

Another benefit of range for here is we don't need the braces, so these can be 2 lines instead of 3. Please change them to be consistent.

I will split this up into several commits when landing.

What's the plan?

llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
75–77
// We can neither add nor remove attributes from intrinsics.
if (F.getIntrinsicID() == Intrinsic::not_intrinsic)
  visitAttributeList(F.getAttributes(), FunctionsToRefine[&F]);

I will split this up into several commits when landing.

What's the plan?

langref(?), test cleanup, for_each cleanup, rest of the patch

lebedev.ri added inline comments.Jul 8 2020, 2:47 PM
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
75–77

We can do that, with no line count diff and doesn't the comment loos context - right now it's accurately on the early-return?

I will split this up into several commits when landing.

What's the plan?

langref(?), test cleanup, for_each cleanup, rest of the patch

SGTM, if you want to split those off in child revisions/phab reviews I'd be happy to review/approve test cleanup, for_each, and the rebased patch for this feature.

I think you meant coding standard (as opposed to langref)? (Just checking, or was there a LangRef change, too, and I missed it?)

lebedev.ri updated this revision to Diff 276573.Jul 8 2020, 3:06 PM

I will split this up into several commits when landing.

What's the plan?

langref(?), test cleanup, for_each cleanup, rest of the patch

SGTM, if you want to split those off in child revisions/phab reviews I'd be happy to review/approve test cleanup, for_each, and the rebased patch for this feature.

Since that wasn't paired with an accept, i can only surmise that it wasn’t an offer.

nickdesaulniers accepted this revision.Jul 8 2020, 3:10 PM
This revision is now accepted and ready to land.Jul 8 2020, 3:10 PM

Hi, your git commit contains extra Phabricator tags. You can drop Reviewers: Subscribers: Tags: and the text Summary: from the git commit with the following script:

arcfilter () {
        arc amend
        git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F -
}

Reviewed By: is considered important by some people. Please keep the tag. (--date=now is my personal preference (author dates are usually not useful. Using committer dates can make log almost monotonic in time))

llvm/utils/git/pre-push.py can validate the message does not include unneeded tags.

so i'm still waiting on the link/patch.

Treat your fellow contributors with more respect, please.  I know style disagreements aren't exciting, but we're all on the same team.

Please remember that any [self-respecting] community is diverse, and almost by definition
that includes people with different 'base' languages. American speak of sugar-coating
very message is neither universally-used nor is required,
and not always following it does not mean ill-intent.

Sure enough - though the response did feel rather dismissive to me. I was explaining why I felt that the stylistic guidance was merited, but also wasn't what I think is best for the style guide (I think there's a lot of things that would be considered less readable that we can't encode in the style guide due to there being too many of them - that matter doesn't seem to have been answered/engaged with) and your response seemed to ignore the merits of that and abruptly repeat your point.

It's important for diversity to also create a welcoming community - that's not necessarily sugar coating, but engaging with kindness and understanding to other folks & hopefully be able to engage with the discussion in productive ways. None of us get this perfect all the time, to be sure.

gchatelet added inline comments.
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
87

@lebedev.ri I was updating seq in D103900 and an assertion fired at this location because seq is being called with begin being greater than end
seq(AL.index_begin(), AL.index_end()) => seq(4294967295, 3)

I'm not familiar with this code but it seems that in the context of function attributes begin_index and end_index have special meaning (See here and here).

Can you have a look?

lebedev.ri added inline comments.Jul 5 2021, 5:08 AM
llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
87

It probably should be an integer, not an unsigned then.