This is an archive of the discontinued LLVM Phabricator instance.

[IR] Switch AttributeList to use an array for O(1) access
ClosedPublic

Authored by rnk on May 3 2017, 11:40 AM.

Details

Summary

Before this change, AttributeLists stored a pair of index and
AttributeSet. This is memory efficient if most arguments do not have
attributes. However, it requires doing a search over the pairs to test
an argument or function attribute. Profiling shows that this loop was
0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly
tests values for nullability.

This was worth about 2.5% of mid-level optimization cycles on the
sqlite3 amalgamation. Here are the full perf results:
https://reviews.llvm.org/P7995

Here are just the before and after cycle counts:

$ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null
    13,274,181,184      cycles                    #    3.047 GHz                      ( +-  0.28% )
$ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null
    12,906,927,263      cycles                    #    3.043 GHz                      ( +-  0.51% )

This patch *does not* change the indices used to query attributes, as
requested by reviewers. Tracking whether an index is usable for array
indexing is a huge pain that affects many of the internal APIs, so it
would be good to come back later and do a cleanup to remove this
internal adjustment.

Diff Detail

Repository
rL LLVM

Event Timeline

rnk created this revision.May 3 2017, 11:40 AM
rnk edited the summary of this revision. (Show Details)May 8 2017, 3:22 PM
rnk edited the summary of this revision. (Show Details)
rnk updated this revision to Diff 98222.May 8 2017, 3:24 PM

rebase

rnk added a comment.May 8 2017, 3:28 PM

I benchmarked opt on sqlite3.bc, and this was a nice improvement.

I could try to measure memory usage, but I've convinced myself that there should be the exact same number of AttributeLists as their were before.

One question remains: do you think I should completely encapsulate this indexing change by adding 1 inside AttributeList::getAttributes, or is it better to update all the callers as I did in this change? The problem with this change is that it will break out-of-tree backends and frontends that use the APIs with one-based indexing. It won't even be a compile failure because the APIs still take unsigned integers. That's a bit rude without some more hand waving on llvm-dev and release notes.

javed.absar added inline comments.May 9 2017, 5:24 AM
llvm/include/llvm/IR/Attributes.h
287 ↗(On Diff #98222)

Yes, make the code much neater. Unfortunately, because of legacy stuff, as you mention, adjustments are needed in lot of places. I think encapsulating the index change inside AttributeList might be neater (less pervasive change).

303 ↗(On Diff #98222)

Should pImpl also have 'private' access-specifier, given getImpl is moved to there?

rnk added inline comments.May 9 2017, 10:01 AM
llvm/include/llvm/IR/Attributes.h
287 ↗(On Diff #98222)

OK, I'll go ahead and split the changes.

303 ↗(On Diff #98222)

It is already private.

rnk edited the summary of this revision. (Show Details)May 9 2017, 10:06 AM
rnk marked an inline comment as done.May 9 2017, 10:33 AM
In D32819#749163, @rnk wrote:

One question remains: do you think I should completely encapsulate this indexing change by adding 1 inside AttributeList::getAttributes, or is it better to update all the callers as I did in this change? The problem with this change is that it will break out-of-tree backends and frontends that use the APIs with one-based indexing. It won't even be a compile failure because the APIs still take unsigned integers. That's a bit rude without some more hand waving on llvm-dev and release notes.

@javed.absar suggested that this was worth it, so I tried it, but I don't like the result. It means this doesn't work anymore:

// Do lookups for all attribute groups.
for (unsigned i = 0, e = PAL.getNumAttrSets(); i != e; ++i) {
  AttributeSet AS = PAL.getAttributes(i);
  if (!AS.hasAttributes())
    continue;
  IndexAndAttrSet Pair = {i, AS};
  unsigned &Entry = AttributeGroupMap[Pair];
  if (Entry == 0) {
    AttributeGroups.push_back(Pair);
    Entry = AttributeGroups.size();
  }
}

We would need to expose something like index_begin() / index_end() / indices() to make it easy to iterate over AttributeList indices. If we need to have a loop over indices, that loop should start at zero. Anything else is surprising.

llvm/include/llvm/IR/Attributes.h
287 ↗(On Diff #98222)

I tried this and I don't like the result. I'll explain more in a top-level comment.

In D32819#749163, @rnk wrote:

One question remains: do you think I should completely encapsulate this indexing change by adding 1 inside AttributeList::getAttributes, or is it better to update all the callers as I did in this change? The problem with this change is that it will break out-of-tree backends and frontends that use the APIs with one-based indexing. It won't even be a compile failure because the APIs still take unsigned integers. That's a bit rude without some more hand waving on llvm-dev and release notes.

I'm very concerned about breaking out-of-tree code in a way that's not detectable at compile time. We have a whole bunch of out-of-tree passes that will be affected by this (...I'll fix them since I know about it) but others may not be watching the mailing lists as closely. Of course we make no guarantees about C++ API stability, but I feel we shouldn't cause unnecessary pain and breakage if it can be avoided.

How about renaming the APIs that take or return arg indices? That'll make it abundantly clear that their behavior has changed (and will also easily flag up all the places in upstream code that need to be changed).

ahatanak added inline comments.
llvm/include/llvm/IR/Attributes.h
484 ↗(On Diff #98222)

Is it necessary to have two functions, empty and isEmpty, which have the same functionality?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
4959 ↗(On Diff #98222)

Is this change related to the changes in AttributeList?

llvm/lib/IR/Attributes.cpp
843 ↗(On Diff #98222)

Since Attrs is already sorted when it's passed to this function, can you just look at the last element to find the max index?

895 ↗(On Diff #98222)

Space before and after +.

rnk updated this revision to Diff 98492.May 10 2017, 11:07 AM
rnk marked 2 inline comments as done.
  • Address Akira's comments
llvm/include/llvm/IR/Attributes.h
484 ↗(On Diff #98222)

I never actually ended up using this or size(), so I'll remove them. We can standardize on the STL names later.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
4959 ↗(On Diff #98222)

I will rebase and it should go away. This was in my local master, but not origin/master, which arcanist diffs against.

llvm/lib/IR/Attributes.cpp
843 ↗(On Diff #98222)

You're right, I can.

rnk added a comment.May 11 2017, 6:06 PM
In D32819#749163, @rnk wrote:

One question remains: do you think I should completely encapsulate this indexing change by adding 1 inside AttributeList::getAttributes, or is it better to update all the callers as I did in this change? The problem with this change is that it will break out-of-tree backends and frontends that use the APIs with one-based indexing. It won't even be a compile failure because the APIs still take unsigned integers. That's a bit rude without some more hand waving on llvm-dev and release notes.

I'm very concerned about breaking out-of-tree code in a way that's not detectable at compile time. We have a whole bunch of out-of-tree passes that will be affected by this (...I'll fix them since I know about it) but others may not be watching the mailing lists as closely. Of course we make no guarantees about C++ API stability, but I feel we shouldn't cause unnecessary pain and breakage if it can be avoided.

Having just debugged a lot of "index out of bounds" type runtime failures, I generally agree, I don't want to put others through it. :)

How about renaming the APIs that take or return arg indices? That'll make it abundantly clear that their behavior has changed (and will also easily flag up all the places in upstream code that need to be changed).

I don't want to use bad names (hasAttributeEx?) just for backwards compatibility. One idea I had was to make a better API, one that has no attribute list indices at all, but there are a handful of places in LLVM that abstract over argument and return attributes with attribute list indices, and I couldn't find a way to replace them. Maybe rename the old APIs to hasRetOrArgAttr or something painfully ugly like that to discourage people from using them? We'd have (has|get|add|remove)ParamAttr, *RetAttr, *FnAttr to make it easier to migrate code to do the right thing.

Anyway, I'll upload a patch that retains compatibility. I want to get the performance win before dealing with that.

rnk edited the summary of this revision. (Show Details)May 11 2017, 6:07 PM
rnk updated this revision to Diff 98708.May 11 2017, 6:08 PM
  • Undo AttributeList indexing change
rnk updated this revision to Diff 98709.May 11 2017, 6:09 PM
  • rebase to remove undesired changes
chandlerc edited edge metadata.May 11 2017, 6:10 PM
In D32819#752793, @rnk wrote:
In D32819#749163, @rnk wrote:

One question remains: do you think I should completely encapsulate this indexing change by adding 1 inside AttributeList::getAttributes, or is it better to update all the callers as I did in this change? The problem with this change is that it will break out-of-tree backends and frontends that use the APIs with one-based indexing. It won't even be a compile failure because the APIs still take unsigned integers. That's a bit rude without some more hand waving on llvm-dev and release notes.

I'm very concerned about breaking out-of-tree code in a way that's not detectable at compile time. We have a whole bunch of out-of-tree passes that will be affected by this (...I'll fix them since I know about it) but others may not be watching the mailing lists as closely. Of course we make no guarantees about C++ API stability, but I feel we shouldn't cause unnecessary pain and breakage if it can be avoided.

Having just debugged a lot of "index out of bounds" type runtime failures, I generally agree, I don't want to put others through it. :)

How about renaming the APIs that take or return arg indices? That'll make it abundantly clear that their behavior has changed (and will also easily flag up all the places in upstream code that need to be changed).

I don't want to use bad names (hasAttributeEx?) just for backwards compatibility. One idea I had was to make a better API, one that has no attribute list indices at all, but there are a handful of places in LLVM that abstract over argument and return attributes with attribute list indices, and I couldn't find a way to replace them. Maybe rename the old APIs to hasRetOrArgAttr or something painfully ugly like that to discourage people from using them? We'd have (has|get|add|remove)ParamAttr, *RetAttr, *FnAttr to make it easier to migrate code to do the right thing.

FWIW, I really like this plan. Keep the "compatible" indexing API but make it much more clear why it's a bad idea with a simple rename. Wait some time unit, then introduce new API that doesn't have these failuremodes with a nice set of names.

rnk updated this revision to Diff 98710.May 11 2017, 6:13 PM
  • Port changes addressing Akira's comments to the branch with no indexing change
rnk added a reviewer: hans.May 22 2017, 12:38 PM
rnk updated this revision to Diff 99793.May 22 2017, 12:38 PM
rnk removed a reviewer: hans.
  • rebase
rnk added a comment.May 22 2017, 12:39 PM

Added Hans as a reviewer. I think there is consensus from code owners that this is the right thing to do, but I'm still looking for a code reviewer with time.

(FWIW, given the overall plan, I'm very happy for Hans to handle the code review.)

hans added a subscriber: hans.May 22 2017, 4:44 PM
hans added inline comments.
llvm/lib/IR/AttributeImpl.h
235 ↗(On Diff #99793)

Not related to your patch, but why is this needed?

llvm/lib/IR/Attributes.cpp
813 ↗(On Diff #99793)

I might have gone for Sets[0] to make it more obvious that's what the assert above is about. Or would it make sense to just do Sets[attrIdxToArrayIdx(AttributeList::FunctionIndex)]?

815 ↗(On Diff #99793)

Not related to this patch of course, but is there any reason to not just use 1ULL?

847 ↗(On Diff #99793)

I suppose we'd still assert when hitting the AttributeListImpl constructor, but maybe this is worth keeping to assert early?

939 ↗(On Diff #99793)

I think it would be more clear to loop over ArgAttrs with an index and just set LastAttrPos = I + 2. Or see my suggestion further down.

951 ↗(On Diff #99793)

I didn't know that. Is it worth asserting?

954 ↗(On Diff #99793)

Are there always RetAttrs if there are ArgAttrs? Oh, it doesn't matter; pushing an empty RetAttrs does the job just as well.

957 ↗(On Diff #99793)

I had to do a double-take on that -1, but it looks right.

I wonder if this function could be simpler though. Perhaps first loop from the back of ArgAttrs to find the last one that has attributes, then based on that and .hasAttributes on FnAttrs and RetAttrs, either return early or push into the SmallVector straight up?

969 ↗(On Diff #99793)

Is 5 significant?

998 ↗(On Diff #99793)

Below you just do for (AttributeList List : Attrs, i.e. no auto and no reference. That seems nicer.

1035 ↗(On Diff #99793)

Maybe fold into the previous statement? unsigned MaxIndex = attrIdxToArrayIdx(Indices.back()).

llvm/lib/IR/Instructions.cpp
457 ↗(On Diff #99793)

Use ReturnIndex instead of i?

rnk marked 5 inline comments as done.May 23 2017, 9:40 AM
rnk added a subscriber: rsmith.

Thanks!

llvm/lib/IR/AttributeImpl.h
235 ↗(On Diff #99793)

This appears to inhibit sized deallocation. @rsmith added it in r260180. Perhaps it deserves a comment.

llvm/lib/IR/Attributes.cpp
813 ↗(On Diff #99793)

Perhaps, but the only reason that .front() or [0] are safe is because we asserted above that Sets isn't empty, so we still need some assertion or comment indicating that function attributes are always available. I'll do Set[0].

957 ↗(On Diff #99793)

I implemented this, and I also switched to a set count instead of a "last position" variable, which I think improves things a lot. I don't know why I used "LastPos" instead of a size or count variable. I think that is what was introducing these confusing off by one calculations.

969 ↗(On Diff #99793)

Not really, let's go to 8 for consistency. That is one thing I hate about SmallVector: you are forced to invent magic numbers all the time, and it makes the reader think the number was tuned. :)

rnk updated this revision to Diff 99935.May 23 2017, 9:40 AM
  • Address Hans's comments
hans accepted this revision.May 23 2017, 9:47 AM

lgtm

This revision is now accepted and ready to land.May 23 2017, 9:47 AM
This revision was automatically updated to reflect the committed changes.