Page MenuHomePhabricator

[Attributor] Pass infrastructure and fixpoint framework
ClosedPublic

Authored by jdoerfert on Mar 27 2019, 11:53 PM.

Details

Summary
NOTE: Note that no attributes are derived yet. This patch will not go in
      alone but only with others that derive attributes. The framework is
      split for review purposes.

This commit introduces the Attributor pass infrastructure and fixpoint
iteration framework. Further patches will introduce abstract attributes
into this framework.

In a nutshell, the Attributor will update instances of abstract
arguments until a fixpoint, or a "timeout", is reached. Communication
between the Attributor and the abstract attributes that are derived is
restricted to the AbstractState and AbstractAttribute interfaces.

Please see the file comment in Attributor.h for detailed information
including design decisions and typical use case. Also consider the class
documentation for Attributor, AbstractState, and AbstractAttribute.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
chandlerc added inline comments.Apr 17 2019, 3:18 AM
llvm/include/llvm/Transforms/IPO/Attributor.h
255–261 ↗(On Diff #193113)

Note that these use doxygen syntax but aren't actually doxygen comments or terribly useful for that IMO.

I'd much rather more detailed doxygen comments on the specific types than trying to partition up the code into sections. Just my preference.

This revision now requires changes to proceed.Apr 17 2019, 3:18 AM

Initial high-level file comment, more to come!

It looks like this is a completely new approach to attribute inference compared to the post-order function atters pass?

It is, one of the reasons I put it in a separate file. " post-order function atters" is actually 5-7 separate "passes" over an SCC just living in the same LLVM-IR pass.

It'd be really good to actually spell it out nicely.

Good point. I guess I can put that in the commit message and also work on other descriptions.

Is this just version #2 of that? Can we use the same name, and put the code for this behind a flag or something to allow experimenting?

I "kinda" could, I prefer not to. There are various justifications, let me give two now and it that is not sufficient I'll provide more:

  1. "functionattrs" outgrow "function attributes" a while ago when it started to derive argument attributes. The attributor framework is designed to derive everything, from function, to argument and return attributes, over instruction "attributes/metadata" to things that do not have an LLVM-IR equivalent.
  2. The design, as well as the capabilities, of the function attribute deduction pass is fairly specific. I would like to prevent "reuse" in either direction as the Attributor framework is supposed to replace the existing code with better deduction across the board.

Generally, and I think this goes to the comments already made, I think this needs a design overview. This could be in the commit log, but increasingly I think we'd benefit from having a more high level practice of including design documents with commits like this. So either in the file comment of the header file for this framework, or if long/complex enough, in a markdown document referenced form the file comment, I'd like to see an overview of the design and approach you're pursuing to do more complete attribute deduction.

I totally agree, as I did with Florian who requested better documentation. I added a first draft of an high-level file comment today. I will also provide a more elaborate commit message and class comments asap. Some clang cuda bugs kept me busy all day :(

I also think we should be clear what is happening, so it would be good to come up with terms and define them precisely.

Seems very reasonable to me. I won't claim I have a clear terminology throughout the commits/comments.

I think "deduction" should be used more for the *computing* of attributes by reasoning about the properties of the IR. And "inference" should be adding attributes due to some extrinsic knowledge such as target library info, command line flags, etc. But totally open to different / better terminology.

Fine with me. I think I exclusively used "deduction" (so far) as it the expected usage model. I should probably add some Attributor documentation which could also specify how we talk about things. Until then, please let me know if there is anything you want me to clarify!

Improved and added various comments & documentation

jdoerfert marked 3 inline comments as done.Apr 18 2019, 1:20 PM
jdoerfert added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
255–261 ↗(On Diff #193113)

I did so in the header now, except the last "section" comment. I thought/hoped these will help read and extend the files later on. If that hope is not shared I can remove them everywhere.

llvm/lib/Transforms/IPO/Attributor.cpp
179 ↗(On Diff #193113)

I'm still debating with myself if this is necessary/useful or not.

jdoerfert edited the summary of this revision. (Show Details)Apr 22 2019, 9:34 AM
fhahn added a comment.May 7 2019, 12:56 PM

And finally, what can we do to check correctness? As the current attribute definitions are a bit fuzzy, it seems this could lead to problems down the road, in case we infer invalid attributes, which only get used (and cause problems) a bit later. Assuming we have a clear definition of a set of attributes, how hard would it be to verify the attributes in a module (maybe we are doing it already)?

I'm unsure I understand this question. If a fixpoint is reached, as reported by the AbstractAttribute::updateImpl methods, we should have derived "correct" information. Obviously, there might be a bug in the logic of the AbstractAttribute::updateImpl function, or the associated materialization, the dependence tracking in the attributor, etc. but I do not have a better solution than more (stress) tests (see D59903). I open a few bugs while I was developing attribute deductions in this framework and compare the results to the existing one. We current derive different things we should not as described in PR41336 and PR41328. More attributes will also trigger, or better expose, "bugs" down the line (see D59917) and we will probably have to manually look into the root cause as they pop up, was the attribute not correct or later reasoning based on it.

Right, my thinking for checking correctness was along the following lines: Given some input IR with attributes, it should be fairly easy to check if (some of the) provided attributes are consistent with each other and the IR, right? At least easier than deducing them ;) I was wondering if we could implement a function/module verifier to check certain attributes. That way, we could check that certain types of attributes deduced are consistent with the underlying IR, without the attributes being used for some other optimization.

llvm/include/llvm/Transforms/IPO/Attributor.h
29 ↗(On Diff #195802)

nit: not sure if automagically is the best way to put it.

271 ↗(On Diff #195802)

it's defined as struct?

llvm/lib/Transforms/IPO/Attributor.cpp
78 ↗(On Diff #195802)

This doesn't do anything at the moment?

106 ↗(On Diff #195802)

Would it make sense to put this in Attribute?

110 ↗(On Diff #195802)

could be shortened to just return Old.getValueAsInt() >= New.getValueAsInt()

319 ↗(On Diff #195802)

Passing in the whole Attributor seems a bit heavy handed.

Conceptually, IIUC, to update an attribute, we just need a way to query other attributes (and potentially some other IR info) as input from the attributor, right?

451 ↗(On Diff #195802)

Is the plan here for the Attributor to provide an interface for the Attribute implementations to query "interesting" instructions, instead of the Attribute implementations looking at the IR themselves? It seems this might impose a tighter coupling of Attributes <-> Attributor than intended with the flexible attribute registration system.

Related to the comment at line 319, it might be worth splitting this into a more restricted query/info interface that Attribute::update can use. It seems not related to the fixed point iteration in Attributor.

jdoerfert marked 7 inline comments as done.May 7 2019, 3:52 PM

And finally, what can we do to check correctness? As the current attribute definitions are a bit fuzzy, it seems this could lead to problems down the road, in case we infer invalid attributes, which only get used (and cause problems) a bit later. Assuming we have a clear definition of a set of attributes, how hard would it be to verify the attributes in a module (maybe we are doing it already)?

I'm unsure I understand this question. If a fixpoint is reached, as reported by the AbstractAttribute::updateImpl methods, we should have derived "correct" information. Obviously, there might be a bug in the logic of the AbstractAttribute::updateImpl function, or the associated materialization, the dependence tracking in the attributor, etc. but I do not have a better solution than more (stress) tests (see D59903). I open a few bugs while I was developing attribute deductions in this framework and compare the results to the existing one. We current derive different things we should not as described in PR41336 and PR41328. More attributes will also trigger, or better expose, "bugs" down the line (see D59917) and we will probably have to manually look into the root cause as they pop up, was the attribute not correct or later reasoning based on it.

Right, my thinking for checking correctness was along the following lines: Given some input IR with attributes, it should be fairly easy to check if (some of the) provided attributes are consistent with each other and the IR, right? At least easier than deducing them ;) I was wondering if we could implement a function/module verifier to check certain attributes. That way, we could check that certain types of attributes deduced are consistent with the underlying IR, without the attributes being used for some other optimization.

Yes, we could implement this check, and yes doing so in this framework would probably help. Unfortunately, the logic we need for determining attributes and the one we need to "verify" attributes is only similar but often not the same. That said, we can most probably share a lot of it. So if that is something we want to look into I would propose to add AbstractAttribute::verify functionality which, if overloaded, will try to "disprove" existing attributes and report them as errors/warnings/notes/remarks/... Some of the logic in the updateImpl and verify is then overlapping for most attributes. It is unclear to me right now if you would have any need for the fixpoint framework though. I guess you'll end up with a separate "AttributeVerify" pass instead.

llvm/include/llvm/Transforms/IPO/Attributor.h
29 ↗(On Diff #195802)

I'll remove the magic :(

271 ↗(On Diff #195802)

True, but there is little difference anyway, right? I use structs here as there are no members.
You want me to use "struct" in the comment as well?

llvm/lib/Transforms/IPO/Attributor.cpp
106 ↗(On Diff #195802)

Potentially, yes. I don't right now where we would need similar functionality but I can most certainly move it there either way.

110 ↗(On Diff #195802)

Will do.

319 ↗(On Diff #195802)

Correct, query other in-flight attributes and cached IR information.

I pass the Attributor because the above functionality is pretty much everything it exposes as public interface anyway (I think). Would you like to have another thin interface layer on top?

451 ↗(On Diff #195802)

The first part goes into a very interesting direction I have thought about as well. Without doing performance tests (mostly because I didn't have any attributes etc.) I choose this solution for "performance" reasons. If we invert it, which we still can and which I do not oppose, every attribute is queried for every instruction in order to decide if it is interesting to cache it, at least in the naive implementation I have in mind right now.

If we can come up with a good interface to expose IR information, probably cached, that would be great. If we put that, and the attribute query interface outside the Attributor, that would be OK. Though, the attributor will need to know about attribute queries (now through Attributor::getAAFor<...>(...) to track dependencies).

So far I decided this coupling is not too bad even though we have to hardcode "interesting" here. If no one has super strong feelings about this, I'd argue we go with this for now and once we have a good handle on the use cases we try to design a clean interface. But again, I will not oppose any proposal for such an interface right now.

@fhahn, I could split the Attributor and the part that identifies the AbstractAttributes in the IR.
This way the Attributor becomes more lightweight, there is less coupling, and its easier to reuse the framework for other AbstractAttributes.

jdoerfert updated this revision to Diff 199096.May 10 2019, 3:04 PM
jdoerfert marked an inline comment as done.
jdoerfert edited the summary of this revision. (Show Details)

Introduce a separate information cache (slims down the Attributor). Allow reuse
of the Attributor framework for a subset of the "default" attributes as well as
custom ones.

jdoerfert updated this revision to Diff 199097.May 10 2019, 3:29 PM

Fix DenseSet usage.

jdoerfert marked 5 inline comments as done.May 10 2019, 5:25 PM
jdoerfert added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
78 ↗(On Diff #195802)

It has a call site but since there are no attributes yet, nothing is done.
I wanted to keep it in this patch because it is part of the framework. Though, I expected this patch to land only with at least one attribute.

319 ↗(On Diff #195802)

I slimmed down the Attributor further. What do you think now?

451 ↗(On Diff #195802)

The new separation of InformationCache allows you to extend it for custom attributes. I kept the old way for "default" attributes for now.

fhahn added inline comments.May 14 2019, 9:45 AM
llvm/include/llvm/Transforms/IPO/Attributor.h
240 ↗(On Diff #199097)

I am not sure "Determine opportunities to derive 'default' attributes ' is very clear. This is where we add the initial abstract attributes to the state and also collect the IR info, right? Maybe worth stating that a bit more explicitly.

I am not sure what the long term plan for the distinction between 'default' and other attributes is, but it might be worth to only add this notation once it is required?

275 ↗(On Diff #199097)

32 entries per attribute seems a bit large at a first look. Might be interesting to gather some statistics on the average number of dependencies.

llvm/lib/Transforms/IPO/Attributor.cpp
273 ↗(On Diff #199097)

QueriedAAs is a SmallPtrSet, so the order we insert items into the worklist here is not deterministic. Is there anything preventing us from reaching different fixed points when iterating over items in different orders?

434 ↗(On Diff #199097)

nit: information.

449 ↗(On Diff #199097)

spurious newline?

460 ↗(On Diff #199097)

IMO this and above comment does not add any meaningful information.

463 ↗(On Diff #199097)

IMO this can also be dropped, as the string in the assert gives enough details about the check.

467 ↗(On Diff #199097)

te -> the

485 ↗(On Diff #199097)

runAttributorOnSCC(SCCFunctions) returns true if it changed the IR, right? Shouldn't the return values be switched, i.e. do not preserve anything, if something changed?

As a side note, shouldn't we be able to preserve any analysis that does not make use of attributes? From a correctness perspective, we should be able to preserve all analysis, but we would like to invalidate those that could make use of the additional info. AFAIK at least in the new pass manager, it should be possible to preserve function-level analysis in a module/SCC pass.

jdoerfert marked 4 inline comments as done.May 14 2019, 12:06 PM

I'll update the patch with the requested (small) changes asap.

Are there more design comments? (@chandlerc?)

llvm/include/llvm/Transforms/IPO/Attributor.h
240 ↗(On Diff #199097)

Mh, fair point. Though, I think it makes sense to have a set of attributes that live here and the ability to add some that do not.

Would it help if I list the "default" attributes? Do you think I should hide this interface for now?

275 ↗(On Diff #199097)

Fair point. I can do that with the attributes we have. (or ask the GSoC students to do it).

llvm/lib/Transforms/IPO/Attributor.cpp
273 ↗(On Diff #199097)

Good catch. Short answer, if the attribute update methods are sane and an optimistic fixpoint is reached, we should get a unique one. If not, there might be different ones reached.

I can make it all deterministic.

485 ↗(On Diff #199097)

Good catch.

As a side note, shouldn't we be able to preserve any analysis that does not make use of attributes? From a correctness perspective, we should be able to preserve all analysis, but we would like to invalidate those that could make use of the additional info. AFAIK at least in the new pass manager, it should be possible to preserve function-level analysis in a module/SCC pass.

I should be, and we should list some. I'll look into it. I *think*, right now the placement ensures that we do not invalidate anything.

fhahn added a comment.May 23 2019, 7:07 AM

As additional verification, we could run the attributor again, after materializing the changes from the initial run and check if no new facts are discovered, if the iteration limit is not exceeded. We have something similar in NewGVN and it has been helpful to surface missing links in the virtual representation/ non-determinism (links between the abstract attributes in the Attributor case for example).

As additional verification, we could run the attributor again, after materializing the changes from the initial run and check if no new facts are discovered, if the iteration limit is not exceeded. We have something similar in NewGVN and it has been helpful to surface missing links in the virtual representation/ non-determinism (links between the abstract attributes in the Attributor case for example).

I like this idea very much. I'll implement it (as a EXPENSIVE_CHECK) a verify method. Though, I can only check if no new facts were materialized because virtual facts might be rediscovered.

jdoerfert marked 9 inline comments as done.

Update according to comments by @fhahn

Avoid recursion during verification

jdoerfert marked an inline comment as done.Tue, May 28, 3:38 PM
jdoerfert added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
179 ↗(On Diff #201313)

I'm actually not happy anymore with the fact that AAType::ID is supposed to be of AttributeKind type because of string attributes. Since it is an integer in the maps already we can work around that but proposals to change this are welcome.

hfinkel added inline comments.Tue, May 28, 8:26 PM
llvm/include/llvm/Transforms/IPO/Attributor.h
25 ↗(On Diff #201313)

An example would be useful here for explaining the abstract attributes "(e,g,, something)".

48 ↗(On Diff #201313)

I'm not yet sure how to think about this. Many current attributes are derives with the help of LLVM utility functions which walk use-def chains, etc. How can an attributor use those utilities and know on what else it depends?

Or are we going to enhance all utilities to return an optional dependency vector or similar?

Or is the idea that we don't change the IR until the end, so all such utilities see only pessimistic information?

llvm/lib/Transforms/IPO/Attributor.cpp
365 ↗(On Diff #201313)

This is true even if we did't reach the fixpoint and has to reset some attributes to their pessimistic state?

401 ↗(On Diff #201313)

Are there supposed to be some other cases here?

jdoerfert marked 4 inline comments as done.Tue, May 28, 9:46 PM
jdoerfert added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
25 ↗(On Diff #201313)

I have an example in the description of getAAFor below. I will add "(see theAttributor::getAAFor description)" here.

48 ↗(On Diff #201313)

Or is the idea that we don't change the IR until the end, so all such utilities see only pessimistic information?

I strongly urge us *not* to change the IR until the end. That is what I have done so far in all the following attributor patches. Trying to clean up after optimistically modifying the IR seems like a very fragile approach. It breaks with caches all over the place and with the common concept of the IR as the origin of truth.

Or are we going to enhance all utilities to return an optional dependency vector or similar?

We can do that if necessary. I haven't so far. Just for the record, current way of using utilities doesn't benefit from "in-flight" information either. All new deductions have been strictly more powerful on the LLVM-TS and SPEC2006 benchmarks (if I remember correctly).

I'm not yet sure how to think about this. Many current attributes are derives with the help of LLVM utility functions which walk use-def chains, etc. How can an attributor use those utilities and know on what else it depends?

I have various patches to derive attributes already on phabricator so we can actually look at them in detail (see the Stack button above). Some design decisions are described in the class comment of the Attributor but let me rephrase with regards to your question:

If you use utility functions you get potentially pessimistic answers. You can walk def-use chains, etc., just fine but you should query attributes in flight whenever possible. By construction, these attributes should provide the best possible result. An attribute can rely on utility functions as well but should only do so after other means failed to provide justification for a better state. I don't exclude that we might want to re-implement/enhance utility functions at some point such that they can, in turn, use in-flight attributes.

llvm/lib/Transforms/IPO/Attributor.cpp
365 ↗(On Diff #201313)

[I read it as "Is this true...".]

No it is not. If we do not reach an optimistic fixpoint we force a pessimistic one. If we modify the IR, we can, in the re-run, again find a better stable state (which could be an optimistic or pessimistic fixpoint) that allows again to enhance the IR.

After thinking about the use of utility functions a bit more, I guess this can actually also happen if we have a fixpoint based on pessimistic information. Maybe we have to weaken the "llvm_unreachable" at some point, or we have to avoid/extend utility functions. For now, and with up to 32 iterations, I haven't seen a single non-optimistic fixpoint in all the test runs on LLVM-TS and SPEC2006.

401 ↗(On Diff #201313)

Are there supposed to be some other cases here?

Yes, at some point. In this patch *no* attributes are registered or derived and we also do not have "interesting" instructions yet. They are introduced with the attributes patches (see dependent patches).

I didn't know a better way to split this. I opted to keep the framework together and put the attributes in separate patches. If you prefer it any other way, I'd be more than happy do change it.

jdoerfert marked an inline comment as done.

Improve wording, grammar, and spelling of comments

jdoerfert updated this revision to Diff 202638.Sun, Jun 2, 9:10 PM

Make the Attributor a Module Pass and move it earlier in the pipeline. Disable it by default.

hfinkel accepted this revision.Mon, Jun 3, 8:49 AM

A couple of things, but otherwise this LGTM. We can iterate and add this this in tree (my preference is to get his basic infrastructure committed so that the commits that add deduction for particular attributes can all be separate).

llvm/lib/Transforms/IPO/Attributor.cpp
365 ↗(On Diff #201313)

Okay. I don't like theoretically-unsound checks (that aren't explicitly labeled as such), however, and the fact that this is enabled by default when EXPENSIVE_CHECKS is defined isn't good if this isn't sound. Please update the comment in the cl::opt to indicate that this might spuriously fail for pessimistic fixed points, and don't enable this by default with EXPENSIVE_CHECKS.

399 ↗(On Diff #202638)

Please add a comment here explaining what gets added here (so that it's clear why it's here and empty).

fhahn added a comment.Mon, Jun 3, 9:18 AM

I think it is still overly aggressive with analysis invalidation, but I agree it would be good to iterate on those things once the basics are in. Maybe it's worth a FIXME comment. Otherwise, looks good, thanks for the patience.

jdoerfert updated this revision to Diff 202837.Mon, Jun 3, 6:23 PM
jdoerfert marked 5 inline comments as done.

Addressed the comments. Will do a final run of the tests and commit then.

I think it is still overly aggressive with analysis invalidation, but I agree it would be good to iterate on those things once the basics are in. Maybe it's worth a FIXME comment. Otherwise, looks good, thanks for the patience.

I added the FIXME and I verified that at the new point in the (old PM) pipeline there is no pass that is destroyed because of the Attributor. It is not part of the new PM pipeline right now.

llvm/lib/Transforms/IPO/Attributor.cpp
365 ↗(On Diff #201313)

You are right. I did that now.

I did not before because for the first few attributes this should be sound. They do use utility functions but such that they should be able to get the best possible result even if the assumed information is not manifested in IR. The first attribute, returned on arguments, is one of them. I will find a way to run this such that only attributes with this property are verified and then I'll go back to "auto-verify" it when expensive checks are enabled.

399 ↗(On Diff #202638)

Done

This revision was not accepted when it landed; it landed in state Needs Review.Tue, Jun 4, 7:59 PM
This revision was automatically updated to reflect the committed changes.