This is an archive of the discontinued LLVM Phabricator instance.

[Attributor] Add lightweight version for attribute deduction only
ClosedPublic

Authored by fhahn on Jun 3 2023, 1:21 PM.

Details

Summary

This patch adds a lightweight instance of Attributor that only deduces
attributes.

This is just an initial version and I might be missing some attributes,
but the goal is to have a version that only focuses on attributes to
replace the function-attrs pass. Please let me know if I am missing
any important knobs for the Attributor.

The initial version has a few open issues, the main one probably being
compile time. The main additional functionality I'd like to see in genera
l is propagating attributes to call sites.

Open issues:

  • some missed attribute inference

Diff Detail

Event Timeline

fhahn created this revision.Jun 3 2023, 1:21 PM
Herald added a project: Restricted Project. · View Herald Transcript
fhahn requested review of this revision.Jun 3 2023, 1:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2023, 1:21 PM
nikic added inline comments.Jun 3 2023, 1:35 PM
llvm/lib/Passes/PassBuilderPipelines.cpp
908 ↗(On Diff #528152)

This doesn't look right. You're effectively disabling most attribute inference here.

1297 ↗(On Diff #528152)

Note that what you're replacing here is the late module RPO inference pass, which only infers norecurse. All other attributes are inferred by the CGSCC pass. Doing the inference here is too late (at least for non-LTO scenarios), we need this to happen as part of the CGSCC pipeline.

goldstein.w.n added inline comments.
llvm/lib/Passes/PassBuilderPipelines.cpp
1297 ↗(On Diff #528152)

We also want to make sure we do callsite inference before inlining, otherwise a lot of the attribute information will be lost.

re:

There's an instcombine
patch D151943 that attempts to extend InstCombine to do that. IMO it
would probably preferable to do it directly when deriving the
attributes. That means either switching to attributor or extending
FunctionAttrs.

I think that makes sense, the only semi concern is FunctionAttr (and lesser
degree attributor) are IPO. Deducing for a callsite within a function is
inherently local. If the IPO passes will always be on and definetly run
before inlining I have no gripes, but if thats not always the case think it
may pay to split up callsite inference from function attr inference.

goldstein.w.n added a comment.EditedJun 3 2023, 2:57 PM

Also for reference the instcombine changes have simliar compile time impact:
https://llvm-compile-time-tracker.com/compare.php?from=9f94b2739fc445754c9e0cdbc91853197c287874&to=e34dd68718ef1d651f6eccc3ae4b5689d8d3b516&stat=instructions%3Au

Seems to be faster for -O3 and -O3-thinlto, but slow for proper LTO.

(The functionattr only version that is. The patch also replace CGSCC causes significant more compile time cost than the instcombine version).

fhahn updated this revision to Diff 528229.Jun 4 2023, 9:20 AM

Split off pipeline changes to D152104.

Compile time increase by +3.56% for O3 (details in link below) when replacing the CGSCC runs only. @jdoerfert any ideas on how this could be improved?

https://llvm-compile-time-tracker.com/compare.php?from=259b1949a54864afd23ae0efd8ca50540917d4c1&to=20e4eae6ebb3dd58d0f779230887d885b4f9aad2&stat=instructions:u

fhahn marked 3 inline comments as done.Jun 4 2023, 9:22 AM
fhahn added inline comments.
llvm/lib/Passes/PassBuilderPipelines.cpp
908 ↗(On Diff #528152)

Thanks, something went wrong when squashing various changes.

1297 ↗(On Diff #528152)

I evaluated compile-time for various combinations, and that would be the one with the least amount of impact.

Compile-time impact with CGSCC only is in my latest comment. Pipeline code has been split off from the patch to D152104

fhahn added a reviewer: arsenm.Jun 4 2023, 9:22 AM
fhahn marked 2 inline comments as done.
arsenm added inline comments.Jun 4 2023, 9:29 AM
llvm/lib/Transforms/IPO/Attributor.cpp
3823

Can you add AANoFPClass?

arsenm added inline comments.Jun 4 2023, 9:31 AM
llvm/lib/Passes/PassRegistry.def
45

I think attributor-light would be better name so they sort together

arsenm added inline comments.Jun 4 2023, 9:42 AM
llvm/include/llvm/Transforms/IPO/Attributor.h
3382

Document how the light one differs from the full

Stupid question: could the attributer become an analysis? Then you can query and do caching?

It learns new attributes over time and becomes an even more powerful analysis.

I saw this, will comment tomorrow, hopefully with actual results.

Stupid question: could the attributer become an analysis? Then you can query and do caching?

It learns new attributes over time and becomes an even more powerful analysis.

One can do it. For starters, you can allow it to manifest (as the current setup does) and then continue to use the AA results for things that could not be put in the IR.
However, the tricky part is invalidation and updates. The initial manifest is necessary since some AAs assume it will happen, and only then the information is correct, e.g., they assume the load is replaced with a constant value. If you do not manifest in that case, problems will immediately appear.

I was thinking about AttributerAnalysis, AttributerFuncAttrs, and AttributerOpt.

jdoerfert added a comment.EditedJun 7 2023, 4:26 PM

I have been on this the last two days, will need till the end of the week for patches and proper eval across CTMark.

Quick update, all relative to my favorite source file ((llvm-test-suite/MultiSource/Applications/SPASS/clause.c):
I modified the lightweight pass a little and disabled some features we don't need in the beginning via new switches.
The CGSCC pass is run instead the (full) FunctionAttrCGSCC pass for testing.
If I run them both, the FunctionAttrCGSCC does not add any attributes (according to the stats). (my version of Attributor!)
I currently see the following overall time results (-O3):

0.0116 (  1.5%)   0.0021 (  3.5%)   0.0137 (  1.6%)   0.0137 (  1.6%)  PostOrderFunctionAttrsPass

and

0.0171 (  2.2%)   0.0037 (  5.2%)   0.0208 (  2.4%)   0.0209 (  2.4%)  LightweightAttributorCGSCCPass

I think the initial results I got were ~2x that. So, on this file, and at this moment, we would add ~0.8% compile time for better attribute deduction.
Here are the stats results:

168 function-attrs               - Number of functions with improved memory attribute
  2 function-attrs               - Number of function returns marked noalias
223 function-attrs               - Number of arguments marked nocapture
 52 function-attrs               - Number of functions marked as nofree
160 function-attrs               - Number of functions marked as norecurse
167 function-attrs               - Number of functions marked as nosync
 62 function-attrs               - Number of arguments marked readnone
161 function-attrs               - Number of arguments marked readonly
 32 function-attrs               - Number of arguments marked returned
145 function-attrs               - Number of functions marked as willreturn
 24 function-attrs               - Number of arguments marked writeonly

vs

5081 attributor                   - Number of abstract attributes created
1886 attributor                   - Number of abstract attributes manifested in IR
2834 attributor                   - Number of abstract attributes in a valid fixpoint state
 301 attributor                   - Number of functions with exact definitions
   1 attributor                   - Number of functions without exact definitions
 223 attributor                   - Number of arguments marked 'nocapture'
 242 attributor                   - Number of arguments marked 'nofree'
  62 attributor                   - Number of arguments marked 'readnone'
 164 attributor                   - Number of arguments marked 'readonly'
  32 attributor                   - Number of arguments marked 'returned'
  31 attributor                   - Number of arguments marked 'writeonly'
  53 attributor                   - Number of call site arguments marked 'nocapture'
  99 attributor                   - Number of call site arguments marked 'nofree'
   4 attributor                   - Number of call site arguments marked 'readnone'
  36 attributor                   - Number of call site arguments marked 'readonly'
   1 attributor                   - Number of call site arguments marked 'writeonly'
  11 attributor                   - Number of call site marked 'willreturn'
 187 attributor                   - Number of function with known return values
 187 attributor                   - Number of function with unique return
   2 attributor                   - Number of function returns marked 'noalias'
 170 attributor                   - Number of functions marked 'nofree'
 300 attributor                   - Number of functions marked 'norecurse'
 167 attributor                   - Number of functions marked 'nosync'
  62 attributor                   - Number of functions marked 'readnone'
  56 attributor                   - Number of functions marked 'readonly'
 151 attributor                   - Number of functions marked 'willreturn'
  20 attributor                   - Number of functions marked 'writeonly'

Long story short, I think I can shave off a little more time and then we re-run proper testing.
We can also limit the deductions further, as you can see, it tracks, and annotates, way more than the existing pass.

P.S. If people want to follow my changes, feel free: https://github.com/jdoerfert/llvm-project/tree/compile_time_light

jdoerfert added a comment.EditedJun 8 2023, 5:43 PM

I didn't manage to work on this today, and I doubt I will be able tomorrow. Anyway, here are the compile time results when we use a slight variation of this pass rather than the FunctionAttr pass:
[EDIT] Actual prelim results:
http://llvm-compile-time-tracker.com/compare.php?from=cb17c48fdd8c0f8ab6cd7ad3f197b052db20d1f6&to=d0a19bd1146e84157de4f68fd2cbaa63398ef674&stat=instructions%3Au

nikic added a comment.Jun 9 2023, 12:37 AM

I didn't manage to work on this today, and I doubt I will be able tomorrow. Anyway, here are the compile time results when we use a slight variation of this pass rather than the FunctionAttr pass:
https://llvm-compile-time-tracker.com/index.php?config=NewPM-O3&stat=instructions%3Au&remote=jdoerfert

That doesn't seem to actually enable the new pass... It's commented out in PassBuilderPipelines.

jdoerfert added a comment.EditedJun 9 2023, 9:42 AM

I didn't manage to work on this today, and I doubt I will be able tomorrow. Anyway, here are the compile time results when we use a slight variation of this pass rather than the FunctionAttr pass:
https://llvm-compile-time-tracker.com/index.php?config=NewPM-O3&stat=instructions%3Au&remote=jdoerfert

That doesn't seem to actually enable the new pass... It's commented out in PassBuilderPipelines.

That happens if I rush stuff. I was collecting the stats for function-attrs for the post when I pushed the changes to my branch, yesterday I just grabbed that and run perf results. My bad. Let me try to rectify this.

[EDIT] Actual prelim results:
http://llvm-compile-time-tracker.com/compare.php?from=cb17c48fdd8c0f8ab6cd7ad3f197b052db20d1f6&to=d0a19bd1146e84157de4f68fd2cbaa63398ef674&stat=instructions%3Au

I will try to look at sqlite and see what's going on. A pathological case might make it simpler to find some saving potential.
That said, I first have to verify this is due to the pass and not a secondary effect.

The sqlite slowdown seems to be a secondary effect:
With the attributor pass:

0.0815 (  0.8%)   0.0058 (  1.9%)   0.0873 (  0.9%)   0.0875 (  0.9%)  LightweightAttributorCGSCCPass

with the function attributes pass:

0.0793 (  0.8%)   0.0044 (  1.5%)   0.0837 (  0.9%)   0.0839 (  0.9%)  PostOrderFunctionAttrsPass

Here is the -stats output for both runs: https://gist.github.com/jdoerfert/b29c4ac1f853e689fe5933e820766c84
Here is the pass time for both runs: https://gist.github.com/jdoerfert/d38cd4d9cbd885522752d4c39f79c322

Not super sure where the +3% come from, looks like many smaller things.

Thanks @jdoerfert!

[EDIT] Actual prelim results:
http://llvm-compile-time-tracker.com/compare.php?from=cb17c48fdd8c0f8ab6cd7ad3f197b052db20d1f6&to=d0a19bd1146e84157de4f68fd2cbaa63398ef674&stat=instructions%3Au

Those look quite promising. I'll try to address the comments here tomorrow and check what other cases are missing compared to function-attrs. Please let me know if there's anything I can do to help to address the remaining compile-time issues. Once we are confident that we can get this to acceptable compile-time, I think it makes sense to commit this patch to iterate on missing deductions and enablement.

fhahn updated this revision to Diff 537861.Jul 6 2023, 1:31 PM
fhahn marked 2 inline comments as done.

Address comments and rebase. Renamed to attributor-light, enabled more fn-attrs tests, add missing attributes.

fhahn marked an inline comment as done.Jul 6 2023, 1:33 PM
fhahn added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
3382

Added a short comment, thanks!

llvm/lib/Passes/PassRegistry.def
45

Updated, thanks!

llvm/lib/Transforms/IPO/Attributor.cpp
3823

Done in latest version, thanks!

fhahn updated this revision to Diff 537867.Jul 6 2023, 1:40 PM
fhahn marked an inline comment as done.

Add noalias attribute, restore full nonnull tests.

Overall LGTM. We can iterate in tree. A few suggestions below.

llvm/lib/Passes/PassRegistry.def
212

Can you please add it here too

llvm/lib/Transforms/IPO/Attributor.cpp
3843

AC.UseLiveness = false;

3968
3995
fhahn updated this revision to Diff 538226.Jul 7 2023, 12:19 PM
fhahn marked 2 inline comments as done.

Address latest comments, thanks!

fhahn marked an inline comment as done.Jul 7 2023, 12:36 PM
fhahn added inline comments.
llvm/lib/Passes/PassRegistry.def
212

Done, also added it to one of the tests.

llvm/lib/Transforms/IPO/Attributor.cpp
3843

Done, thanks!

3968

Done thanks!

nikic added inline comments.Jul 7 2023, 12:38 PM
llvm/lib/Transforms/IPO/Attributor.cpp
3841

I would suggest to not include any attributes that are not inferred by FunctionAttrs (or required by Attributor internally) initially. This is something we can consider after it is enabled by default. I think it will be challenging enough without mixing it together with new attributes.

fhahn marked 2 inline comments as done.Jul 7 2023, 12:47 PM
fhahn added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
3841

SGTM! FunctionAttrs has some limited support for inferring nonnull/noalias for function returns, not sure if it is worth trying to emulate that or to just drop them initially?

jdoerfert added inline comments.Jul 11 2023, 1:15 AM
llvm/lib/Transforms/IPO/Attributor.cpp
3841

I'm comparing a light attributor with funcattrs now. I'll make a suggestions what AAs to include and what emulation we can setup for now. We should get this in and adjust after. We also need to increase test coverage with attributor tests for the new pass, at least the one we want to run in O3.

fhahn updated this revision to Diff 543211.Jul 22 2023, 11:16 AM
fhahn marked an inline comment as done.

Rebase, removed NoAlias inferrence.

fhahn added inline comments.Jul 22 2023, 11:17 AM
llvm/lib/Transforms/IPO/Attributor.cpp
3278–3281

I noticed that checkAndQueryIRAttr would add additional attributes if it's directly implied by the IR, which is not what we want at least to start with; I updated things here to check Allowed for now, to avoid inferring additional attributes that aren't part of the Allowed set.

jdoerfert added inline comments.Jul 23 2023, 1:29 PM
llvm/lib/Transforms/IPO/Attributor.cpp
3278–3281

I don't see the point in this, honestly. This will only add known stuff, and we already don't have a 1:1 mapping, so it's unclear why we artificially prevent some known deductions that should not cost anything.

fhahn marked an inline comment as done.Jul 24 2023, 2:53 AM
fhahn added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
3278–3281

The main aim is to try to keep the inferred attributes as similar to legacy FunctionAttrs as possible, not necessarily for compile-time, but to also limit the functional differences. As @nikic mentioned earlier, the transition has the potential to expose a number of issues (incorrect inference of attributes, incorrect use of attributes in optimizations, UB in the user source code that now is exploited and causes mis-compiles).

To keep those issues manageable (and the pain to users to a minimum), IMO it would be desirable to only infer the attributes explicitly opted-in. Even though the attributes may be implied by the existing IR directly, it is still using code from the abstract attributes, so I think it would make sense to only apply it if the attribute is enabled.

fhahn updated this revision to Diff 545382.Jul 29 2023, 1:01 PM
fhahn marked an inline comment as done.

Rebase, updated tests and ping :)

llvm/lib/Transforms/IPO/Attributor.cpp
3278–3281

Any more thoughts on the above?

jdoerfert accepted this revision.Jul 30 2023, 3:27 PM

LG, this is not going to impact anyone but makes it easier to make progress.

This revision is now accepted and ready to land.Jul 30 2023, 3:27 PM

maybe update the statistic numbers though.

fhahn retitled this revision from [Attributor] Add lightweight version for attribute deduction only. (WIP) to [Attributor] Add lightweight version for attribute deduction only.Aug 5 2023, 2:57 AM
fhahn edited the summary of this revision. (Show Details)
fhahn updated this revision to Diff 547470.Aug 5 2023, 2:57 AM

Rebase before landing, thanks!

This revision was landed with ongoing or failed builds.Aug 5 2023, 3:52 AM
This revision was automatically updated to reflect the committed changes.
ebevhan added a subscriber: ebevhan.Aug 9 2023, 3:53 AM

Hi! We're seeing miscompiles downstream when running attributor-light in pipelines. In the below reproducer, DSE is eliminating a store that isn't dead.

define dso_local ptr @fun(ptr noundef %p) {
entry:
  %p.addr = alloca ptr, align 1
  store ptr %p, ptr %p.addr, align 1
  %0 = load ptr, ptr %p.addr, align 1
  ret ptr %0
}

define dso_local i16 @test_it() {
entry:
  %x = alloca i16, align 1
  %p = alloca ptr, align 1
  call void @llvm.lifetime.start.p0(i64 2, ptr %x)
  store i16 42, ptr %x, align 1
  call void @llvm.lifetime.start.p0(i64 2, ptr %p)
  %call = call ptr @fun(ptr noundef %x)
  store ptr %call, ptr %p, align 1
  %0 = load ptr, ptr %p, align 1
  %1 = load i16, ptr %0, align 1
  %cmp = icmp eq i16 %1, 42
  %conv = zext i1 %cmp to i16
  call void @llvm.lifetime.end.p0(i64 2, ptr %p)
  call void @llvm.lifetime.end.p0(i64 2, ptr %x)
  ret i16 %conv
}

declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture)

declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture)

Running this with opt -passes='attributor-light,function(dse)' rep.ll causes the store i16 42 to be removed.

This might not be an issue with attributor-light per se, but we just thought we'd report it here since the pass was added recently.

Ka-Ka added a subscriber: Ka-Ka.Aug 9 2023, 5:39 AM
bjope added a subscriber: bjope.Aug 9 2023, 8:53 AM