This is an archive of the discontinued LLVM Phabricator instance.

[FunctionAttrs] Propagate some func/arg/ret attributes from caller to callsite (WIP)
Needs ReviewPublic

Authored by goldstein.w.n on Jun 5 2023, 9:55 PM.

Details

Summary

This is the consolidation of D151644 and D151943 moved from
InstCombine to FunctionAttrs. This is based on discussion in the above
patches as well as D152081 (Attributor). This patch was written in a
way so it can have an immediate impact in currently active passes
(FunctionAttrs), but should be easy to port elsewhere (Attributor or
Inliner) if that makes more sense later on.

Some function attributes imply the attribute for all/some instructions
in the function. These attributes can be safely propagated to
callsites within the function that are missing the attribute. This can
be useful when 1) analyzing individual instructions in a function
and 2) if the original caller is later inlined, as if the attributes are
not propagated, they will be lost.

This patch implements propagation in a new class/file
InferCallsiteAttrs which can hypothetically be included elsewhere.

At the moment this patch infers the following:

Function Attributes:

  • mustprogress
  • nofree
  • willreturn
  • All memory attributes (readnone, readonly, writeonly, argmem, etc...)
    • The memory attributes are only propagated IFF the set of pointers available to the callsite is the same as the set available outside the caller (i.e no local memory arguments from alloca or local malloc like functions).

Argument Attributes:

  • noundef
  • nonnull
  • nofree
  • readnone
  • readonly
  • writeonly
  • nocapture
    • nocapture is only propagated IFF the set of pointers available to the callsite is the same as the set available outside the caller and its guranteed that between the callsite and function return, the state of any capture pointers will not change (so the nocaptured gurantee of the caller has been met by the instruction preceding the callsite and will not changed).

Argument are only propagated to callsite arguments that are also function
arguments, but not derived values.

Return Attributes:

  • noundef
  • nonnull

Return attributes are only propagated if the callsite's return value
is used as the caller's return and execution is guranteed to pass from
callsite to return.

The compile time hit of this for -O3 and -O3+thinLTO is ~[.02, .37]%
regression. Proper LTO, however, has more significant regressions (up
to 3.92%):
https://llvm-compile-time-tracker.com/compare.php?from=94407e1bba9807193afde61c56b6125c0fc0b1d1&to=79feb6e78b818e33ec69abdc58c5f713d691554f&stat=instructions:u

Diff Detail

Event Timeline

goldstein.w.n created this revision.Jun 5 2023, 9:55 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 5 2023, 9:55 PM
goldstein.w.n requested review of this revision.Jun 5 2023, 9:55 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
goldstein.w.n added a comment.EditedJun 5 2023, 9:59 PM

llvm-test-suite passes with this patch enabled.
But note: There currently some failing clang tests.
Most of them look benign (just need to propagate a few attributes to a few thousand places),
but I was hoping there is a way to auto-regenerate them before going through laboriously by hand.
All the ones that can be auto regenerated I updated + a few by hand before thinking better of it.
Currently the list of failing tests in the llvm-project with all projects enabled using ninja check-all is:

Failed Tests (52):
  Clang :: CodeGen/SystemZ/builtins-systemz-zvector-constrained.c
  Clang :: CodeGen/SystemZ/builtins-systemz-zvector.c
  Clang :: CodeGen/SystemZ/builtins-systemz-zvector2-constrained.c
  Clang :: CodeGen/SystemZ/builtins-systemz-zvector2.c
  Clang :: CodeGen/SystemZ/builtins-systemz-zvector3-constrained.c
  Clang :: CodeGen/SystemZ/builtins-systemz-zvector3.c
  Clang :: CodeGen/SystemZ/systemz-inline-asm.c
  Clang :: CodeGen/aarch64-neon-vcmla.c
  Clang :: CodeGen/aarch64-sme-intrinsics/acle_sme_ld1.c
  Clang :: CodeGen/aarch64-sme-intrinsics/acle_sme_ld1_vnum.c
  Clang :: CodeGen/aarch64-sme-intrinsics/acle_sme_st1.c
  Clang :: CodeGen/aarch64-sme-intrinsics/acle_sme_st1_vnum.c
  Clang :: CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c
  Clang :: CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.cpp
  Clang :: CodeGen/arm-cmse.c
  Clang :: CodeGen/arm64-mte.c
  Clang :: CodeGen/builtins-multiprecision.c
  Clang :: CodeGen/builtins-wasm.c
  Clang :: CodeGen/cfi-icall-cross-dso.c
  Clang :: CodeGen/fp-contract-on-pragma.cpp
  Clang :: CodeGen/fp-contract-pragma.cpp
  Clang :: CodeGen/fp-strictfp-exp.cpp
  Clang :: CodeGen/inline-asm-aarch64-flag-output.c
  Clang :: CodeGen/inline-asm-x86-flag-output.c
  Clang :: CodeGen/ms-intrinsics-other.c
  Clang :: CodeGen/ms-intrinsics.c
  Clang :: CodeGen/neon-crypto.c
  Clang :: CodeGenCXX/RelativeVTablesABI/dynamic-cast.cpp
  Clang :: CodeGenCXX/microsoft-abi-dynamic-cast.cpp
  Clang :: CodeGenCXX/microsoft-abi-typeid.cpp
  Clang :: CodeGenCXX/sizeof-unwind-exception.cpp
  Clang :: CodeGenObjC/synchronized.m
  Clang :: CodeGenObjCXX/exceptions-legacy.mm
  Clang :: CodeGenOpenCL/builtins-amdgcn-dl-insts-gfx11.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-dl-insts.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-fp8.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-gfx10.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-gfx11.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-gfx9.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-mfma.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-vi.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl
  Clang :: CodeGenOpenCL/builtins-amdgcn.cl
  Clang :: CodeGenOpenCL/builtins-f16.cl
  Clang :: CodeGenOpenCL/builtins-generic-amdgcn.cl
  Clang :: CodeGenOpenCL/single-precision-constant.cl
  Clang :: Headers/__clang_hip_cmath.hip
  Clang :: Headers/__clang_hip_math.hip
  Clang :: Headers/__clang_hip_math_ocml_rounded_ops.hip
  Clang :: Headers/ms-arm64-intrin.cpp
  Clang :: OpenMP/bug57757.cpp

Is there a way to autoregen them? Some of them need a few thousand changes (just adding noundef for the most part) that hopefully doesn't need to be done by hand.

sivachandra added inline comments.
libc/CMakeLists.txt
22 ↗(On Diff #528681)

Likely out of date and not required for this patch?

goldstein.w.n edited the summary of this revision. (Show Details)Jun 5 2023, 11:49 PM
goldstein.w.n marked an inline comment as done.

Remove libc changes

goldstein.w.n added inline comments.Jun 6 2023, 2:13 PM
libc/CMakeLists.txt
22 ↗(On Diff #528681)

woops, left that in from the bug earlier.
Fixed.

arsenm added inline comments.Jun 6 2023, 2:20 PM
llvm/include/llvm/Transforms/Utils/InferCallsiteAttrs.h
2

Missing license header

11–13

Why not a regular enum?

goldstein.w.n marked 2 inline comments as done.Jun 7 2023, 12:32 AM

Add header comments/license.
Use enum for kMaybe, kYes, kNo

Typo guranteed in message

llvm/include/llvm/Transforms/Utils/InferCallsiteAttrs.h
2

Missing C++ mode comment

29

Does changing this to something meaningful change the compile time results?

64

Without looking at the rest of the context I would assume CxtB is always valid if the class is valid and all the null checks should be dropped

llvm/lib/Transforms/IPO/FunctionAttrs.cpp
72

Random whitespace change

1761

Braces

llvm/lib/Transforms/Utils/InferCallsiteAttrs.cpp
34

Use ArrayRef?

49

Unk?

54

Don't mention malloc and just say noalias call?

344
346
358
386
671
arsenm added inline comments.Jun 12 2023, 7:01 PM
llvm/lib/Transforms/Utils/InferCallsiteAttrs.cpp
693

Not very C++y, use the default constructor?

This revision was not accepted when it landed; it landed in state Needs Review.Jun 12 2023, 10:51 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
goldstein.w.n marked an inline comment as done.
goldstein.w.n reopened this revision.Jun 12 2023, 10:56 PM
goldstein.w.n marked 8 inline comments as done.Jun 13 2023, 12:37 AM
goldstein.w.n added inline comments.
llvm/include/llvm/Transforms/Utils/InferCallsiteAttrs.h
29
64

Not quite, so CxtB is used to specify propagation of function attributes from a specific callsite. This may be useful in inline. i.e if we have:

foo -> bar -> baz
and
fiz -> bar -> buz

If bar in foo is getting inlined, we can propagate not only the function attributes of bar, but also the callsite attributes of that specific call to bar to any callsites (so baz).

At the moment its always null in fact.

llvm/lib/Transforms/Utils/InferCallsiteAttrs.cpp
344

I think callsite is regularly refererred to as singular word (in other files too) so prefer to keep as is. That okay?

358

did 'pad' -> 'padd' but prefer to keep callsite as one word

693

If PreserveCache is true, then we can redo same function (non-sequentially) and get saved analysis. If you think this is meaningless and we will always do one-off functions I can change to just construct with a caller (although I think for inlining this might come up).

nikic added inline comments.Jun 13 2023, 1:03 AM
llvm/lib/Transforms/Utils/InferCallsiteAttrs.cpp
598

For these you generally want to query getMemoryEffects() on the caller/callee once and then work on that representation, instead of doing separate queries for everything. This may allow more precise handling and likely reduces the compile-time impact, as doing these attribute lookups is quite expensive.

FWIW, I wish the time that went into this would have been to improve and enable the (lightweight) Attributor pass.
Anyway, I'm still doing that, and the latest numbers didn't look that bad. I will start to upstream stuff that I found.
At this point it might be good to understand if the (lightweight) Attributor would solve your problems or what would be left.

FWIW, I wish the time that went into this would have been to improve and enable the (lightweight) Attributor pass.
Anyway, I'm still doing that, and the latest numbers didn't look that bad. I will start to upstream stuff that I found.
At this point it might be good to understand if the (lightweight) Attributor would solve your problems or what would be left.

With the way the InferCallsiteAttrs is written, it should be equally usable in attributor as it is in FunctionAttrs. Guess my
feeling was want to get this in s.t its actually usable. Once LW attributor becomes usable should be easy to apply InferCallsiteAttrs
there.

FWIW, I wish the time that went into this would have been to improve and enable the (lightweight) Attributor pass.
Anyway, I'm still doing that, and the latest numbers didn't look that bad. I will start to upstream stuff that I found.
At this point it might be good to understand if the (lightweight) Attributor would solve your problems or what would be left.

With the way the InferCallsiteAttrs is written, it should be equally usable in attributor as it is in FunctionAttrs. Guess my
feeling was want to get this in s.t its actually usable. Once LW attributor becomes usable should be easy to apply InferCallsiteAttrs
there.

Guess I'm assuming this will be usable in the Attributor. That is the goal. If its not the case am happy to refactor so that it is.

Remove unused check count. Use memory effects directly instead of
function/callbase attribute API

goldstein.w.n marked an inline comment as done.Jun 14 2023, 2:07 AM
goldstein.w.n added inline comments.
llvm/lib/Transforms/Utils/InferCallsiteAttrs.cpp
598

Re-running compile time checks, will post revised numbers tomorrow.

goldstein.w.n added inline comments.Jun 14 2023, 2:11 AM
llvm/lib/Transforms/IPO/FunctionAttrs.cpp
1764

Do we actually need to add the function to the changed list here? We aren't modifying the functions attributes, only callsites within the function.

goldstein.w.n added inline comments.Jun 14 2023, 11:46 AM
llvm/lib/Transforms/Utils/InferCallsiteAttrs.cpp
598

No real impact on compile time.