This is an archive of the discontinued LLVM Phabricator instance.

[StackSafety] Add info into function summary
ClosedPublic

Authored by vitalybuka on Jun 1 2020, 12:01 AM.

Details

Summary

This patch adds optional field into function summary,
implements asm and bitcode serialization. YAML
serialization is omitted and can be added later if
needed.

This patch includes this information into summary only
if module contains at least one sanitize_memtag function.
In a near future MTE is the user of the analysis.
Later if needed we can provede more direct control
on when information is included into summary.

Diff Detail

Event Timeline

vitalybuka created this revision.Jun 1 2020, 12:01 AM
vitalybuka marked an inline comment as done.Jun 1 2020, 1:27 PM
vitalybuka added inline comments.
llvm/include/llvm/Analysis/StackSafetyAnalysis.h
48

do you have preference Params vs Args?

Missing roundtrip textual and binary IR encoding test. And I think it won't pass because you are reading getUpper and writing getSignedMax (to textual) and those seem off-by-one.

llvm/docs/LangRef.rst
6828

"known access range"

llvm/include/llvm/Analysis/StackSafetyAnalysis.h
48

Parameters. Arguments are concrete values passed to a function call.

llvm/include/llvm/Bitcode/LLVMBitCodes.h
297

Should we call it something like ParamAccessRange here and everywhere? W/o stack safety in the name.

llvm/include/llvm/IR/ModuleSummaryIndex.h
556

This can be safely reduced to 32 bits or even less in practice, if that would help with the binary size.

llvm/lib/Analysis/StackSafetyAnalysis.cpp
696

Why not Arg.Calls.emplace_back(C.ParamNo, C.Callee->getGUID(), C.Offset) ?

vitalybuka updated this revision to Diff 268052.Jun 2 2020, 9:41 PM
vitalybuka marked 7 inline comments as done.

update

Missing roundtrip textual and binary IR encoding test. And I think it won't pass because you are reading getUpper and writing getSignedMax (to textual) and those seem off-by-one.

it's thinlto-function-summary-paramaccess.ll:18 and it works.
Asm read/write uses getSignedMax. I think for [A,B] it's more intuitive to read this as inclusive range
binary read/write uses getUpper

llvm/include/llvm/Bitcode/LLVMBitCodes.h
297

if you don't mind FS_PARAM_ACCESS as range just a current type, e.g in future it can be something more different.

llvm/include/llvm/IR/ModuleSummaryIndex.h
556

Writer/Parser push them as uint64 and smart encoded store them efficiently
on StackSafety side i plan to cut-off large values, e.g. 2^32 but it's going to be a separate patch

eugenis accepted this revision.Jun 3 2020, 2:55 PM

This seems find to me.
Peter, any concerns?

This revision is now accepted and ready to land.Jun 3 2020, 2:55 PM
This revision was automatically updated to reflect the committed changes.

I completely missed this patch (my filter didn't pick it up since it doesn't mention LTO). I have some post-commit comments on it.

Is there another patch somewhere that shows how this info is used?

llvm/docs/LangRef.rst
6828

Logically, what is "offset"? is this the byte offset from the pointer? What does it mean if this info doesn't exist - I assume that means unknown and therefore a need to be conservative?

6839

Can you show a small example? I assume the offset of each Callee in the list must be a subset of the offset of the containing Param?

llvm/include/llvm/IR/ModuleSummaryIndex.h
556

Byte offsets I assume?

556

If these will be max 32 bits can you use that here? Even though they will be compressed on serialization, I assume it would affect in-memory size.

567

Document constant parameter (here and in declaration for Use below)

625

Since this will only be used under stack safety (which I assume is optional and uncommon?), might this be better as a unique_ptr like we do for type id info declared just above? The overhead of an empty vector is going to be higher (3x on my system), which will add up in the combined summary.

llvm/lib/Analysis/StackSafetyAnalysis.cpp
672

I realize this is only for debug compilers, but it seems like an expensive assert - it walks all functions in the module, and we will call this function/assert once per function.

684

Please add comments to this code to describe logically what/why it is doing. I.e. why does it not add any param info if any offset "isFullSet".

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
6286

Will the use of this summary info be able to correctly distinguish from an old bitcode file (created before this support was added) from one created after that simply doesn't have this summary info for some reason? Does it need to for correct operation?

llvm/lib/IR/ModuleSummaryIndex.cpp
38

Is this needed? I don't see a use here.

vitalybuka marked 12 inline comments as done.Jul 1 2020, 3:38 AM

Thanks for feedback. Docs are here https://reviews.llvm.org/D82941
I'll create a new patch for unique_ptr<> change

llvm/include/llvm/IR/ModuleSummaryIndex.h
556

If you don't mind I'll fix "max 32" later, with other algorithm optimizations.

625

I'll create a separate patch for this tomorrow.

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
6286

Missing info is not a problem. Users of StackSafety just will not be able to optimize out safety checks.

Thanks for feedback. Docs are here https://reviews.llvm.org/D82941

Saw that, thanks.

I'll create a new patch for unique_ptr<> change

Sounds good.

llvm/include/llvm/IR/ModuleSummaryIndex.h
556

Sure, it doesn't affect the common case where this is empty, so doing as a follow on sounds fine.

625

Thanks!