This is an archive of the discontinued LLVM Phabricator instance.

Verify the LLVMContext for Attributes.
ClosedPublic

Authored by nickwasmer on Mar 25 2021, 11:34 AM.

Details

Summary

Attributes don't know their parent Context. Instead of using extra memory to store that, we have the verifier query the Attribute with a prospective parent LLVMContext and the Attribute checks whether its contents exists in that context's FoldingSet.

There's an oddity I noticed while working on this patch. There are methods to create attributes with types which also take a context. If I check for nullptr Type* in attributes in the verifier, there are none in a run of the llvm testsuite. However, the C API does create these things in LLVMCreateEnumAttribute. I'm not sure what the intended state is here, but I'm thinking that the Attribute::get* functions that take Context+Type should either remove the Context or assert that the Type's Context matches the supplied Context when the Type is not null.

Diff Detail

Event Timeline

nickwasmer created this revision.Mar 25 2021, 11:34 AM
nickwasmer requested review of this revision.Mar 25 2021, 11:34 AM

Apply clang-format suggestions, mostly just silence clang-tidy suggestions.

nickwasmer edited the summary of this revision. (Show Details)Apr 2 2021, 9:07 AM

Ping!

This patch includes a change to FoldingSet. Does it make sense for FoldingSet to have a Contains() method? Was it previously missing because we didn't need it yet, or was it missing because not having it makes for a better API? Should that be broken out into its own patch or

This patch adds "hasParentContext" which is a new way of querying the context. Is this an acceptable way API to add? Do we need to do this unusual API or is there a way we could have added a normal "getParent" method to these instead?

Is verifying the contexts of attributes worth adding the above methods? FWIW, I think that a front-end could have a bug where they generate IR using one LLVMContext per source function, but build the attributes for their builtin functions on the first LLVMContext and cache it. The Verifier won't catch the cross-LLVMContext usage before this patch.

Other than it not being "const" is there any reason you couldn't use FindNodeOrInsertPos to implement contains?

nickwasmer updated this revision to Diff 335031.Apr 2 2021, 2:48 PM
nickwasmer edited the summary of this revision. (Show Details)
nickwasmer edited the summary of this revision. (Show Details)

Other than it not being "const" is there any reason you couldn't use FindNodeOrInsertPos to implement contains?

That works, I don't recall why I thought it didn't. With that, I can remove the changes to FoldingSet entirely. Thank you!

Ping!

Is hasParentContext(LLVMContext &C) an acceptable name for what it does? Is there a better one?

Is there a better way to do this? Is it worth adding an API that is different from the common getContext() just to save the pointer and add the capability to the verifier?

dexonsmith accepted this revision.Apr 15 2021, 4:38 PM

LGTM, assuming @craig.topper is also happy.

This revision is now accepted and ready to land.Apr 15 2021, 4:38 PM
This revision was automatically updated to reflect the committed changes.
nikic added a subscriber: nikic.Apr 16 2021, 2:46 PM

This change has a noticable compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=8f683366afcfd7c03faea615d1529a0014d7dbde&to=244d9d6e41db71e76eeb55e56d84f658b3f56681&stat=instructions

Only LTO is affected, I believe because release builds only run the verifier when importing LTO bitcode. Link time for tramp3d-v4 increases by 1%, which seems rather excessive for a verifier addition.

Something to keep in mind is that FoldingSet lookups are quite expensive, and you're doing a lot of them here, because you will be re-checking the same attributes/sets/lists across many functions.

This change has a noticable compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=8f683366afcfd7c03faea615d1529a0014d7dbde&to=244d9d6e41db71e76eeb55e56d84f658b3f56681&stat=instructions

Only LTO is affected, I believe because release builds only run the verifier when importing LTO bitcode. Link time for tramp3d-v4 increases by 1%, which seems rather excessive for a verifier addition.

Agreed, that seems too high!

Something to keep in mind is that FoldingSet lookups are quite expensive, and you're doing a lot of them here, because you will be re-checking the same attributes/sets/lists across many functions.

I left a comment inline; I think there's a way to reduce the number of lookups.

llvm/lib/IR/Verifier.cpp
1863–1866

Can the verifier remember in a DenseSet if it has already checked this AttrSet / Attr / etc.?

nickwasmer added inline comments.Apr 18 2021, 9:15 PM
llvm/lib/IR/Verifier.cpp
1863–1866

Sorta. Note that these types are wrapper types that are passed by copy so we can't just address-compare them. Both AttributeList and Attribute have a way to get a void pointer that uniquely identifies the underlying object, but AttributeSet does not (though we can cheat and ask an AttributeSet to Profile itself in which case we get the hash of the pointer).

I think that we can just cache the AttributeList in a SmallPtrSet. The number of unique attribute lists is a little more than the number of "attributes #? = { ..." at the end of the ll file which is usually pretty small.

The compile time regression fix is up for review at https://reviews.llvm.org/D100738 .