This is an archive of the discontinued LLVM Phabricator instance.

Support intrinsic overloading on unnamed types
ClosedPublic

Authored by jeroen.dobbelaere on Nov 11 2020, 4:20 AM.

Details

Summary

This patch adds support for intrinsic overloading on unnamed types.

This fixes PR38117 and PR48340 and will also be needed for the Full Restrict Patches (D68484).

The main problem is that the intrinsic overloading name mangling is using 's_s' for unnamed types.
This can result in identical intrinsic mangled names for different function prototypes.

This patch changes this by adding a '.XXXXX' to the intrinsic mangled name when at least one of the types is based on an unnamed type, ensuring that we get a unique name.

Implementation details:

  • The mapping is created on demand and kept in Module.
  • It also checks for existing clashes and recycles potentially existing prototypes and declarations.
  • Because of extra data in Module, Intrinsic::getName needs an extra Module* argument and, for speed, an optional FunctionType* argument.
  • I still kept the original two-argument 'Intrinsic::getName' around which keeps the original behavior (providing the base name).
    • Main reason is that I did not want to change the LLVMIntrinsicGetName version, as I don't know how acceptable such a change is
    • The current situation already has a limitation. So that should not get worse with this patch.
  • Intrinsic::getDeclaration and the verifier are now using the new version.

Other notes:

  • As far as I see, this should not suffer from stability issues. The count is only added for prototypes depending on at least one anonymous struct
  • The initial count starts from 0 for each intrinsic mangled name.
  • In case of name clashes, existing prototypes are remembered and reused when that makes sense.

Diff Detail

Event Timeline

jeroen.dobbelaere requested review of this revision.Nov 11 2020, 4:20 AM
fhahn added a comment.Nov 11 2020, 4:24 AM

Is this the same idea as in D48541 (linked from PR38117), where we use LLVMContext to keep track of IDs for anonymous types? If that's the case, the same objections as for D48541 probably apply here as well. I think one of the problems is that the names we assign depend on the order different modules are loaded.

Is this the same idea as in D48541 (linked from PR38117), where we use LLVMContext to keep track of IDs for anonymous types? If that's the case, the same objections as for D48541 probably apply here as well. I think one of the problems is that the names we assign depend on the order different modules are loaded.

Somehow I missed that one before.. But yes, the current implementation will suffer the same problem. Moving the DenseMap from LLVMContext to Module should resolve that problem ?

jeroen.dobbelaere edited the summary of this revision. (Show Details)

This new version uses a different approach: we now work by 'uniquifying' the initial BaseName of an intrinsic mangled named when it is based on an anonymous struct.

As an example, the testcase from D68484#2372356 now results in:

%0 = type { i32 }
%1 = type { i32 }

define void @bar01(%0* %ptr0, %1* %ptr1) {
  %1 = call i8* @llvm.noalias.decl.p0i8.p0p0s_s.i64.0(%0** null, i64 0, metadata !0)
  %2 = call %0* @llvm.noalias.p0s_s.p0i8.p0p0s_s.i64.0(%0* %ptr0, i8* %1, %0** null, i64 0, metadata !0)
  store %0 zeroinitializer, %0* %2, align 4, !noalias !0
  %3 = call i8* @llvm.noalias.decl.p0i8.p0p0s_s.i64.1(%1** null, i64 0, metadata !3)
  %4 = call %1* @llvm.noalias.p0s_s.p0i8.p0p0s_s.i64.1(%1* %ptr1, i8* %3, %1** null, i64 0, metadata !3)
  store %1 zeroinitializer, %1* %4, align 4, !noalias !3
  ret void
}

; Function Attrs: argmemonly nounwind
declare i8* @llvm.noalias.decl.p0i8.p0p0s_s.i64.0(%0**, i64, metadata) #0

; Function Attrs: argmemonly nounwind speculatable
declare %0* @llvm.noalias.p0s_s.p0i8.p0p0s_s.i64.0(%0*, i8*, %0**, i64, metadata) #1

; Function Attrs: argmemonly nounwind
declare i8* @llvm.noalias.decl.p0i8.p0p0s_s.i64.1(%1**, i64, metadata) #0

; Function Attrs: argmemonly nounwind speculatable
declare %1* @llvm.noalias.p0s_s.p0i8.p0p0s_s.i64.1(%1*, i8*, %1**, i64, metadata) #1
jeroen.dobbelaere retitled this revision from Support intrinsic overloading on anonymous struct based types to Support intrinsic overloading on unnamed types.
jeroen.dobbelaere edited the summary of this revision. (Show Details)
jeroen.dobbelaere added reviewers: fhahn, chandlerc.

I would like to propose this version for review. As far as I am aware, this version should not suffer from the objections made in D48541. Also a testcase has been included.

jeroen.dobbelaere edited the summary of this revision. (Show Details)Nov 13 2020, 6:09 AM

Can't really comment on whether this makes sense approach wise. For testing, I'd suggest to test that this is correctly handled in bitcode serialization/loading as well, and to drop the current hack in PredicateInfo and see if it works fine for that use case.

... For testing, I'd suggest to test that this is correctly handled in bitcode serialization/loading as well...

The test should be doing that already by assembling and passing the bitcode to llvm-dis ?

fhahn added a comment.Nov 18 2020, 2:30 AM

I think we would also need tests for combining multiple modules with intrinsics with unnamed type.

If the module owns the numbering, can encounter name/type clashes, e.g. if 2 modules have unnamed types that are different. i.e. both have declare %0* @llvm.ssa.copy.p0s_s.0(%0* returned) ,but %0 is defined as different types in both modules ?

I think we would also need tests for combining multiple modules with intrinsics with unnamed type.

If the module owns the numbering, can encounter name/type clashes, e.g. if 2 modules have unnamed types that are different. i.e. both have declare %0* @llvm.ssa.copy.p0s_s.0(%0* returned) ,but %0 is defined as different types in both modules ?

This is indeed a good suggestion. I am adding a link test (and it does bring up a case that I did not foresee, so I am also fixing it).

I improved the patch to actually support linking.

Moved the call to 'forceRenaming' closer to the original location.

MSxDOS added a subscriber: MSxDOS.Dec 4 2020, 2:22 PM

The approach looks reasonable. Please ensure this this documented in LangRef.

Updated langref. Rebased.

fhahn added inline comments.Jan 8 2021, 7:14 AM
llvm/docs/LangRef.rst
20074

submit the llvm.ssa.copy fixes separately?

llvm/include/llvm/IR/Module.h
202

nit: period at end.

204

Where those the name come in here? It maps intrinsic ids to other ids?

llvm/lib/IR/Function.cpp
809

can we assert that there are no unnamed types involved here?

I am doing a testrun with the assert. Based on that outcome I'll update the patches.

llvm/docs/LangRef.rst
20074

ok. I can split it out.

llvm/include/llvm/IR/Module.h
202

nit: period at end.

What is the preference ? the comments before are also mixed wrt to the period.

204

Where those the name come in here? It maps intrinsic ids to other ids?

This map, based on the intrinsic ID and the FunctionType, allow us to remember what extension number has been associated to that combination. The uniqued name is based on the intrinsic ID, the FunctionType and that extension number.

llvm/lib/IR/Function.cpp
809

Should be possible. I would add the assert at line 796-797.

Taking comments into account:

  • doc update
  • added assertion

Hi @fhahn, @efriedma is there anything else that should be modified or is this ok to go ?

nikic added a comment.Mar 6 2021, 8:55 AM

This patch also fixes https://bugs.llvm.org/show_bug.cgi?id=48340, you might want to add a test case for that.

jeroen.dobbelaere edited the summary of this revision. (Show Details)
  • added reference to and testcase for PR48340
  • rebased
fhahn added a comment.Mar 10 2021, 2:49 AM

Thanks for the updates. I think this basically looks good now, just another round of smallish comments.

llvm/docs/LangRef.rst
11402

nit: Perhaps break up the sentence and mention that it allows differentiating intrinsics that have different unnamed types as arguments?

llvm/include/llvm/IR/Intrinsics.h
66

nit: Perhaps it's simpler to just say something like a function type \p FT can be provided to avoid computing it..

69

nit: llvm:: should not be needed.

llvm/lib/IR/Function.cpp
722

nit: I think it would be more explicit to return a pair with a bool, rather than having this out parameter. But. I guess the implementation would get a bit messier in the function.

799

nit: checking explicitly for == nullptr is very uncommon in LLVM, can you just check for FT/!FT?

llvm/lib/IR/Module.cpp
478

nit: no need to capture everything, it should be enough to capture BaseName?

Also, please fix the tidy warning.

502

nit: uncommon in LLVM to check == nullptr.

llvm/test/Transforms/LoopVectorize/pr48340.ll
3 ↗(On Diff #329310)

not needed.

4 ↗(On Diff #329310)

not needed.

6 ↗(On Diff #329310)

Is this required to the test? If so, please move in the X86 subdirectory.

13 ↗(On Diff #329310)

You could also check for the vector.body: block.

20 ↗(On Diff #329310)

is it possible to get rid for the undefs in the tests? (and also have named values for the basic blocks at least?

53 ↗(On Diff #329310)

is this needed?

55 ↗(On Diff #329310)

is this needed?

Changed according to comments.

jeroen.dobbelaere marked 5 inline comments as done.Mar 10 2021, 1:34 PM

The pr48340.ll test was also moved to the X86 directory.

fhahn accepted this revision.Mar 14 2021, 3:06 PM

LGTM, thanks! This patch should address the main issue earlier patches had, which were not handling name clashes on combining modules properly. Please wait a couple of days with committing, in cases there are any other comments.

llvm/docs/LangRef.rst
11405

nit: in the module?

11407

nit: changed how? new numbers will get assigned, right?

llvm/include/llvm/IR/Module.h
202

My preference is . at the end, but I am not sure if there's any rule.

204

Sorry, I meant the Name part in the name of the variable UniquedIntrinsicNames. The variable does not contain the full names, just the ids for intrinsic::id, type combinations. Not sure if there's a better name.

llvm/lib/IR/Function.cpp
809

Sounds good.

llvm/test/Transforms/LoopVectorize/X86/pr48340.ll
1 ↗(On Diff #329756)

can you add -force-vector-width=4 -force-vector-interleave=1 or some similar settings? That way the test should be more robust with respect to cost-model changes.

55 ↗(On Diff #329756)

nit: stray newline.

This revision is now accepted and ready to land.Mar 14 2021, 3:06 PM

Thank you for all the help with the review !

Rebased. Modified langref and test according to comments.

This revision was landed with ongoing or failed builds.Mar 19 2021, 6:35 AM
This revision was automatically updated to reflect the committed changes.

It looks like this causes assertion failures when dis-assembling some bitcode https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288

It looks like this causes assertion failures when dis-assembling some bitcode https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288

Do you have a access to a full backtrace ? I am not able to access the 'detailed report' and it will probably be hard for me to find out on how the
set up a docker image on the corporate system.'

One possible problem could be when LLVMIntrinsicCopyOverloadedName is used with an unnamed type. Not sure what guarantees we can give there.

fhahn added a comment.Mar 20 2021, 2:32 PM

It looks like this causes assertion failures when dis-assembling some bitcode https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288

Do you have a access to a full backtrace ? I am not able to access the 'detailed report' and it will probably be hard for me to find out on how the
set up a docker image on the corporate system.'

One possible problem could be when LLVMIntrinsicCopyOverloadedName is used with an unnamed type. Not sure what guarantees we can give there.

You should be able to download the reproducer. Running LLVM-dis should crash. https://oss-fuzz.com/download?testcase_id=5766959267446784

You should be able to download the reproducer. Running LLVM-dis should crash. https://oss-fuzz.com/download?testcase_id=5766959267446784

Ok. I am able to reproduce it now. It seems llvm.lifetime.end can also depend on unnamed types. I'll go over the various cases in AutoUpgrade.cpp and try to come up with a fix tomorrow.

Thanks for the notification !