Page MenuHomePhabricator

Implement intrinsic mangling for literal struct types. Fixes PR 31921
ClosedPublic

Authored by dberlin on Feb 13 2017, 9:44 PM.

Details

Summary

Predicateinfo requires an ugly workaround to try to avoid literal
struct types due to the intrinsic mangling not being implemented.
This workaround actually does not work in all cases (you can hit the
assert by bootstrapping with -print-predicateinfo), and can't be made
to work without DFS'ing the type (IE copying getMangledStr and using a
version that detects if it would crash).

Rather than do that, i just implemented the mangling. It seems
simple, since they are unified structurally.

Looking at the overloaded-mangling testcase we have, it actually turns
out the gc intrinsics will *also* crash if you try to use a literal
struct. Thus, the testcase added fails before this patch, and works
after, without needing to resort to predicateinfo.

Diff Detail

Repository
rL LLVM

Event Timeline

dberlin created this revision.Feb 13 2017, 9:44 PM
chandlerc added inline comments.Feb 13 2017, 10:05 PM
lib/IR/Function.cpp
538 ↗(On Diff #88303)

Uh.... Am I the only one horrified that we didn't prefix this with anything?

Won't this cause types named 's...' to potentially collide with literal types? Yeah, pretty sure it will. It's not unique to literal types.

Let's take a look at:

%i32 = type { i32 }

define %i32* @test_struct(%i32* %v) gc "statepoint-example" {
entry:
  %tok = call token (i64, i32, i1 ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_i1f(i64 0, i32 0, i1 ()* @return_i1, i32 0, i32 0, i32 0, i32 0, %i32* %v)
  %v-new = call %i32* @llvm.experimental.gc.relocate.p0i32(token %tok,  i32 7, i32 7)
  ret %i32* %v-new
}

declare zeroext i1 @return_i1()
declare token @llvm.experimental.gc.statepoint.p0f_i1f(i64, i32, i1 ()*, i32, i32, ...)
declare %i32* @llvm.experimental.gc.relocate.p0i32(token, i32, i32)
;declare i32* @llvm.experimental.gc.relocate.p0i32(token, i32, i32)

Also, I think we need to ensure nested types are distinguishable here much like below with function types.

dberlin added inline comments.Feb 13 2017, 10:16 PM
lib/IR/Function.cpp
538 ↗(On Diff #88303)

I just kinda assumed people knew what they were doing in the existing code :)
In any case, updated to prefix structs with s_ in the struct case, and sl_ in the literal case.
We end structs with "s" to be distinguishable.

dberlin marked an inline comment as done.
  • Update for review comments

Updating D29925: Implement intrinsic mangling for literal struct types.

Fixes PR 31921

majnemer added inline comments.
lib/IR/Function.cpp
542–543 ↗(On Diff #88308)

Range based loop over STyp->elements()?

  • Move to range based for-loop

Updating D29925: Implement intrinsic mangling for literal struct types.

Fixes PR 31921

dberlin marked an inline comment as done.Feb 14 2017, 12:00 PM
dberlin marked an inline comment as done.Feb 14 2017, 12:59 PM

Since this is going to break every file that uses the old struct mangling, i am splitting it to leave the structs alone for now, because i don't have time to try to figure out an auto-upgrade strategy.

  • Auto-upgrade by regenerating names when necesseary. Note: we can't really detect if the only change is the name, but it seems like this is the best we can do.

Updating D29925: Implement intrinsic mangling for literal struct types.

Fixes PR 31921

dberlin added inline comments.Feb 14 2017, 2:53 PM
test/CodeGen/Generic/overloaded-intrinsic-name.ll
73 ↗(On Diff #88452)

Note that these will assert if they don't end up with the right name, so CHECK lines are not strictly necessary.

chandlerc edited edge metadata.Feb 14 2017, 2:57 PM

This at least seems like a step in the right direction, so LGTM. Maybe touch base with @majnemer before landing though in case he sees something I don't.

lib/IR/AutoUpgrade.cpp
492–493 ↗(On Diff #88452)

Scan the types to see if any are pointers to structs before doing this?

dberlin added inline comments.Feb 14 2017, 5:07 PM
lib/IR/AutoUpgrade.cpp
492–493 ↗(On Diff #88452)

Unfortunately, that won't work.
In fact, that was my original workaround in predicateinfo that broke.
There are two problems:

  1. ssa_copy is anytype, so it can take a literal struct directly, which will still hit the assert.

But hey, we could special case that one.

There's a much worse problem though:

  1. The mangling is recursive over array types, struct type, and function arguments/return types, so it's not enough to identify pointers to structs.

They can be embedded *anywhere*.

So, for example, a pointer to a function type that takes a literal struct as an argument would still hit the old assert (and still be changed in mangling)
Or a pointer to function that has a return type of an array of pointers to functions that take literal structs ...
;(

Essentially, the only way to avoid this check is either:

A. We try to name match all the broken intrinsics and add any we missed over time.
B. We give up and just use it like it is.

I am happy to do A if we are concerned about the compile time cost

chandlerc added inline comments.Feb 14 2017, 10:40 PM
lib/IR/AutoUpgrade.cpp
492–493 ↗(On Diff #88452)

Wow. Just wow.

I wouldn't bother with name matching. The intrinsics i'm most concerned with will match (mem*).

Maybe we could check for iN and iN* parameters and skip it. Maybe not worth it.

Actually, yeah, why not just try auto upgrading a large bitcode file and see how much (if any) time is spent here? If none, we can revisit this if ever someone finds bitcode that has enough intrinsics to matter. I'm crossing my fingers and hoping that this just doesn't matter. =]

Okay,so i tried this on large bc files (all of clang, etc) i have around.

The amount of time spent in remangling is almost not measureable.
For clang, it claims 100 microseconds.
I'm not sure it has that kind of measurement accuracy, but i can't find a large testcase where i can measure any real amount of time spent here.

So i'm going to commit this, but we should definitely revert and try to focus it more if we do find a testcase that matters.

(the worrying code i can think of would be millions of lines of avx512 or neon intrinsics or something)

chandlerc accepted this revision.Feb 15 2017, 2:51 PM

Okay,so i tried this on large bc files (all of clang, etc) i have around.

The amount of time spent in remangling is almost not measureable.
For clang, it claims 100 microseconds.
I'm not sure it has that kind of measurement accuracy, but i can't find a large testcase where i can measure any real amount of time spent here.

So i'm going to commit this, but we should definitely revert and try to focus it more if we do find a testcase that matters.

This sounds great to me. I like the numbers you found (IE, it is doing work, but the work doesn't seem to consume enough time).

LGTM!

This revision is now accepted and ready to land.Feb 15 2017, 2:51 PM
This revision was automatically updated to reflect the committed changes.
materi added a subscriber: materi.Feb 27 2017, 1:54 AM

Hi!

I think this commit is causing a bug in my out-of-tree target. From opt -lint I started seeing messages like this:

Undefined behavior: Caller and callee calling convention differ
  %1 = call %rec8 @llvm.phx.llmac.i32.u32.s_rec8s(i32 %_tmp1, i32 %_tmp2, i32 %_tmp5, i32 %_tmp4), !dbg !10 (tmp.c:4:15)

The calling convention is dropped in UpgradeIntrinsicCall. It looks like this is easy to fix by just transferring the calling convention to the created CallInst here, is this correct?

llvm/trunk/lib/IR/AutoUpgrade.cpp
1837

The calling convention of CI does not carry over to the new call here.

Sorry about that.
I wasn't aware intrinsics were allowed to have a calling convention, since they are internal to llvm.
No intrinsic defined in Intrinsics* has one, from what i can tell, and as you can see, the code we use to upgrade generically does not do anything with calling convention either.
I'm not even sure what it would mean for an intrinsic to have a calling convention (instead of something it *lowers to* to have a calling convention).

Assuming our position is that intrinsics themselves, can have a calling convention, there are two ways to fix this.

We could just replace the called function instead of calling replaceAllUsesWith. Not sure why we don't, but CI->setCalledFunction(NewFn) should work (if we do that, we should stop renaming it .old).

Otherwise, if we really care, we should probably copy both the attributes and the calling convention.

Sorry about that.
I wasn't aware intrinsics were allowed to have a calling convention, since they are internal to llvm.
No intrinsic defined in Intrinsics* has one, from what i can tell, and as you can see, the code we use to upgrade generically does not do anything with calling convention either.
I'm not even sure what it would mean for an intrinsic to have a calling convention (instead of something it *lowers to* to have a calling convention).

Maybe we are just doing something stupid when we add a calling convention to our intrinsics.

I don't know exactly why we have this, maybe they are just an accident. Thanks for pointing it out!

Assuming our position is that intrinsics themselves, can have a calling convention, there are two ways to fix this.

I'll experiment with removing all calling conventions from our intrinsics (in the out-of-tree target). It's possible our position should be that CC is not allowed.

I wasn't aware intrinsics were allowed to have a calling convention, since they are internal to llvm.
No intrinsic defined in Intrinsics* has one, from what i can tell, and as you can see, the code we use to upgrade generically does not do anything with calling convention either.
I'm not even sure what it would mean for an intrinsic to have a calling convention (instead of something it *lowers to* to have a calling convention).

It is used for statepoints and patchpoints. call cconv42 @statepoint(@foo) is lowered to a call to @foo, and cconv42 is interpreted as the calling convention for the lowered call to @foo.