This is an archive of the discontinued LLVM Phabricator instance.

IR: add 'byval(<ty>)' variant to 'byval' function parameters
ClosedPublic

Authored by t.p.northover on May 23 2019, 7:45 AM.

Details

Reviewers
jyknight
dblaikie
Summary

One of the remaining blocking issues for merging all pointer types is that byval parameters need some way to communicate their size other than the element type of the pointer. This patch addresses the issue by adding a size function attribute, only usable with byval that encodes the information independently.

It's mostly a pretty straightforward copy/paste of the existing align handling, except I decided to allow more than 16-bits of size. It should be very rare, but I have seen people pass ridiculous structs by value and the compiler should probably be able to cope.

The main other thing I found slightly questionable was whether we could reuse the existing dereferencable or alloc_size attributes. In the end I decided the semantic contortion was just slightly too high so implemented a new keyword to be maximally compatible with the align it will sit next to.

Diff Detail

Event Timeline

t.p.northover created this revision.May 23 2019, 7:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 23 2019, 7:45 AM
arsenm added a subscriber: arsenm.May 23 2019, 7:51 AM

Needs bitcode compatibility tests

llvm/test/Assembler/size-param-attr.ll
17 ↗(On Diff #200973)

Also a test with 0?

arsenm added inline comments.May 23 2019, 7:52 AM
llvm/test/Assembler/invalid-size2.ll
4 ↗(On Diff #200973)

Could also use a size > UINT_MAX

Hey Tim - thanks for looking into this!

The original discussions for this had centered around adding a parameter to the byval attribute itself, rather than having it dependent on another attribute.

In the typeless pointer future - this size attribute wouldn't be optional, but required when 'byval' is present. So this seems more like a transitionary state - and perhaps we should consider whether the intermediate state is worthwhile compared to what would be the final form.

ie: What about adding an integer parameter to 'byval'?

ie: What about adding an integer parameter to 'byval'?

I do agree that we should prioritize the final state, but I think that final state should somehow include the word "size" so that the number is interpretable without looking at the documentation (at least to some degree). I also quite like the symmetry in this scheme between "size" and "align". Both of those are reasonably mild opinions though.

I have even less strong views on other components, though I'm not sure if you were even talking about those. We could, for example, change ATTR_KIND_SIZE to ATTR_KIND_BYVAL_WITH_SIZE and in the C++ representation drop the separate Attribute::Byval.

OTOH we could do both of those (and change IR, come to that) at some later date without breaking compatibility.

One of the other suggestions was to pass a _type_ as a parameter to byval. IMO that would be the nicest idea (but I don't know if it's infeasibly difficult?)

In the transitory state, you'd need to specify the exact pointee-type to byval, as is specified in the pointee type of the argument.
E.g.:

define i8 @f({i32, i8}* byval({i32, i8}) %p) ...

In the future, that'd then turn into:

define i8 @f(ptr byval({i32, i8}) %p) ...

One of the other suggestions was to pass a _type_ as a parameter to byval. IMO that would be the nicest idea (but I don't know if it's infeasibly difficult?)

I wouldn't worry about the implementation difficulty, I can see a pretty clear path there. I also like the spelling of that option, it certainly bypasses the ambiguity of a bare number. I wonder if it's promising more than we actually deliver though. SelectionDAG only exposes the size and alignment to backends.

A final point in favour of the type idea is that I think targets where pointers are special (e.g. CHERI where a pointer in memory is only usable if it was stored by a special, blessed instruction) couldn't work with a plain size. They don't work with SelectionDAG either but in future they might want to use byval. I'm not entirely sure how theoretical that argument is though.

One of the other suggestions was to pass a _type_ as a parameter to byval. IMO that would be the nicest idea (but I don't know if it's infeasibly difficult?)

Yeah, sorry, forgot to mention that - one of the justifications for this was consistency with global variables (which, similarly, are really just bytes and alignment - but we use a type to describe those bytes and alignment)

In the transitory state, you'd need to specify the exact pointee-type to byval, as is specified in the pointee type of the argument.
E.g.:

define i8 @f({i32, i8}* byval({i32, i8}) %p) ...

In the future, that'd then turn into:

define i8 @f(ptr byval({i32, i8}) %p) ...

I'm liking the type idea more and more. I'll get started reworking the patch to do that.

Switched to byval(<ty>) syntax, still optional for now but auto-upgraded when reading .bc files.

t.p.northover retitled this revision from IR: add 'size <N>' attribute to 'byval' function parameters to IR: add 'byval(<ty>)' variant to 'byval' function parameters.May 24 2019, 5:34 AM
arsenm added inline comments.May 24 2019, 5:35 AM
llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
1206–1207

Ternary operator

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9076–9080

Ternary operator

9580–9584

Ternary operator

While modifying Clang to emit this new style attribute, it turned out I'd got the printing incorrect and you could end up with things like byval(%mystruct = type { i32 }). Fixed that and added a unit test for it.

Accidentally included half-done Clang changes in last upload, so this removes them.

Switched to ternary operators as Matt suggested.

t.p.northover marked 2 inline comments as done.May 24 2019, 7:18 AM
t.p.northover added inline comments.
llvm/lib/CodeGen/#SwiftErrorValueTracking.cpp#
1

Accidental addition. I'll remove it when I next upload.

llvm/test/Assembler/#invalid-size2.ll#
1

Accidental addition. It'll be gone next time.

Generally looking pretty good to me.

llvm/include/llvm/IR/Argument.h
81

Comment looks to be out of date (speaks of returning the size, rather than type)

llvm/lib/IR/Attributes.cpp
394–395

You can roll the declaration into the condition if you like - pretty common in the LLVM codebase, though not a stylistic requirement by any means.

554

This being pointer ordering - that means it'll be unstable across runs (which I assume/think none of the other attributes are?) - is that OK?(I don't know if that's OK - it probably is OK)

llvm/test/Assembler/invalid-size1.ll
1–4

Leftover from the previous version?

t.p.northover marked 4 inline comments as done.May 28 2019, 7:09 PM

Some replies, and I'll upload a new revision tomorrow with the things that definitely need fixing.

llvm/include/llvm/IR/Argument.h
81

Bother, I did a specific "git diff" run to catch those!

llvm/lib/IR/Attributes.cpp
394–395

Yep. No idea why I didn't do that.

554

I did worry about that, but came to the conclusion it doesn't matter unless the same attribute can appear repeatedly. And I don't think that is possible.

Even with IntAttributes, I replaced the corresponding comparison with an llvm_unreachable and the only thing that broke was a unit test specifically exercising this function.

So in summary I have no real preference between status quo and llvm_unreachable, but don't think it's worth implementing a stable order right now.

I should probably add some checks to that unittest myself though.

llvm/test/Assembler/invalid-size1.ll
1–4

Yes. I don't think there's an equivalent (at least not a new one) so I'll remove it.

dexonsmith added inline comments.May 29 2019, 10:04 AM
llvm/lib/IR/Attributes.cpp
554

I would prefer llvm_unreachable or report_fatal_error with a comment saying that comparing on type wouldn't be stable, but is also not possible. Or, sorting it in an unstable way with comment saying it's illegal, and adding a corresponding verifier error. Happy with whatever you and David decide though, if you think it's better as is.

dblaikie added inline comments.May 29 2019, 10:09 AM
llvm/lib/IR/Attributes.cpp
554

Yeah, +1 to that - if the code isn't used, unreachable/assert - then we can discover if it ever gets used and add test coverage for it.

I think I've fixed everything suggested, including making the comparison assert if it ever happens since that seems to be the consensus.

I also noticed some Clang tests will need to be updated because byval now sorts slightly differently, so I've included those in the diff here for completeness.

dblaikie accepted this revision.May 29 2019, 11:21 AM

Seems good to me - thanks a bunch (:

(I'd probably suggest making the type parameter to byval non-optional in the IR and builder APIs as soon as it's convenient - that was my general approach in the previous work I did on opaque pointers)

llvm/lib/IR/Attributes.cpp
553–554

Prefer assertion rather than branch-to-unreachable

assert(getKindAsEnum() != AI.getKindAsEnum() ...
This revision is now accepted and ready to land.May 29 2019, 11:21 AM

Thanks David. I've committed it as r362012 and r362013 with the tweak you suggested.

jordan_rose added inline comments.
llvm/lib/IR/Attributes.cpp
553–554

Swift is hitting this assertion on its master-next branch right now when two type attributes are wrapping the same type. Is that supposed to be checked earlier or canonicalized away?

t.p.northover marked an inline comment as done.Jun 27 2019, 7:51 AM
t.p.northover added inline comments.
llvm/lib/IR/Attributes.cpp
553–554

Sorry, I was on holiday last week. It should be fixed now. I've made FunctionComparator delve into attributes with types since it's main purpose seems to be going to heroic efforts to compare things stably anyway.