This is an archive of the discontinued LLVM Phabricator instance.

Add new 'flatten' LLVM attribute to fix clang's 'flatten' function attribute
Needs ReviewPublic

Authored by LevitatingLion on Nov 17 2019, 4:05 PM.

Details

Summary

This adds a new 'flatten' attribute, which works like 'always_inline' but applies recursively to inlined call sites. The addition was briefly discussed on the mailing list: http://lists.llvm.org/pipermail/llvm-dev/2019-November/136514.html

This patch also contains changes to clang, so that it uses the new LLVM attribute on functions marked with the clang attribute 'flatten'. Previously, clang marked all calls in such functions with 'always_inline'; in effect, only the first level of calls was inlined.

Currently this patch fails the '/llvm/test/Bitcode/highLevelStructure.3.2.ll' test. llvm-dis seems to be unable to correctly decode attributes stored in the bitcode when the new attribute is added, although other attributes don't seem to have required any handling of this problem, see https://reviews.llvm.org/D62766 or https://reviews.llvm.org/D49165. I speculated that's because this is the 65th attribute, so a bitmask indicating all attributes doesn't fit in 64 bit anymore.

Diff Detail

Event Timeline

LevitatingLion created this revision.Nov 17 2019, 4:05 PM
arsenm added inline comments.Nov 17 2019, 8:56 PM
llvm/docs/LangRef.rst
1431

It's not obvious to me what the flatten name means. flatteninline? recursive_alwaysinline? Something else?

Thank you for working on this!

LevitatingLion added inline comments.Nov 18 2019, 7:34 AM
llvm/docs/LangRef.rst
1431

I agree. What about always_inline_recurse or always_inline_recursively?

We need more tests here.

For one, the flatten attribute is not necessary to pass the test.
Second, we need to check the corner cases, e.g. reduction with different cycle lengths.

llvm/docs/LangRef.rst
1431

I'd prefer always_inline_recursively or recursive_alwaysinline so far. Though something shorter would be fine too.
always_inline_rec maybe?

[reverse ping] Is this still active?

Thanks for the ping. I hadn't looked at this since, but I'll update the patch this weekend.

I rebased my changes onto 49d00824bbb, renamed the attribute to 'alwaysinline_recursively', and added some more tests. The testcase 'highLevelStructure.3.2.ll' does not fail anymore, all regression tests pass.

Are there any more places where changes are required? I looked at the changes when other attributes were introduced and grep'd for 'Attribute::AlwaysInline' to find places which need handling of the new attribute.

I'm fine with this. I would hope a C/C++/Clang person will also take a look though.

llvm/docs/LangRef.rst
1398

Maybe mention the correspondence to the flatten C/C++ attribute here.

I'm fine with this. I would hope a C/C++/Clang person will also take a look though.

This is missing clang codegen test[s].
Seems to look fine to me otherwise.

The semantics of this llvm attribute seem to better match 'flatten'. However, it is unfortunate that this doesn't change any clang tests. Can you add/alter a test in clang to validate the IR?

While adding tests to clang I realized the attribute is not working as intended when using an optimization level of zero, because clang adds the noinline attribute to all functions. In this case the optimizer cannot distinguish between functions originally marked noinline (where recursive always-inlining should stop) and those where clang added the attribute (where recursive always-inlining should continue).

Is this acceptable? I think we should fix this, and recursively inline at optimization level zero. GCC's documentation on the flatten attribute states that "every call inside this function is inlined, if possible", clang's that calls are "inlined unless it is impossible to do so".

Maybe we can add an additional string attribute when adding the noinline attribute to functions which are not marked noinline in the source code, something like "noinline-added-by-clang". I don't know if that's a legitimate use case for a string attribute, but it wouldn't be very invasive. What do you think?

While adding tests to clang I realized the attribute is not working as intended when using an optimization level of zero, because clang adds the noinline attribute to all functions. In this case the optimizer cannot distinguish between functions originally marked noinline (where recursive always-inlining should stop) and those where clang added the attribute (where recursive always-inlining should continue).

Is this acceptable? I think we should fix this, and recursively inline at optimization level zero. GCC's documentation on the flatten attribute states that "every call inside this function is inlined, if possible", clang's that calls are "inlined unless it is impossible to do so".

Maybe we can add an additional string attribute when adding the noinline attribute to functions which are not marked noinline in the source code, something like "noinline-added-by-clang". I don't know if that's a legitimate use case for a string attribute, but it wouldn't be very invasive. What do you think?

TBH, I would issue a warning if we see flatten in O0 that says this will not work and be done with it.

TBH, I would issue a warning if we see flatten in O0 that says this will not work and be done with it.

I would argue against diagnostics that depend on optimization level, since that leads to an inconsistent developer experience. In practice developers tend to use Debug and Release configurations fairly interchangeably, with different optimization levels for each.

Maybe we can add an additional string attribute when adding the noinline attribute to functions which are not marked noinline in the source code, something like "noinline-added-by-clang". I don't know if that's a legitimate use case for a string attribute, but it wouldn't be very invasive. What do you think?

Another option (not sure if it's better) would be to add a noopt LLVM attribute that Clang adds for -O0 instead of noinline. Two possibilities would be to update the inliner to pay attention to that as well (with special logic for flatten), or to change the always-inliner to add noinline to anything marked noopt.

Maybe we can add an additional string attribute when adding the noinline attribute to functions which are not marked noinline in the source code, something like "noinline-added-by-clang". I don't know if that's a legitimate use case for a string attribute, but it wouldn't be very invasive. What do you think?

Another option (not sure if it's better) would be to add a noopt LLVM attribute that Clang adds for -O0 instead of noinline. Two possibilities would be to update the inliner to pay attention to that as well (with special logic for flatten), or to change the always-inliner to add noinline to anything marked noopt.

Or, have clang add a new noflatten attribute when it sees __attribute__((noinline)).

Maybe we can add an additional string attribute when adding the noinline attribute to functions which are not marked noinline in the source code, something like "noinline-added-by-clang". I don't know if that's a legitimate use case for a string attribute, but it wouldn't be very invasive. What do you think?

Another option (not sure if it's better) would be to add a noopt LLVM attribute that Clang adds for -O0 instead of noinline. Two possibilities would be to update the inliner to pay attention to that as well (with special logic for flatten), or to change the always-inliner to add noinline to anything marked noopt.

noopt == optnone? Both optnone and noinline are set in O0, so we could just not place noinline (I think).

Maybe we can add an additional string attribute when adding the noinline attribute to functions which are not marked noinline in the source code, something like "noinline-added-by-clang". I don't know if that's a legitimate use case for a string attribute, but it wouldn't be very invasive. What do you think?

Another option (not sure if it's better) would be to add a noopt LLVM attribute that Clang adds for -O0 instead of noinline. Two possibilities would be to update the inliner to pay attention to that as well (with special logic for flatten), or to change the always-inliner to add noinline to anything marked noopt.

noopt == optnone? Both optnone and noinline are set in O0, so we could just not place noinline (I think).

Sure, that could work. Or the noflatten idea is another possibility. It would be good to hear what others think.

Maybe we can add an additional string attribute when adding the noinline attribute to functions which are not marked noinline in the source code, something like "noinline-added-by-clang". I don't know if that's a legitimate use case for a string attribute, but it wouldn't be very invasive. What do you think?

Another option (not sure if it's better) would be to add a noopt LLVM attribute that Clang adds for -O0 instead of noinline. Two possibilities would be to update the inliner to pay attention to that as well (with special logic for flatten), or to change the always-inliner to add noinline to anything marked noopt.

noopt == optnone? Both optnone and noinline are set in O0, so we could just not place noinline (I think).

Sure, that could work. Or the noflatten idea is another possibility. It would be good to hear what others think.

optnone currently requires noinline. Can we simply remove this requirement or would that need more changes?

If I understand the noflatten idea correctly, we would change the LLVM behaviour so that alwaysinline_recursively ignores noinline and stops inlining only when a callee has a dedicated "stop-marker" attribute (e.g. noflatten)? I think that would be counter-intuitive, noinline should prevent inlining.

Maybe we can add an additional string attribute when adding the noinline attribute to functions which are not marked noinline in the source code, something like "noinline-added-by-clang". I don't know if that's a legitimate use case for a string attribute, but it wouldn't be very invasive. What do you think?

Another option (not sure if it's better) would be to add a noopt LLVM attribute that Clang adds for -O0 instead of noinline. Two possibilities would be to update the inliner to pay attention to that as well (with special logic for flatten), or to change the always-inliner to add noinline to anything marked noopt.

noopt == optnone? Both optnone and noinline are set in O0, so we could just not place noinline (I think).

Sure, that could work. Or the noflatten idea is another possibility. It would be good to hear what others think.

optnone currently requires noinline. Can we simply remove this requirement or would that need more changes?

I don't see a reason why we couldn't.

If I understand the noflatten idea correctly, we would change the LLVM behaviour so that alwaysinline_recursively ignores noinline and stops inlining only when a callee has a dedicated "stop-marker" attribute (e.g. noflatten)? I think that would be counter-intuitive, noinline should prevent inlining.

I would prefer the above solution instead of another "stop token".

Found the commit: dcbe35bad518

The way I see it we just have to teach the inliner about optnone so we can uncouple the two (optnone and noinline).
@probinson WDTY?