This is an archive of the discontinued LLVM Phabricator instance.

Add llvm.loop.unroll.threshold metadata
AbandonedPublic

Authored by timcorringham on Sep 24 2020, 4:17 AM.

Details

Summary

Add new loop metadata llvm.loop.unroll.threshold to allow the initial target
specific unroll threshold value to be specified on a loop by loop basis.

The intention is to be able to to allow more nuanced hints, e.g. specifying a
low threshold value to indicate that a loop may be unrolled if cheap enough
rather than using the all or nothing llvm.loop.unroll.disable metadata.

This new metadata is used in setting the default threshold in the target
specific loop options, so only has any effect for targets that make use of
it - in this change only AMDGPU uses this metadata.

This change was originally proposed as an AMDGPU specific change, but
following review comments (see https://reviews.llvm.org/D84779) is now
being presented as generic metadata.

Diff Detail

Event Timeline

timcorringham created this revision.Sep 24 2020, 4:17 AM
Herald added a project: Restricted Project. · View Herald Transcript
timcorringham requested review of this revision.Sep 24 2020, 4:17 AM
timcorringham edited the summary of this revision. (Show Details)Sep 24 2020, 4:21 AM
jdoerfert edited reviewers, added: arsenm, Meinersbur; removed: jdoerfert.Oct 9 2020, 9:15 AM

As implemented, the threshold is specific to the AMDGPU backed. Any possibility to implement it generally in LoopUnrollPass.cpp? If it is indeed intended to just by processed by a single backend, the attribute's name should reflect that, like for the amdgpu-unroll-threshold attribute.

The patch is identical to D84779? The same concerns to the undefined semantics for "threshold" still applies.

I guess the loop unroll threshold metadata could be consumed directly by the LoopUnrollPass.cpp, but that would cut out some functionality that is useful (at least for AMDGPU) - the target specific unroll options code for AMDGPU has heuristics that can adjust the unroll threshold. The metadata gives the starting point, but the value may be increased in he presence of certain constructs. If the metadata was used as the final threshold in LoopUnrollPass.cpp then to get the same behaviour the front-end would have to perform the same analysis to adjust the threshold value - which would have an impact on compile time, and duplicate code, etc.

As for whether the metadata should be llvm generic or amdgpu specific, I guess the question is whether any other targets would find this useful? My original approach was to make it amdgpu specific, but reviewers considered that it could be generic - hence this second review.

I'm not sure I'd describe the threshold semantics as undefined - I would describe it as imprecise. If a change has the effect of pushing code over a threshold (in either direction), the effect may be undesirable or unexpected, but equally it can mitigate the impact of changes that would otherwise be undesirable. In any case, the threshold approach is used in the loop unroll pass, so using any other control (such as a precise loop unroll count) would require significantly more effort (the unroll count would have to be calculated in advance, which may still use a threshold).

Unlike the reviewers in D84779, I would be OK with it being target specific. We already have precedence for this in form of the amdgpu-unroll-threshold function attribute and this just applied the same information more specifically to loops. I don't see the issue of non-generality being raised in D68873.

I'd object to an llvm.loop.unroll.threshold metadata without even the intend to define/implement it to other targets as well. I think the threshold number itself is inherently target-specific, so I think it is difficult to define "threshold of 10000" to be comparable between AMD, X86, PowerPC, etc.

Sorry you're caught between reviewer's opinions.

timcorringham abandoned this revision.Oct 21 2020, 3:07 AM

OK, thanks for the comments. I'll abandon the generic form of the metadata and have a re-think...