This is an archive of the discontinued LLVM Phabricator instance.

[TargetTransformInfo] Added an option for the cache line size

Authored by congzhe on Jun 8 2022, 1:30 PM.



This patch is to unblock D124926, where the tests needs a valid number of cache line size to proceed with loop cache analysis. However, for some backend targets, TTIImpl->getCacheLineSize() is not implemented and hence 'TTI.getCacheLineSize()' would just return 0 and breaks the analysis.

In this patch we add a user-specified opt/llc option for cache line size. If the option is specified by users we use the value supplied, otherwise we fall-back to the default value obtained from TTIImpl->->getCacheLineSize().

Added a test case under llvm/test/Analysis/LoopCacheAnalysis to make sure loop cache analysis produces different but sane costs when the option is specified with different values.

Diff Detail

Event Timeline

congzhe created this revision.Jun 8 2022, 1:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 8 2022, 1:30 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
congzhe requested review of this revision.Jun 8 2022, 1:30 PM
bryanpkc added inline comments.

This wording is not consistent with the implementation of the option below. In the current implementation, the option always overrides TTIImpl, whether TTIImpl provides a non-zero cache line size or not.

If you want the option to always override TTIImpl, then the option should be named cache-line-size not default-cache-line-size.

I misread your plan. I thought you want to change the return type of getCacheLineSize to Option<unsigned>.

bmahjour added inline comments.Jun 9 2022, 9:50 AM
5 ↗(On Diff #435308)

Since the cost of the outer loops are not affected by the cache line size, we can just check the cost for for.k. Even better would be to have a simpler test involving a single loop for this (eg modeled after llvm/test/Analysis/LoopCacheAnalysis/PowerPC/compute-cost.ll). Remember the purpose of this test is just to make sure that the option works and that the cost values change with different CLS.

congzhe updated this revision to Diff 435663.EditedJun 9 2022, 1:32 PM
congzhe edited the summary of this revision. (Show Details)
congzhe added a reviewer: bryanpkc.

Thanks @bryanpkc @tschuett @bmahjour, update the patch according to your comments.

  1. Reworded the option to cache-line-size and reworded the message a bit.
  2. Deleted the previous test file and added a new file which mirrors "powerpc/compute-cost.ll" but with different cache line sizes specified.
  3. Reworded the summary of this patch a bit to be more clear.

Please let me know if I happen to miss any of your concerns.

bmahjour added inline comments.Jun 10 2022, 10:38 AM

We still need a RUN line (and corresponding checks) without -cache-line-size= to test the default value case (in the absence of target triple).

This option would also be available in llc, so the following in the description isn't quite accurate:

"In this patch we add an opt option for cache line size"

congzhe added inline comments.Jun 10 2022, 11:22 AM

Thanks, I did think about it. The reason I did not add this RUN line is that in the absence of target triple, the cache line size returned from TTI.getCacheLineSize() would depend on the test machine, given that the machine has a valid getCacheLineSize() implemented. So we would get different cache line size numbers on different test machines, which might make it difficult to check a RUN line without -cache-line-size=?

bmahjour added inline comments.Jun 10 2022, 11:41 AM

Good point. I guess the only way to test the default would be if we actually use a triple that's known to have 0 size cache line, so I don't think we can test it reliably. Ok, let's leave it out then.

Matt added a subscriber: Matt.Jun 10 2022, 2:57 PM
congzhe updated this revision to Diff 437430.Jun 15 2022, 8:42 PM
congzhe edited the summary of this revision. (Show Details)
congzhe added reviewers: Meinersbur, Restricted Project.

According to the discussion in the loopopt meeting, I've updated the patch such that the -ppc-loop-prefetch-cache-line option in powerpc is removed and generalized into the option in TTI that overrides the default and target-specific cacheline size. A powerpc codegen test is updated accordingly.

Summary of this patch is also updated to address a previous comment from Bardia.

bmahjour added inline comments.

Since we only use this when getNumOccurrences() > 0, the initialization to 64 isn't necessary or useful. I'd initialize it to 0 for consistency (and to avoid confusion) with the default implementation of getCacheLineSize.

congzhe updated this revision to Diff 437611.Jun 16 2022, 11:05 AM

Thanks @bmahjour, modified the default value to be zero.

bmahjour accepted this revision.Jun 16 2022, 11:37 AM
This revision is now accepted and ready to land.Jun 16 2022, 11:37 AM
This revision was landed with ongoing or failed builds.Jun 16 2022, 12:58 PM
This revision was automatically updated to reflect the committed changes.