This is an archive of the discontinued LLVM Phabricator instance.

[PGO][PGSO] TargetInstrInfo part.
Abandoned · Public

Authored by hjyamauchi on Nov 1 2019, 1:50 PM.

Details

Reviewers
davidxl
Summary

(Split off of D67120)

TargetInstrInfo changes for profile guided size optimization.

Event Timeline

hjyamauchi created this revision. · Nov 1 2019, 1:50 PM
Herald added a project: Restricted Project. · View Herald Transcript · Nov 1 2019, 1:50 PM
Herald added subscribers: jsji, tpr, kbarton and 11 others. · View Herald Transcript
arsenm added inline comments. · Nov 1 2019, 1:58 PM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

Why is this needed for commuting?

hjyamauchi added inline comments. · Nov 1 2019, 2:20 PM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

There isn't anything particular about commuting, but wherever we make a size-vs-speed codegen choice (such as the one in X86InstrInfo.cpp), if we want to make that choice in a profile-guided manner, we need to propagate the profile (PSI/MBFI) down to that point.
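
For illustration only, a stripped-down sketch of what threading the profile through the hook could look like; the added parameter names and defaults are assumptions, not necessarily the exact change in this diff.

// Sketch: append optional PSI/MBFI parameters with nullptr defaults, so
// callers that have profile info can pass it down, while existing callers
// keep compiling and keep the old behavior.
namespace llvm {

class MachineInstr;
class ProfileSummaryInfo;
class MachineBlockFrequencyInfo;

class TargetInstrInfo {
public:
  static const unsigned CommuteAnyOperandIndex = ~0U;

  virtual MachineInstr *
  commuteInstruction(MachineInstr &MI, bool NewMI = false,
                     unsigned OpIdx1 = CommuteAnyOperandIndex,
                     unsigned OpIdx2 = CommuteAnyOperandIndex,
                     ProfileSummaryInfo *PSI = nullptr,
                     const MachineBlockFrequencyInfo *MBFI = nullptr) const;

  virtual ~TargetInstrInfo();
};

} // namespace llvm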

arsenm added inline comments. · Nov 1 2019, 4:23 PM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

I don't think it's appropriate to pass this in to commuteInstruction. It should just perform the commute and not concern itself with profitability. It would be the caller's responsibility to determine whether commuting is a good idea.

hjyamauchi marked an inline comment as done. · Nov 4 2019, 11:24 AM
hjyamauchi added inline comments.
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

Around X86InstrInfo.cpp:1576,

case X86::BLENDPDrri:
case X86::BLENDPSrri:
case X86::VBLENDPDrri:
case X86::VBLENDPSrri: {
  // If we're optimizing for size, try to use MOVSD/MOVSS.
  if (MI.getParent()->getParent()->getFunction().hasOptSize()) {
     ....
  }

This code tries to replace BLEND with MOVSD/MOVSS, if possible, under optsize.

This decision is unfortunately internal to Target/X86InstrInfo (TII) and isn't up to the caller; X86InstrInfo just checks for optsize on the function internally.

This patch would like to make that decision profile guided (e.g. make the replacement for cold code, even without optsize), and that requires access to the profile (PSI/MBFI) there in some way.
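
For illustration, a minimal sketch of how that check could become profile guided, assuming PSI/MBFI have been plumbed down to this point and assuming the block-level shouldOptimizeForSize query from llvm/CodeGen/MachineSizeOpts.h in the related PGSO patches; the helper name is hypothetical.

// Sketch only: a profile-guided replacement for the plain hasOptSize() check.
// shouldUseMoveForBlend is a hypothetical helper, not code from this patch.
#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineSizeOpts.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Returns true if the BLEND -> MOVSD/MOVSS rewrite should be attempted:
// keep firing under the optsize attribute, and additionally fire when the
// profile says the containing block is cold.
static bool shouldUseMoveForBlend(const MachineInstr &MI,
                                  ProfileSummaryInfo *PSI,
                                  const MachineBlockFrequencyInfo *MBFI) {
  const MachineBasicBlock *MBB = MI.getParent();
  return MBB->getParent()->getFunction().hasOptSize() ||
         llvm::shouldOptimizeForSize(MBB, PSI, MBFI);
}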

It doesn't seem like a good idea to pass the profile to TII per pass or per function, because TII is created once and is stateless.

There doesn't seem to be a good way to access the profile directly from TII, because the profile isn't attached to the function or the IR.

Do you see some other way to accomplish this?

We could consider instead passing "bool OptSize" to commuteInstruction. What do you think?
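
For comparison, a sketch of that alternative shape, where the caller computes the size decision from PSI/MBFI and TII only receives the result; the flag name is an assumption for illustration.

// Sketch of the alternative: keep the profile out of TII and let the caller
// pass an already-computed flag instead. The flag name is illustrative.
namespace llvm {

class MachineInstr;

class TargetInstrInfo {
public:
  static const unsigned CommuteAnyOperandIndex = ~0U;

  virtual MachineInstr *
  commuteInstruction(MachineInstr &MI, bool NewMI = false,
                     unsigned OpIdx1 = CommuteAnyOperandIndex,
                     unsigned OpIdx2 = CommuteAnyOperandIndex,
                     bool OptForSize = false) const;

  virtual ~TargetInstrInfo();
};

} // namespace llvm

This keeps TII stateless and profile-free, but it pushes the PSI/MBFI plumbing out to every caller that wants the profile-guided behavior.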

arsenm added inline comments. · Nov 4 2019, 11:36 AM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

This usage of OptSize seems broken in the first place. commuteInstruction is used, for example, in MachineCSE. This would mean optsize can potentially disable CSE for some operations, which will increase instruction count (and therefore code size) by blocking the commute.

AMDGPU also has situations, for example, where a smaller encoding is possible, but we have a pass that handles this based on when it makes sense to switch encodings; commuteInstruction isn't making that decision.

davidxl added inline comments. · Nov 5 2019, 10:52 AM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

I agree with Matt. Can we remove this change for commute and see if it matters for size? The existing code is broken, and we don't want this badness to be exposed at the interface level.

hjyamauchi marked an inline comment as done. · Nov 5 2019, 1:38 PM
hjyamauchi added inline comments.
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

Yeah, it looks like some CSE between blend and movsd/movss could indeed be missed depending on whether optsize is set.

This code came from https://reviews.llvm.org/rL336731. Alternatively, it may be fine to do the blend -> movsd/movss replacement in a separate pass under optsize.

Thoughts?

hjyamauchi marked an inline comment as done and an inline comment as not done. · Nov 5 2019, 3:52 PM
hjyamauchi added inline comments.
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

I measured the size impact on an internal app. It turned out to be very tiny (0.0002%) :) so I believe we can drop this patch.

hjyamauchi abandoned this revision. · Feb 4 2020, 11:26 AM