This is an archive of the discontinued LLVM Phabricator instance.

Support a list of CostPerUse values
ClosedPublic

Authored by cdevadas on Aug 29 2020, 5:23 AM.

Details

Summary

This patch allows targets to define multiple cost
values for each register so that the cost model
can be more flexible and better used during the
register allocation as per the target requirements.

For AMDGPU the VGPR allocation will be more efficient
if the register cost can be associated dynamically
based on the calling convention.

Diff Detail

Event Timeline

cdevadas created this revision.Aug 29 2020, 5:23 AM
cdevadas requested review of this revision.Aug 29 2020, 5:23 AM
madhur13490 added inline comments.
llvm/lib/CodeGen/RegAllocGreedy.cpp
790

There is an implicit upcast happening here from uint8_t to unsigned. This may lead to unexpected behavior on different platform, I'd recommend to be consistent unless absolutely necessary to use unsigned.

llvm/utils/TableGen/CodeGenRegisters.h
154

Is there any specific reason to use int64_t type for CostPerUse?

llvm/utils/TableGen/RegisterInfoEmitter.cpp
1457

This block of comment can be split and put above the relevant line for better readability. Having a block of comment doesn't really say which logical block of code is meant for.

1458

Probably a better name for RegCosts. Something like AllRegCostPerUse.

1459

I think you can use std::bitset or llvm::BitVector here. vector<bool> is less space optimal than these.

1460

What is the purpose of this extra block of 0s at the beginning?

cdevadas added inline comments.Aug 31 2020, 3:12 AM
llvm/lib/CodeGen/RegAllocGreedy.cpp
790

Will change such instances to uint8_t.

llvm/utils/TableGen/CodeGenRegisters.h
154

getValueAsListOfInts works only with int64_t type.

llvm/utils/TableGen/RegisterInfoEmitter.cpp
1460

It is for NoRegister which is the first entry in the generated register list

cdevadas updated this revision to Diff 288909.Aug 31 2020, 3:15 AM

Addressed the review comments.

asb added a comment.Sep 3 2020, 4:49 AM

It looks like others are reviewing the implementation details, but I just wanted to chime in to say that this looks like a useful feature for the RISC-V backend too. Currently we set CostPerUse for some registers in order to tweak register allocation in a way that benefits codegen when the compressed instruction set extension is enabled. Although the impact is _very_ minimal, this slightly harms targets that don't support the compressed instruction set (admittedly, this isn't really true of any shipping RISC-V processor so not a big concern). It would be good to use this vary the CostPerUse depending on target RISC-V features.

Ping @stoklund @MatzeB or anyone interested in this patch.
Appreciate it if this gets reviewed soon.

cdevadas updated this revision to Diff 296621.Oct 7 2020, 2:02 AM
cdevadas added reviewers: qcolombet, wmi.

rebase + ping

qcolombet requested changes to this revision.Oct 9 2020, 12:34 PM

Hi,

The general direction is fine, but IMHO there are needless code churn.
Why can't we stick to getCostPerUse instead of changing all the users?

Also, this needs a test case.
Take a look at llvm/test/TableGen/GlobalISelEmitter.td for instance.

Cheers,
-Quentin

llvm/include/llvm/CodeGen/RegisterClassInfo.h
70

Why do we need this array?
The information should be readily available through TRI->getCostPerUse() (maybe by adding an MF argument).

125

Could we stick to unsigned here to avoid leaking the implementation details of MinCost?

130

Again could we stick to using getCostPerUse everywhere?

llvm/include/llvm/CodeGen/TargetRegisterInfo.h
215

The allocatable part cannot change, can it?
I.e., why is this an array?

339

This should be protected, no user should care about that.

343

Could we not expose this table and have the users stick to getCostPerUse instead?

llvm/include/llvm/Target/Target.td
173

Could you call out that targets have to override TargetRegisterInfo::getRegisterCostTableIndex to use the related cost model?

173

The comment should also call out that the cost is filled with zeros for mismatching sizes of register costs.

176

Is there a way to use a type that enforce that instead of just writing it in a comment?

llvm/lib/CodeGen/RegAllocGreedy.cpp
410

Why can't we channel the new information through CostPerUse instead of adding a new API?

1124

Ditto about leaking the implementation details of something we shouldn't care (here MinCost).

1142

Ditto why can't we use getCostPerUse here?

This revision now requires changes to proceed.Oct 9 2020, 12:34 PM
cdevadas added inline comments.Oct 11 2020, 7:17 AM
llvm/include/llvm/CodeGen/RegisterClassInfo.h
125

Will do.

llvm/include/llvm/CodeGen/TargetRegisterInfo.h
215

I have changed the way tablegen integrates these values to TargetRegisterInfo. It was an array of struct earlier and now a pointer to a table for the benefit of supporting the list of register costs. I thought it could be ok to change InAllocatableClass because it is part of TargetRegisterInfoDesc class. Otherwise, one of these members should be moved out of the class.

343

It's an overhead to get the index on every cost query with getCostPerUse. This will instead, help to extract out a slice of the regcost table (as returned by getRegisterCosts) dynamically once from the tablegened register-costs-table with the help of the target handler getRegisterCostTableIndex.

llvm/include/llvm/Target/Target.td
176

I don't think Tablegen have different sized int types.

cdevadas added inline comments.Oct 16 2020, 12:40 AM
llvm/lib/CodeGen/RegAllocGreedy.cpp
1124

MinCost is actually declared as uint8_t in RegisterClassInfo.h and I think it is good to keep it uint8_t everywhere.
The comparison with CostPerUseLimit in the next line makes it all consistent since I reduced the reg-cost type to uint8_t with this patch.

cdevadas updated this revision to Diff 302237.Nov 2 2020, 3:29 AM

Added testcases.

cdevadas updated this revision to Diff 305048.Nov 13 2020, 12:08 AM

rebase + elaborated the comment in Target.td

lenary removed a subscriber: lenary.Nov 13 2020, 3:56 AM

@qcolombet can you review this patch?

I'll try to take a look by end of next week.

qcolombet requested changes to this revision.Jan 26 2021, 12:22 PM
qcolombet added inline comments.
llvm/include/llvm/CodeGen/TargetRegisterInfo.h
215

I don't really see the point here.
I initially thought it was to save space by sharing the boolean array with all instances of TargetRegisterInfoDesc, but we actually trade a bool with a pointer.

I don't understand the argument about having to move the members out of the class.

Why can't we stick with the bool is here again?

339

This was not addressed, was it?

343

Fair enough.

This revision now requires changes to proceed.Jan 26 2021, 12:22 PM
cdevadas added inline comments.Jan 27 2021, 3:46 AM
llvm/include/llvm/CodeGen/TargetRegisterInfo.h
215

Earlier, the CostPerUse and InAllocatableClass were tablegened with exactly numRegs entries each.
We could accommodate them in a single table and use the register itself to index into this table to retrieve the members during RA.

With this patch, the tablegened CostPerUse will have numCosts*numRegs entries wherein the InAllocatableClass field will remain unchanged (exactly numRegs entries).
The mismatch in their entries is the first reason I have generated the tables differently by creating two separate tables and a TargetRegisterInfoDesc instance that holds these tables.
Furthermore, an appropriate cost model will be extracted out dynamically using the getRegisterCostTableIndex and getRegisterCosts APIs so that we can still use the register to index into this dynamic table.

We should examine the XXXRegisterInfo.inc file generated with and without this patch to fully understand the new tables and the reason why I have changed the InAllocatableClass to a pointer.
If you look at the test I included in this patch, llvm/test/TableGen/RegisterInfoEmitter-regcost-list.td
There are 6 registers defined (s0 - s3 and d0 - d1) in the .td file and with the additional NoRegister there will be total 7 registers in the generated file.

I have included 2 cost models that resulted in 14 entries in the CostPerUseTable.
static const uint8_t CostPerUseTable[] = { 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 2, 3, };

For cost model 0: During RA, the ArrayRef RegCosts will get {0, 0, 0, 1, 1, 1, 1}
For cost model 1: During RA, the ArrayRef RegCosts will get {0, 0, 0, 0, 1, 2, 3}

The InAllocatableClassTable will have 7 entries irrespective of the cost model.
static const bool InAllocatableClassTable[] = { false, true, true, true, true, true, true, };

Finally, the TargetRegisterInfoDesc instance to be used by the RA.
static const TargetRegisterInfoDesc MyTargetRegInfoDesc = { // Extra Descriptors

CostPerUseTable, 2, InAllocatableClassTable};

Assume if there is only one cost model (cost model 0) in the input file and with the upstream compiler today, the tablegened register info will look like the following:
static const TargetRegisterInfoDesc MyTargetRegInfoDesc[] = { // Extra Descriptors

{ 0, false },
{ 0, true },
{ 0, true },
{ 1, true },
{ 1, true },
{ 1, true },
{ 1, true },

};

The array MyTargetRegInfoDesc[] generated with the upstream compiler would now become an object MyTargetRegInfoDesc that points to the individual tables.

339

Sorry, I missed it. Will move it under Protected.

qcolombet accepted this revision.Jan 28 2021, 11:14 AM

LGTM, just make sure to move getRegisterCostTableIndex under protected.

llvm/include/llvm/CodeGen/TargetRegisterInfo.h
215

Got it, thanks for the detailed explanation.

This revision is now accepted and ready to land.Jan 28 2021, 11:14 AM
This revision was landed with ongoing or failed builds.Jan 28 2021, 8:46 PM
This revision was automatically updated to reflect the committed changes.