This is an archive of the discontinued LLVM Phabricator instance.

Add register size info back to MCRegisterClass
ClosedPublic

Authored by rafauler on Mar 3 2021, 3:11 PM.

Details

Summary

This patch addresses the removal of register size information done in
commit c8b782c.

Without this change, there is no viable option to get register size
information outside libTarget. We need this information to run
analysis that know the register size from the MC layer, used by
BOLT.

Discussion D50285 and D47199.

Diff Detail

Event Timeline

rafauler created this revision.Mar 3 2021, 3:11 PM
rafauler requested review of this revision.Mar 3 2021, 3:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2021, 3:11 PM

Let me know if there is another way to solve this without reverting this.

bjope added a comment.Mar 4 2021, 12:22 AM

If you need this I'd rather "add the needed support" than doing it as a revert (although nice to get the history here in the review). To be more specific; I would let RegisterInfoEmitter provide the size in bits rather than the mystierious "/8" that for example results in a 1-bit register being reported as having size 0. And I'd get rid of the redundant "getSize()" method. And the change to the code comment in TargetInstrInfo::getStackSlotRange could probably be skipped.

With this patch the RegSize is set to zero for "non simple" classes (those with multiple HW modes). That would need to be documented somehow (although I guess it limits how the size information can be used, as it will be missing for some register classes).

It would be nice if you could describe a bit more about how Bolt is using this information to understand if there is any alternative solutions. And to understand why/if it is OK to not getting the information for classes with several HW modes etc.

kparzysz added a comment.EditedMar 4 2021, 5:28 AM

For targets that make use of HW modes, register sizes are specific to a subtarget.

It's hard to say what the best solution is without knowing the specifics of your case, but if we go the revert route, it would come with the limitations that @bjope outlined.

If you need this I'd rather "add the needed support" than doing it as a revert

Sounds good, let me work on that a little bit.

Regarding where is this used, the dependency of BOLT on this interface is a single line here:
https://github.com/facebookincubator/BOLT/blob/master/src/MCPlusBuilder.cpp#L445
and here
https://github.com/facebookincubator/BOLT/blob/master/src/Target/X86/X86MCPlusBuilder.cpp#L1417

BOLT offers a framework for running dataflow analyses on binaries. We use it to perform shrinkwrapping, for example. For this to happen, we need to know the register size when analyzing an MCInst to figure out how wide is the load or store to stack. With this information, we build a model of which bytes of the stack are being used and how, and if we have enough information to move load/stores to colder positions.

This register size information is exposed inside BOLT though an interface that abstracts away all target information. When writing a pass in BOLT, you may query, then, whether registers alias with each other and their size. As you can imagine, this is provided by MC. BOLT builds the size information straight out of the tablegen'd specs of the target though MC. I tried to move this logic to use TRI, but TRI is only available to code living inside libTarget. To an external tool like BOLT that relies a lot on the MC library, it is not possible to fetch it though libTarget unless we create an interface in libTarget to expose this information, which I think would be awkward.

rafauler retitled this revision from Revert "[MC] Remove PhysRegSize from MCRegisterClass" to Add register size info back to MCRegisterClass.Mar 5 2021, 3:54 PM
rafauler edited the summary of this revision. (Show Details)
rafauler updated this revision to Diff 328679.Mar 5 2021, 3:55 PM

Addressing comments

I changed the function name to getSizeInBits() so that out of tree targets that rebase LLVM past the removal of getSize() and get it added back again won't be surprised by a different size being returned by a function with the same name.

rafauler updated this revision to Diff 328687.Mar 5 2021, 4:26 PM

clang format

For targets that make use of HW modes, register sizes are specific to a subtarget.

+1

I'm not ignoring the HW mode stuff, I added a short explanation in the comments instructing users to give TargetRegisterInfo a try if they can but that's all I could think of. Last time I checked, I couldn't access TargetRegisterInfo as a library user because everything is accessible only to code living inside libTarget, which itself exposes only a limited high-level interface to the outside world, such as "run backend passes". If there is a way to get this info as a library user, I'll be happy to do it. Let me know if it is possible and I'll try to hack BOLT code to get to it.

Matt added a subscriber: Matt.Mar 12 2021, 7:30 AM
kparzysz accepted this revision.Mar 17 2021, 11:21 AM

I'm inclined to accept this as is. There is definitely a limitation here that cannot be easily worked around.

I'm wondering about moving the HW mode feature to MCSubtargetInfo at some point...

This revision is now accepted and ready to land.Mar 17 2021, 11:21 AM

I'm inclined to accept this as is. There is definitely a limitation here that cannot be easily worked around.

Well the usage in https://github.com/facebookincubator/BOLT/blob/master/src/Target/X86/X86MCPlusBuilder.cpp#L1417 is a bit strange (at least my impression when reading the code comment the goal is to find out the width of a memory access and that is done by looking at register sizes). It might give wrong result for sextloads etc if X86 is doing such things. But I'm not sure if that can be done in any other way.

So I'm inclined to accept this as well. And I think it looks much better now after the cleanups, getting rid of the weird rounding down when converting bits-to-bytes that was there in the past.

This revision was landed with ongoing or failed builds.Mar 23 2021, 3:05 PM
This revision was automatically updated to reflect the committed changes.

If you need this I'd rather "add the needed support" than doing it as a revert

Sounds good, let me work on that a little bit.

Regarding where is this used, the dependency of BOLT on this interface is a single line here:
https://github.com/facebookincubator/BOLT/blob/master/src/MCPlusBuilder.cpp#L445
and here
https://github.com/facebookincubator/BOLT/blob/master/src/Target/X86/X86MCPlusBuilder.cpp#L1417

BOLT offers a framework for running dataflow analyses on binaries. We use it to perform shrinkwrapping, for example. For this to happen, we need to know the register size when analyzing an MCInst to figure out how wide is the load or store to stack. With this information, we build a model of which bytes of the stack are being used and how, and if we have enough information to move load/stores to colder positions.

This register size information is exposed inside BOLT though an interface that abstracts away all target information. When writing a pass in BOLT, you may query, then, whether registers alias with each other and their size. As you can imagine, this is provided by MC. BOLT builds the size information straight out of the tablegen'd specs of the target though MC. I tried to move this logic to use TRI, but TRI is only available to code living inside libTarget. To an external tool like BOLT that relies a lot on the MC library, it is not possible to fetch it though libTarget unless we create an interface in libTarget to expose this information, which I think would be awkward.

@rafaelauler I have a question about that, if I want to use this patch to bolt, dose it means we need to change the code from "I->getSize()" to "I->getSizeInBits() / 8"
thanks in advance.

Yes. Also, I'll rebase BOLT soon (past this patch).