This is an archive of the discontinued LLVM Phabricator instance.

[RFC] Implement variable-width register classes, step 1: API changes
AbandonedPublic

Authored by kparzysz on Sep 15 2016, 2:38 PM.

Details

Summary

Follow-up to the discussion in D23561: implementation of support for register classes with variable-sized registers.

Outline of the plan:

  • Make register sizes, spill slot sizes and spill slot alignment be determined by feature bits from MCSubtargetInfo. At the moment, the AMD target uses the register size at that level, so this data must be available even without MachineFunction/TargetRegisterInfo/etc.

The register size and the spill slot size can actually be different (certain cases on Hexagon), and since there are cases where the intended meaning of the RC->getSize() is the register size (parts of the AMDGPU code) as well as cases where it is taken to be spill slot size (register allocation), the distinction between these two interpretations will be made explicit.

  • The register size, spill slot size, and spill slot alignment will be acquired for a register class by functions from MCRegisterInfo, not from MCRegisterClass directly. These functions will also take MCSubtargetInfo as a parameter, to allow consultation with the feature bits.
  • The actual format of the register class data as calculated by TableGen is not implemented in this step, and it will be done in the next step, together with the implementation of the .td syntax changes.
  • This step is a NFC change to put new APIs in place. The rest of the changes should be isolated to the parts directly involved with processing the register class data, i.e. TableGen, MCRegisterInfo, and MCSubtargetInfo. It should not be committed by itself, it serves to demonstrate this part of the proposed solution.

Diff Detail

Repository
rL LLVM

Event Timeline

kparzysz updated this revision to Diff 71556.Sep 15 2016, 2:38 PM
kparzysz retitled this revision from to [RFC] Implement variable-width register classes, step 1: API changes.
kparzysz updated this object.
kparzysz set the repository for this revision to rL LLVM.
kparzysz added a subscriber: llvm-commits.
kparzysz updated this object.Sep 15 2016, 2:39 PM
kparzysz edited edge metadata.
MatzeB edited edge metadata.Sep 15 2016, 3:35 PM
  • A register in the scheme here should still have a definite encoding, so it should be possible to not require this knowledge at all in the MC layer. Are there other ways to rewrite AMDGPU (like looking which register class a register belongs to determine the size?) I would consider the register size/spillslot properties a part of the TargetRegisterInfo here.
  • I would not add proxy calls in MachineRegisterInfo. MRI is there to deal with things related to virtual registers not necessarily register classes IMO.
  • You get instances of TargetRegisterInfo from the TargetSubtargetInfo anyway, so it should not be necessary to pass the SubtargetInfo around to the various calls. Instead you should be able to create specialized TargetRegisterInfo instances for the different sizes in TargetSubtargetInfo::getRegisterInfo().
asb added a subscriber: asb.Sep 15 2016, 10:38 PM
asb added a comment.EditedSep 16 2016, 3:08 AM

Hi Krzysztof. So as discussed in D23561 the motivating problem for this work is cases (as in Hexagon HVX) where instructions with identical encodings but different RegisterClasses currently need to be defined twice. An example of this is valignb. Having a register class with a non-constant register size and alignment would solve the issue for HVX, but there's also the hope it will be useful for other targets. For these HVX instructions there is no list<dag> pattern defined. If a pattern was specified, surely even with this new functionality you'd need to have repeated instruction definitions in order to define multiple patterns, because they would need different ValueTypes? Do you have something in mind that would address this as well?

  • A register in the scheme here should still have a definite encoding, so it should be possible to not require this knowledge at all in the MC layer. Are there other ways to rewrite AMDGPU (like looking which register class a register belongs to determine the size?) I would consider the register size/spillslot properties a part of the TargetRegisterInfo here.

I published a patch for that: D24675.

  • I would not add proxy calls in MachineRegisterInfo. MRI is there to deal with things related to virtual registers not necessarily register classes IMO.

Sure, with the transition to having all this information defined at the TargetRegisterInfo level, there should not be any need for them. There isn't any need anyway even with this code, I just added it to make the changes a bit shorter.

  • You get instances of TargetRegisterInfo from the TargetSubtargetInfo anyway, so it should not be necessary to pass the SubtargetInfo around to the various calls. Instead you should be able to create specialized TargetRegisterInfo instances for the different sizes in TargetSubtargetInfo::getRegisterInfo().

Good idea.

In D24631#544549, @asb wrote:

Having a register class with a non-constant register size and alignment would solve the issue for HVX, but there's also the hope it will be useful for other targets.

That's a requirement! :)

Most HVX instructions don't have patterns defining them, since they correspond to fairly complex operations. For C/C++ programmers, the current HVX programming model is to use intrinsics, and those we do map to HVX instructions. An alternative is to use Halide (which under the covers also generates intrinsics). At the moment, the compiler does not perform any auto-vectorization to use HVX.

This explains the absence of patterns for HVX, but many core Hexagon instructions lack patterns in instruction definitions as well. Instead, we use the "Pat" class. For the purpose of this effort, that may be the preferred approach, since pats are more amenable to various types of annotations. In particular, you can use a "Predicate" object to decide whether a pat should apply or not. The solution I have in mind for the .td format is based on the predicates to determine the hardware mode, and for patterns it would amount to a syntactic sugar for multiple pats, each predicated on its own predicate for each hardware mode. An instruction definition also has a list of predicates, but they only enable or disable the single defining pattern: any alternatives need to be provided via pats.

I hope this addresses this part as well:

For these HVX instructions there is no list<dag> pattern defined. If a pattern was specified, surely even with this new functionality you'd need to have repeated instruction definitions in order to define multiple patterns, because they would need different ValueTypes? Do you have something in mind that would address this as well?

kparzysz abandoned this revision.Aug 23 2018, 9:16 AM

Already implemented.