This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Increase number of LOCRs emitted by passing regalloc hints
ClosedPublic

Authored by jonpa on Aug 16 2017, 8:03 AM.


Details

Reviewers

qcolombet
uweigand
MatzeB
hfinkel

Summary

TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with an increase in number of LOCRs.

Diff Detail

Event Timeline

jonpa created this revision. Aug 16 2017, 8:03 AM

Below is a small table showing the number of LOCRs and spill/reload instructions in each experiment, explained as follows:

Using just the getRegAllocationHints() implementation above, the number of LOCR-type instructions increases by 171 on SPEC (6333 -> 6504), which means an increase from 97.4% to 99.1%. This is the percentage of LOCRMux pseudos converted to a LOCR-type instruction. (B)

As you can see, I have also done some further experiments that involve constraining the GRX32 reg class during isel if the other register class is high or low in emitSelect(). Using just this (and not giving any hints) gives an increase of 50 LOCRs (C). However, if all operands are GRX32, and they are constrained alternately between high and low (D), a similar result to (B) is achieved.

The most LOCRs I have gotten so far is with (E), although (F) is close with less spill.

This however turned out to be not quite true, because looking at the code size of loops, it is clear that LOCRXALTERNATE is making loops bigger. My guess is that somehow this lesser freedom forces suboptimal decisions somewhere else.

However, loops are getting better with LOCRCONSTRAIN (E), so looking at just the output, this looks valuable in addition to (B).

So, next is to remove the LOCRXALTERNATE, and then go for (B) or (E) I guess. Unless you have some input that might improve things further...?

Happy for any comments!

BTW, I don't quite understand the purpose of updateRegAllocHint(). It seems to be run during coalescing, but that's before getRegAllocationHints() has ever been called. So what is being updated?

/Jonas

Branch                                          Number of LOCRs         Number of "Spill|Reload" comments

A master                                        6333                    164102
B just reg-alloc hints:                         6504                    164169
C no hints:  LOCRCONSTRAIN                      6383                    164157
D no hints: LOCRCONSTRAIN + LOCRXALTERNATE      6506                    164020
E hints and LOCRCONSTRAIN                       6541                    164224
F hints and LOCRCONSTRAIN + LOCRXALTERNATE      6534                    164056

The getRegAllocationHints implementation makes sense to me. However, I'm wondering if we shouldn't also check for the *destination* register -- you only force one source register to the same class as the other source register, but I think we should check whether *any* of the three registers is already allocated, and then always force the other two to the same class.

I think if we do that consistently, we should be able to *guarantee* that we end up with a register assignment that is legal for the instruction, and can therefore completely remove the pass that creates a branch again.

Already constraining the regclass in emitSelect may make sense as well, but we should verify that it still makes any difference once we've implemented the change above. I agree that the LOCRXALTERNATE variant doesn't look very useful.

Oh, and just a comment on this:

BTW, I don't quite understand the purpose of updateRegAllocHint(). It seems to be run during coalescing, but that's before getRegAllocationHints() has ever been called. So what is being updated?

I understand this is about something else: maintaining the MachineRegisterInfo set/getRegAllocationHint data. It seems the ARM target sets up the MRI hints in a special pass ahead of time, and then *uses* the MRI hints in its implementation of getRegAllocationHints. However, for this to be useful the MRI hints need to be updated for the effects of register coalescing; the updateRegAllocHint allows the target to do just that.
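The bookkeeping described above can be illustrated with a small stand-alone model. This is only a sketch and not LLVM code: `ToyHints`, `setHint` and `onCoalesced` are hypothetical names, and the real hook's interface differs. It shows why stored hints that name a virtual register have to be rewritten once the coalescer replaces that register.

```cpp
#include <cassert>
#include <map>

// Toy stand-in for a per-vreg hint table (hypothetical, not the MRI API).
// Each virtual register may carry one hint naming another virtual register.
struct ToyHints {
  std::map<unsigned, unsigned> HintFor; // vreg -> hinted vreg

  void setHint(unsigned VReg, unsigned Hint) {
    assert(VReg != Hint && "a register cannot hint itself");
    HintFor[VReg] = Hint;
  }

  // Called after the coalescer has replaced OldReg by NewReg everywhere.
  // Any stored hint that still mentions OldReg would be stale.
  void onCoalesced(unsigned OldReg, unsigned NewReg) {
    // Hand the hint owned by OldReg (if any) over to NewReg, unless NewReg
    // already has a hint of its own.
    auto It = HintFor.find(OldReg);
    if (It != HintFor.end()) {
      unsigned Hint = It->second;
      HintFor.erase(It);
      if (Hint != NewReg)
        HintFor.emplace(NewReg, Hint);
    }
    // Redirect hints that point at OldReg and drop any self-hints this creates.
    for (auto I = HintFor.begin(); I != HintFor.end();) {
      if (I->second == OldReg)
        I->second = NewReg;
      if (I->first == I->second)
        I = HintFor.erase(I);
      else
        ++I;
    }
  }
};
```

For example, with hints 1 -> 2 and 3 -> 1 recorded, calling `onCoalesced(1, 5)` leaves 5 -> 2 and 3 -> 5. A target that never stores such hints has nothing to update, which matches the conclusion below.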

Since your implementation of getRegAllocationHints doesn't actually make any use of the MRI hints, you don't need to implement updateRegAllocHint either.

In D36795#844373, @uweigand wrote:

Oh, and just a comment on this:

BTW, I don't quite understand the purpose of updateRegAllocHint(). It seems to be run during coalescing, but that's before getRegAllocationHints() has ever been called. So what is being updated?

I understand this is about something else: maintaining the MachineRegisterInfo set/getRegAllocationHint data. It seems the ARM target sets up the MRI hints in a special pass ahead of time, and then *uses* the MRI hints in its implementation of getRegAllocationHints. However, for this to be useful the MRI hints need to be updated for the effects of register coalescing; the updateRegAllocHint allows the target to do just that.

Since your implementation of getRegAllocationHints doesn't actually make any use of the MRI hints, you don't need to implement updateRegAllocHint either.

Ah, I see - thanks for explaining.

The getRegAllocationHints implementation makes sense to me. However, I'm wondering if we shouldn't also check for the *destination* register -- you only force one source register to the same class as the other source register, but I think we should check whether *any* of the three registers is already allocated, and then always force the other two to the same class.

I tried this (see updated patch below), but found that it gave identical results (with or without the DestMO lines which I commented out). My guess is that the TwoAddress pass is already doing this job in processTiedPairs() by

const TargetRegisterClass *RC = MRI->getRegClass(RegB);
...
    if (TargetRegisterInfo::isVirtualRegister(RegA) &&
        TargetRegisterInfo::isVirtualRegister(RegB))
      MRI->constrainRegClass(RegA, RC);

I think if we do that consistently, we should be able to *guarantee* that we end up with a register assignment that is legal for the instruction, and can therefore completely remove the pass that creates a branch again.

Giving hints is not enough if VirtReg is GRX32, and there could still be cases where the source registers are high / low. The question is how/when should we constrain the reg classes? It doesn't seem right to do this in getRegAllocationHints(), since e.g. already Order is passed. It could be done in emitSelect(), but a bit naively (complicated cases suboptimally). It would be nice to do this after coalescing, but not sure how.

Already constraining the regclass in emitSelect may make sense as well, but we should verify that it still makes any difference once we've implemented the change above.

I don't know exactly why, but doing only this (without hints) gives some improved benchmarks without regressions. Doing only the hints gives similar improvements, but with regressions... Also, a regression with the regsplit around loops patch applied could only be handled with this, so this part seems valuable until we actually manage to handle the same cases with hints.

I agree that the LOCRXALTERNATE variant doesn't look very useful.

removed

In D36795#845175, @jonpa wrote:

The getRegAllocationHints implementation makes sense to me. However, I'm wondering if we shouldn't also check for the *destination* register -- you only force one source register to the same class as the other source register, but I think we should check whether *any* of the three registers is already allocated, and then always force the other two to the same class.

I tried this (see updated patch below), but found that it gave identical results (with or without the DestMO lines which I commented out). My guess is that the TwoAddress pass is already doing this job in processTiedPairs() by

Ah, of course. I had forgotten that these are two-operand instructions anyway, so my suggestion doesn't actually make sense. (I must have been thinking of three-operand instructions like AHHHR.)

I think if we do that consistently, we should be able to *guarantee* that we end up with a register assignment that is legal for the instruction, and can therefore completely remove the pass that creates a branch again.

Giving hints is not enough if VirtReg is GRX32, and there could still be cases where the source registers are high / low. The question is how/when should we constrain the reg classes? It doesn't seem right to do this in getRegAllocationHints(), since e.g. already Order is passed. It could be done in emitSelect(), but a bit naively (complicated cases suboptimally). It would be nice to do this after coalescing, but not sure how.

So my thought was that if both registers are still GRX32, then getRegAllocationHints for the first register would return both high and low options. Then, once regalloc chooses one of them, and getRegAllocationHints is later called on the *other* register, we know it must be in the same class, so we return only the low or only the high options. Does it not work this way?

Already constraining the regclass in emitSelect may make sense as well, but we should verify that it still makes any difference once we've implemented the change above.

I don't know exactly why, but doing only this (without hints) gives some improved benchmarks without regressions. Doing only the hints gives similar improvements, but with regressions... Also, a regression with the regsplit around loops patch applied could only be handled with this, so this part seems valuable until we actually manage to handle the same cases with hints.

OK. In any case, constraining the regclass in emitSelect, in cases where we already know about restrictions, seems a good idea.

So my thought was that if both registers are still GRX32, then getRegAllocationHints for the first register would return both high and low options. Then, once regalloc chooses one of them, and getRegAllocationHints is later called on the *other* register, we know it must be in the same class, so we return only the low or only the high options. Does it not work this way?

My understanding is that once one of the registers is allocated, we can only try to pass *hints* for the other register. Regalloc will then first try to use any hint, but if all of them are allocated it is still free to use other registers in GRX32. So I think if we want to make a guarantee, we must constrain the reg-class of the register. Or could it perhaps be possible to add a method to AllocationOrder such as "hardenHints()", which would actually remove non-hinted registers from the allocation order?
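The difference between ordinary hints and "hard" hints can be sketched with a simplified, stand-alone model of an allocation order. `ToyAllocationOrder` and `drain` are hypothetical names used only for illustration; the real AllocationOrder class has more state and a different interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified model: hinted registers are offered first, then the remaining
// registers of the class. With HardHints set, the non-hinted registers are
// never offered. Register number 0 means "nothing left to try".
class ToyAllocationOrder {
  std::vector<unsigned> Hints;
  std::vector<unsigned> Order;
  bool HardHints;
  std::size_t HintPos = 0;
  std::size_t OrderPos = 0;

public:
  ToyAllocationOrder(std::vector<unsigned> H, std::vector<unsigned> O,
                     bool Hard)
      : Hints(std::move(H)), Order(std::move(O)), HardHints(Hard) {
    // Hints are expected to come from the allocation order.
    for (unsigned Reg : Hints)
      assert(std::find(Order.begin(), Order.end(), Reg) != Order.end());
  }

  unsigned next() {
    if (HintPos < Hints.size())
      return Hints[HintPos++];
    if (HardHints)
      return 0; // hard hints: stop instead of falling back to Order
    while (OrderPos < Order.size()) {
      unsigned Reg = Order[OrderPos++];
      if (std::find(Hints.begin(), Hints.end(), Reg) == Hints.end())
        return Reg; // skip registers already offered as hints
    }
    return 0;
  }
};

// Collect everything the order would offer, in sequence.
inline std::vector<unsigned> drain(ToyAllocationOrder AO) {
  std::vector<unsigned> Out;
  while (unsigned Reg = AO.next())
    Out.push_back(Reg);
  return Out;
}
```

With order {1, 2, 3, 4} and the single hint 3, the soft variant offers 3, 1, 2, 4, while the hard variant offers only 3. In the soft case the allocator can still end up in the "wrong" half once the hinted registers are taken, which is the concern raised above.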

I experimented a bit and found that forcing the AllocationOrder to only return hints ("hard hints") resulted in only 3 jump/sequence expansions on SPEC. It seems that those remaining cases are due to complex cases like:

1 LOCRMux  V0_GRX32, V1_GR32
2 LOCRMux  V0_GRX32, V2_GRX32

V2 gets assigned to GRH32
...

The reason this is not handled by the regclass constraining in emitSelect() is that the reg-coalescer introduced the V1_GR32, while it was originally a GRX32 during isel.
This is also what would happen if V2 was GRH32, although that is practically not happening currently (3 cases in total during isel).

I think that to handle those cases we would have to constrain regclasses somehow after coalescing. And maybe better than giving hard hints would be to immediately after one out of two GRX32 regs gets allocated constrain the other virtreg.

I am still not convinced that making this guarantee generally is possible (without a target pre-ra pass to do this), especially not for all the different kinds of register allocators that are around / may appear. It seems that some kind of broader construct is needed in order to always be sure this never goes wrong. Maybe a property of a register class somehow that all operands of any MI must belong to one out of two sub regclasses...? :-/

Since there are more GRX32 constructs to implement in the SystemZ backend, this may still be worthwhile, or?

In D36795#845610, @jonpa wrote:

I think that to handle those cases we would have to constrain regclasses somehow after coalescing.

This could be done in TRI->updateRegAllocHint, possibly?

And maybe better than giving hard hints would be to immediately after one out of two GRX32 regs gets allocated constrain the other virtreg.

Not sure if there is currently any place where this can be done. Does the register allocator (all of them) even go through registers one-by-one and assigns them, or is the algorithm more complex?

I am still not convinced that making this guarantee generally is possible (without a target pre-ra pass to do this), especially not for all the different kinds of register allocators that are around / may appear. It seems that some kind of broader construct is needed in order to always be sure this never goes wrong. Maybe a property of a register class somehow that all operands of any MI must belong to one out of two sub regclasses...? :-/

Note that this, while appropriate for LOCRMux, would be too restrictive for certain other operations. For example, for comparisons, we can allow high-high, low-low, and also high-low compares, but not low-high compares, and similarly for add and subtract. (For comparison, the alternatives / constraints mechanism in GCC allows targets to exactly describe the valid combinations for each instruction, and the register allocator will choose any of those as appropriate.)
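To make the idea of per-instruction legal combinations concrete, here is a minimal stand-alone sketch. It is neither GCC's constraint syntax nor anything that exists in LLVM: `ToyOpcode`, `Alternative`, `makeSelect` and `makeCompare` are hypothetical, and each pair is assumed to be ordered as (first operand, second operand).

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class Half { Low, High };

// One allowed (first operand, second operand) pairing, in the spirit of a
// constraint alternative.
using Alternative = std::pair<Half, Half>;

struct ToyOpcode {
  const char *Name;
  std::vector<Alternative> Alternatives;

  bool isLegal(Half First, Half Second) const {
    for (const Alternative &A : Alternatives)
      if (A.first == First && A.second == Second)
        return true;
    return false;
  }
};

// Load-on-condition: both operands must be in the same half.
inline ToyOpcode makeSelect() {
  return {"select", {{Half::Low, Half::Low}, {Half::High, Half::High}}};
}

// Compare: high-high, low-low and high-low are allowed, low-high is not.
inline ToyOpcode makeCompare() {
  return {"compare",
          {{Half::High, Half::High},
           {Half::Low, Half::Low},
           {Half::High, Half::Low}}};
}

// Usage: a single "same half" rule would wrongly reject the high-low compare.
inline void exampleCombinations() {
  ToyOpcode Cmp = makeCompare();
  assert(Cmp.isLegal(Half::High, Half::Low));
  assert(!Cmp.isLegal(Half::Low, Half::High));
  ToyOpcode Sel = makeSelect();
  assert(!Sel.isLegal(Half::High, Half::Low));
}
```

A table of this kind describes each instruction exactly, whereas a register-class property that forces all operands into one subclass can only express the load-on-condition case.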

In D36795#847448, @uweigand wrote:

In D36795#845610, @jonpa wrote:

I think that to handle those cases we would have to constrain regclasses somehow after coalescing.

This could be done in TRI->updateRegAllocHint, possibly?

Not sure if this is ok (although it should be), but tried it and seemed to improve things a bit.

And maybe better than giving hard hints would be to immediately after one out of two GRX32 regs gets allocated constrain the other virtreg.

Not sure if there is currently any place where this can be done. Does the register allocator (all of them) even go through registers one-by-one and assigns them, or is the algorithm more complex?

IIRC, I think this is more complex with RegAllocGreedy, since it seems it does not only allocate a physical register once to an interval, but it could also later evict (cancel) that assignment to give that physreg to another interval instead.

I am still not convinced that making this guarantee generally is possible (without a target pre-ra pass to do this), especially not for all the different kinds of register allocators that are around / may appear. It seems that some kind of broader construct is needed in order to always be sure this never goes wrong. Maybe a property of a register class somehow that all operands of any MI must belong to one out of two sub regclasses...? :-/

Note that this, while appropriate for LOCRMux, would be too restrictive for certain other operations. For example, for comparisons, we can allow high-high, low-low, and also high-low compares, but not low-high compares, and similarly for add and subtract. (For comparison, the alternatives / constraints mechanism in GCC allows targets to exactly describe the valid combinations for each instruction, and the register allocator will choose any of those as appropriate.)

It would be nice to be able to do something like that. I suppose if we could constrain regclasses during reg-allocation we could afford to do recursive searches since we are only doing them once (in contrast to giving hints). If we are lucky this would turn out well, although we probably can't "start over" after constraining GRX32, at the point of eviction of an assignment. Well, perhaps that could be added as well. Anyway, this seems not really needed at the moment, but perhaps later if we start to see a lot more GRH32 registers.

I updated the patch and made some more builds of SPEC in order to see the effects of the various ways to tackle this:

Branch \ Statistic:                     LOCRs_lo        LOCRs_hi     RISBs   "Number of spilled live ranges"
Master                                  6382            4            225     48939
LOCRHINTS                               6523            28           60      48947
LOCRCONSTRAIN                           6429            3            179     48957
LOCRCONSTRAIN + UPDATEHINT              6464            3            144     48965
LOCRHINTS + LOCRCONSTRAIN               6558            28           25      48958
MOREMUX + LOCRCONSTRAIN                 6561            28           22      48959
HARDHINTS                               6580            27           4       48965
HARDHINTS + LOCRCONSTRAIN               6581            27           3       48971
HARDHINTS + LOCRCONSTRAIN + UPDATEHINT  6581            27           3       48974
HARDHINTS + UPDATEHINT                  6581            27           3       48970
HARDHINTS + MOREMUX                     6584            27           0       48966   :-)
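The table can be read as a conversion rate. The sketch below assumes that each RISB corresponds to one LOCRMux that did not become a single load-on-condition instruction; that reading is an assumption, although it is consistent with every row summing to the same total of 6611. `Row`, `total` and `convertedPercent` are names invented for this illustration.

```cpp
#include <cassert>

// One row of the statistics table above.
struct Row {
  const char *Branch;
  int LocrLo; // LOCRMux emitted as a low-part load-on-condition
  int LocrHi; // LOCRMux emitted as a high-part load-on-condition
  int Risbs;  // RISBs; assumed to be one per mixed (unconverted) LOCRMux
};

inline int total(const Row &R) { return R.LocrLo + R.LocrHi + R.Risbs; }

// Share of LOCRMux pseudos that became a single instruction, in percent.
inline double convertedPercent(const Row &R) {
  assert(total(R) > 0 && "row must not be empty");
  return 100.0 * (R.LocrLo + R.LocrHi) / total(R);
}

const Row Master = {"Master", 6382, 4, 225};
const Row HardHintsMoreMux = {"HARDHINTS + MOREMUX", 6584, 27, 0};
```

Under that assumption, `convertedPercent(Master)` is roughly 96.6 and `convertedPercent(HardHintsMoreMux)` is exactly 100.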

It seemed that setting the RegClass in updateRegAllocHint() per your suggestion did help a bit, but didn't solve those tricky cases for some reason. I then tried to just do a bit more searching for LOCRMuxes (without recursing), and found that this did actually handle the rest (per last line in table).

Still not sure of all the implications of "hard hints" or constraining regclass in updateRegAllocHint(), just know that "it doesn't crash", and gives "good results" ;-) Any thoughts, anyone? Quentin?

HARDHINTS are currently looking very promising on preliminary benchmark results. (Have not yet tried MOREMUX, but it should also be good since it is very similar).

Herald added a subscriber: javed.absar. Aug 22 2017, 1:19 AM

jonpa edited the summary of this revision. Aug 22 2017, 5:32 AM

jonpa added reviewers: MatzeB, hfinkel.

updateRegAllocHint() updated per off-line discussion, which indeed gave better results than before:

New runs after improving updateRegAllocHint() to do a better job regarding physregs. (Basically it was unnecessary to check if the old reg was a physreg; all we need to know is if the NewReg is a low/high regclass...)

BUILD                                   Number of LOCRs_lo   Number of LOCRs_hi   Number of RISBs   Number of spilled live ranges
LOCRCONSTRAIN + UPDATEHINT              6532                 3                    76                48970
LOCRHINTS + LOCRCONSTRAIN + UPDATEHINT  6576                 26                   9                 48972

Ping!

This is my original post on llvm-dev:

Hi,

I am curious if it would be possible to use reg-alloc hints to improve code generation for SystemZ. The background is that I ran into a regression which seems to relate to code generation for conditional register moves.

The SystemZ backend uses a GRX32 register class for a LOCRMux pseudo instruction (Load On Condition of Register), in order to utilize all 32bit registers optimally. However, depending on the register assignment this pseudo will become a single load-on-condition instruction only if both source and dest registers are either low or high parts. Otherwise, LOCRMux will expand to a compare/jump sequence, which is of course less desirable. LOCR can only handle two low-parts, and LOCFHR can only handle two high-parts (GRX32 is the union of these two reg classes).

In order to increase the number of LOCR/LOCFHRs generated, I would like to tell regalloc something like "If src reg of an LOCRMux is high, try to make dst reg high", and similarly for the case where one register is in the low part.

I am not sure if it is possible to do anything about this in a simple manner, and would appreciate any help.

Thanks,

Jonas
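As an illustration of the problem stated in the post above, the following stand-alone sketch models the expansion choice and the kind of hint being asked for. The register numbering (1..16 for low parts, 17..32 for high parts) and the names `expandLOCRMux` and `hintsFor` are made up for this example and do not correspond to the SystemZ backend's definitions.

```cpp
#include <cassert>
#include <vector>

// Toy register numbering, not the SystemZ definitions: 1..16 stand for the
// low 32-bit parts and 17..32 for the high parts. 0 means "unassigned".
inline bool isLow(unsigned Reg) { return Reg >= 1 && Reg <= 16; }
inline bool isHigh(unsigned Reg) { return Reg >= 17 && Reg <= 32; }

enum class Expansion { LOCR, LOCFHR, JumpSequence };

// How a LOCRMux would be expanded once both operands have physical registers.
inline Expansion expandLOCRMux(unsigned Dst, unsigned Src) {
  assert((isLow(Dst) || isHigh(Dst)) && (isLow(Src) || isHigh(Src)));
  if (isLow(Dst) && isLow(Src))
    return Expansion::LOCR;
  if (isHigh(Dst) && isHigh(Src))
    return Expansion::LOCFHR;
  return Expansion::JumpSequence; // mixed halves: the undesirable case
}

// Hints for one operand, given the register already assigned to the other
// operand (0 if none yet): prefer registers from the same half.
inline std::vector<unsigned> hintsFor(const std::vector<unsigned> &Order,
                                      unsigned OtherAssigned) {
  std::vector<unsigned> Hints;
  if (OtherAssigned == 0)
    return Hints; // nothing known yet, so no preference
  for (unsigned Reg : Order)
    if (isLow(Reg) == isLow(OtherAssigned))
      Hints.push_back(Reg);
  return Hints;
}
```

For an allocation order {1, 17, 2, 18} and the other operand already in register 20 (a high part in this toy numbering), `hintsFor` yields {17, 18}, so that picking any hinted register lets `expandLOCRMux` choose the single-instruction form.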

I have now experimented a while with this, and found some different ways of handling this, which I need some feedback on. In particular, it seems promising to use "hard hints", by which I mean changing the AllocationOrder so that only hinted registers are returned. This seems to work, but is it a good idea?

If this is not acceptable, another option might be to use the updateRegAllocHint() hook by *constraining the regclass* of the virtreg directly. This is not the documented purpose of this hook, so I wonder if it would make sense generally?

The goal is to get rid of the RISB (rotate and insert) emitted by copyPhysReg(), and the results can be found in the table above, indicating the effectiveness of the various methods.

ping!

Quentin, you gave me the first advice regarding this, and it would now be very useful to hear your (or someone you know) opinion on the common-code changes involved here: "hard hints" in AllocationOrder, and setting the regclass in updateRegAllocHint().

thanks / Jonas

The generic change looks fine though it sounds surprising that anyone would want that.

The problem here is that copies are super expensive when those hints are not fulfilled, right?

In D36795#865237, @qcolombet wrote:

The generic change looks fine though it sounds surprising that anyone would want that.

The problem here is that copies are super expensive when those hints are not fulfilled, right?

Thanks for review. Well, instead of a conditional move implemented with one instruction, we would get a jump sequence over a block containing just one move instruction. This *may* be very bad in an inner loop. In particular, this is what happened when I applied Weis reg-split patch and got a significant regression.

SystemZ has BTW more instructions which could be implemented the same way: a "mux" pseudo of the GRX32 regclass which would then be expanded into an instruction using either high or low parts, depending on the choice of registers RA did. I don't think all those instructions waiting to be implemented in the backend would necessarily suffer as much as the load-on-condition, so it is a matter of judgment if using the "hard hints" is better or worse. If one were to judge this based on the number of spilled live ranges, this seems to work fine for the LOCR case, as can be seen in the table above. Do you think that looking at the number of spilled live ranges is a good way to consider the trade-off for using hard hints, or do you perhaps have anything to add?

It would have been nice to *guarantee* that the pseudo will get either high or low parts, like gcc will let you specify combinations of legal register operands. If that could be done, SystemZ could even remove its custom pass that handles any rare cases of mixed operands. But I suppose this would not be easy to do, or?

Do you think that looking at the number of spilled live ranges is a good way to consider the trade-off for using hard hints, or do you perhaps have anything to add?

To me, it sounds like we should tell RA that the non-hard-hints will be expensive to expand. Right now, copies are expected to be cheap regardless of what, where and how they are done and this already does not play very nice with shrink-wrapping for instance. I believe this is yet another instance of that problem.

Short term the hard hints sound fine. Long term, I don't know, I haven't thought about it.

It would have been nice to *guarantee* that the pseudo will get either high or low parts, like gcc will let you specify combinations of legal register operands. If that could be done, SystemZ could even remove its custom pass that handles any rare cases of mixed operands. But I suppose this would not be easy to do, or?

I didn't get what you would like to do.

This revision is now accepted and ready to land. Sep 11 2017, 9:53 AM

It would have been nice to *guarantee* that the pseudo will get either high or low parts, like gcc will let you specify combinations of legal register operands. If that could be done, SystemZ could even remove its custom pass that handles any rare cases of mixed operands. But I suppose this would not be easy to do, or?

I didn't get what you would like to do.

I believe that in gcc it is possible to specify legal combinations in the .md file, something like (source-reg "GR32, GRH32"), (dst-reg "GR32, GRH32"). This would then be used during regalloc so that one of the two combinations would take effect (GR32/GR32 or GRH32/GRH32). This would be very nice in LLVM in this case, since regalloc would know what to do without hints, and it could also put a guarantee so that the target did not have to catch any bad cases after the fact. I guess I am just asking if there has been any effort or plan in this direction previously or what the best way of achieving this in LLVM might be.

BTW, in this particular case, it would be enough to say that "all operands of a single MI which are all GRX32 must be allocated to the same subclass: GR32 (low) or GRH32 (high)". Not sure if this is general enough to be considered for implementing - the gcc method seems more ideal.

Your suggestion of making non-hard-hints considered expensive (via a cost function for each register in the AllocationOrder?), makes sense to me though, especially if it would help in other contexts as well. This might make for an even better final result than only allowing the hard hints.

Patch rewritten towards using hard regalloc hints with a worklist search over LOCRMux GRX32 registers.
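A worklist search of this kind can be sketched as follows. This is a stand-alone illustration, not the code from the patch: `ToyFunction`, `Constraint` and `findConnectedConstraint` are hypothetical, and the model only tracks which virtual registers are linked by a LOCRMux and which of them are already restricted to low or high parts.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

enum class Constraint { None, Low, High }; // None: still free within GRX32

struct ToyFunction {
  // Each pair holds the two register operands of one LOCRMux.
  std::vector<std::pair<unsigned, unsigned>> LOCRMuxes;
  // Known restriction per virtual register (by class or by assignment).
  std::map<unsigned, Constraint> Known;

  Constraint constraintOf(unsigned VReg) const {
    auto It = Known.find(VReg);
    return It == Known.end() ? Constraint::None : It->second;
  }
};

// Follow LOCRMux links from Start through unrestricted registers and return
// the first restriction found on a connected register, if any.
inline Constraint findConnectedConstraint(const ToyFunction &F,
                                          unsigned Start) {
  std::vector<unsigned> Worklist{Start};
  std::set<unsigned> Visited{Start};
  while (!Worklist.empty()) {
    unsigned Reg = Worklist.back();
    Worklist.pop_back();
    for (const auto &Mux : F.LOCRMuxes) {
      unsigned Other;
      if (Mux.first == Reg)
        Other = Mux.second;
      else if (Mux.second == Reg)
        Other = Mux.first;
      else
        continue; // this LOCRMux does not use Reg
      Constraint C = F.constraintOf(Other);
      if (C != Constraint::None)
        return C;
      if (Visited.insert(Other).second)
        Worklist.push_back(Other); // keep searching through free registers
    }
  }
  return Constraint::None;
}

// Usage: the two-instruction case discussed earlier, searching from V2.
inline void exampleFromThread() {
  ToyFunction F;
  F.LOCRMuxes.emplace_back(0u, 1u); // LOCRMux V0, V1
  F.LOCRMuxes.emplace_back(0u, 2u); // LOCRMux V0, V2
  F.Known[1] = Constraint::Low;     // V1 became GR32 after coalescing
  assert(findConnectedConstraint(F, 2) == Constraint::Low);
}
```

Looking only at the direct partner of V2 would find nothing, because V0 is still unrestricted; following the link through V0 reaches V1 and reports the low-part restriction.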

Experiments with mixed GRX32 allocation order (%R0-%R5) as well as trying to get more high-high LOCRMux allocations whenever possible have been removed. Benchmarking has shown that the first version (without these two extras) is still the best version, it seems. The idea was to get more high-high allocations.

There is a great unbalance between low-low and high-high allocations (nearly all become low-low), but the important thing is that all the mixed allocations have been eliminated.

No test regressions.

Looks basically good to me, just a couple of cosmetic comments inline.

lib/Target/SystemZ/SystemZRegisterInfo.cpp
30	No, it isn't :-) Either the comment or the code is wrong.
120	I think it might simplify the code to just inline getHintRC_LOCRMux here ... it uses so many of the local variables defined here, and doesn't really serve any separate purpose, so it doesn't make much sense to have it as a separate function. In fact, maybe even getConstrainedRC32_LOCRMux should be inlined as well.
134	Maybe better "return TargetRegisterInfo:: ..." in case the default is ever changed to return hard hints in some cases.

Minor updates of SystemZ parts per review, but one point left under discussion.

lib/Target/SystemZ/SystemZRegisterInfo.cpp
30	Aah, the comment was rotten.
120	The reason for having the LOCRMux handling separate, is that I had in mind the other Mux cases as well, that we might want to investigate. So, it is better to have it inlined until we actually use more Mux instructions as opposed to have the code somewhat ready? Or is it better to inline even with the other instructions handlings added?
134	aah, yes.

Added test case. This at least fails on trunk and passes with this patch, even though I guess it may perhaps randomly pass on different revisions over time also without this patch...

Functions inlined per review.

patch impact on risb/locr instruction counts on SPEC as of latest performance measurement:

risblg         :                 8624                 8442     -182
locrhe         :                  926                  999      +73
risbhg         :                 2057                 1995      -62
locrl          :                 1626                 1658      +32
locrle         :                  453                  482      +29
locre          :                 1039                 1064      +25
...
Spill|Reload   :               165135               165218      +83

--stats:

1 systemz-II                   - Number of LOCRMux jump-sequences (lower is better)

(one LOCRMux not handled with patch, I think this is due to a regalloc eviction, which would be more complex to handle)

jonpa marked 3 inline comments as done. Nov 9 2017, 1:28 AM

See two minor inline comments. Otherwise, this now LGTM. Thanks!

lib/Target/SystemZ/SystemZRegisterInfo.cpp
38	Should this also use hasSubClassEq, just for symmetry with the case above?
103	Should we now pass Matrix also back to the default implementation?

jonpa added inline comments. Nov 9 2017, 7:20 AM

lib/Target/SystemZ/SystemZRegisterInfo.cpp
38	I take it that while GR32BitRegClass has the ADDR32Bit sub class, which GRH32BitRegClass does not, you prefer to keep this general for the future and so on?

uweigand added inline comments. Nov 9 2017, 7:39 AM

lib/Target/SystemZ/SystemZRegisterInfo.cpp
38	Yes, exactly ... just in case there will be subclasses later.
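The point about symmetry can be shown with a toy model of register classes. `ToyRegClass` is a hypothetical stand-in; only the idea of a subclass-or-equal query is taken from the discussion, and the ADDR32/GR32/GRH32 relationship follows the comment above.

```cpp
#include <cassert>
#include <set>

// Toy register class: a name plus the set of its proper subclasses.
struct ToyRegClass {
  const char *Name;
  std::set<const ToyRegClass *> SubClasses;

  // True if RC is this class or one of its subclasses.
  bool hasSubClassEq(const ToyRegClass *RC) const {
    return RC == this || SubClasses.count(RC) != 0;
  }
};

inline void exampleSubClassCheck() {
  ToyRegClass Addr32{"ADDR32", {}};
  ToyRegClass GR32{"GR32", {&Addr32}};
  ToyRegClass GRH32{"GRH32", {}};

  // An exact pointer comparison against GR32 would miss the subclass ...
  assert(&Addr32 != &GR32);
  // ... while the subclass-aware query accepts it.
  assert(GR32.hasSubClassEq(&Addr32));
  assert(GR32.hasSubClassEq(&GR32));
  // GRH32 has no subclasses in this model, so both forms agree for it today.
  assert(GRH32.hasSubClassEq(&GRH32) && !GRH32.hasSubClassEq(&Addr32));
}
```

Using the subclass-aware form for both classes costs nothing now and keeps the check correct if a subclass of the high-part class is ever added.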

Fixed per review.

lib/Target/SystemZ/SystemZRegisterInfo.cpp
38	Right. Fixed.
103	Oops. Fixed with NFC.

Thanks for review. Committed as r317879.

Revision Contents

Path                                            Size

include/llvm/Target/TargetRegisterInfo.h        5 lines
lib/CodeGen/AllocationOrder.h                   8 lines
lib/CodeGen/AllocationOrder.cpp                 5 lines
lib/CodeGen/TargetRegisterInfo.cpp              9 lines
lib/Target/ARM/ARMBaseRegisterInfo.h            2 lines
lib/Target/ARM/ARMBaseRegisterInfo.cpp          7 lines
lib/Target/SystemZ/SystemZInstrInfo.cpp         6 lines
lib/Target/SystemZ/SystemZRegisterInfo.h        7 lines
lib/Target/SystemZ/SystemZRegisterInfo.cpp      81 lines
test/CodeGen/SystemZ/cond-move-04.mir           75 lines

Diff 122400

include/llvm/Target/TargetRegisterInfo.h

public:
...
   /// Get the dimensions of register pressure impacted by this register unit.
   /// Returns a -1 terminated array of pressure set IDs.
   virtual const int *getRegUnitPressureSets(unsigned RegUnit) const = 0;

   /// Get a list of 'hint' registers that the register allocator should try
   /// first when allocating a physical register for the virtual register
   /// VirtReg. These registers are effectively moved to the front of the
-  /// allocation order.
+  /// allocation order. If true is returned, regalloc will try to only use
+  /// hints to the greatest extent possible even if it means spilling.
   ///
   /// The Order argument is the allocation order for VirtReg's register class
   /// as returned from RegisterClassInfo::getOrder(). The hint registers must
   /// come from Order, and they must not be reserved.
   ///
   /// The default implementation of this function can resolve
   /// target-independent hints provided to MRI::setRegAllocationHint with
   /// HintType == 0. Targets that override this function should defer to the
   /// default implementation if they have no reason to change the allocation
   /// order for VirtReg. There may be target-independent hints.
-  virtual void getRegAllocationHints(unsigned VirtReg,
+  virtual bool getRegAllocationHints(unsigned VirtReg,
                                      ArrayRef<MCPhysReg> Order,
                                      SmallVectorImpl<MCPhysReg> &Hints,
                                      const MachineFunction &MF,
                                      const VirtRegMap *VRM = nullptr,
                                      const LiveRegMatrix *Matrix = nullptr)
     const;

   /// A callback to allow target a chance to update register allocation hints
...

lib/CodeGen/AllocationOrder.h

	class VirtRegMap;			class VirtRegMap;
	class LiveRegMatrix;			class LiveRegMatrix;

	class LLVM_LIBRARY_VISIBILITY AllocationOrder {			class LLVM_LIBRARY_VISIBILITY AllocationOrder {
	SmallVector<MCPhysReg, 16> Hints;			SmallVector<MCPhysReg, 16> Hints;
	ArrayRef<MCPhysReg> Order;			ArrayRef<MCPhysReg> Order;
	int Pos;			int Pos;

				// If HardHints is true, only Hints will be returned.
				bool HardHints;

	public:			public:

	/// Create a new AllocationOrder for VirtReg.			/// Create a new AllocationOrder for VirtReg.
	/// @param VirtReg Virtual register to allocate for.			/// @param VirtReg Virtual register to allocate for.
	/// @param VRM Virtual register map for function.			/// @param VRM Virtual register map for function.
	/// @param RegClassInfo Information about reserved and allocatable registers.			/// @param RegClassInfo Information about reserved and allocatable registers.
	AllocationOrder(unsigned VirtReg,			AllocationOrder(unsigned VirtReg,
	const VirtRegMap &VRM,			const VirtRegMap &VRM,
	const RegisterClassInfo &RegClassInfo,			const RegisterClassInfo &RegClassInfo,
	const LiveRegMatrix *Matrix);			const LiveRegMatrix *Matrix);

	/// Get the allocation order without reordered hints.			/// Get the allocation order without reordered hints.
	ArrayRef<MCPhysReg> getOrder() const { return Order; }			ArrayRef<MCPhysReg> getOrder() const { return Order; }

	/// Return the next physical register in the allocation order, or 0.			/// Return the next physical register in the allocation order, or 0.
	/// It is safe to call next() again after it returned 0, it will keep			/// It is safe to call next() again after it returned 0, it will keep
	/// returning 0 until rewind() is called.			/// returning 0 until rewind() is called.
	unsigned next(unsigned Limit = 0) {			unsigned next(unsigned Limit = 0) {
	if (Pos < 0)			if (Pos < 0)
	return Hints.end()[Pos++];			return Hints.end()[Pos++];
				if (HardHints)
				return 0;
	if (!Limit)			if (!Limit)
	Limit = Order.size();			Limit = Order.size();
	while (Pos < int(Limit)) {			while (Pos < int(Limit)) {
	unsigned Reg = Order[Pos++];			unsigned Reg = Order[Pos++];
	if (!isHint(Reg))			if (!isHint(Reg))
	return Reg;			return Reg;
	}			}
	return 0;			return 0;
	}			}

	/// As next(), but allow duplicates to be returned, and stop before the			/// As next(), but allow duplicates to be returned, and stop before the
	/// Limit'th register in the RegisterClassInfo allocation order.			/// Limit'th register in the RegisterClassInfo allocation order.
	///			///
	/// This can produce more than Limit registers if there are hints.			/// This can produce more than Limit registers if there are hints.
	unsigned nextWithDups(unsigned Limit) {			unsigned nextWithDups(unsigned Limit) {
	if (Pos < 0)			if (Pos < 0)
	return Hints.end()[Pos++];			return Hints.end()[Pos++];
				if (HardHints)
				return 0;
	if (Pos < int(Limit))			if (Pos < int(Limit))
	return Order[Pos++];			return Order[Pos++];
	return 0;			return 0;
	}			}

	/// Start over from the beginning.			/// Start over from the beginning.
	void rewind() { Pos = -int(Hints.size()); }			void rewind() { Pos = -int(Hints.size()); }


lib/CodeGen/AllocationOrder.cpp


	#define DEBUG_TYPE "regalloc"			#define DEBUG_TYPE "regalloc"

	// Compare VirtRegMap::getRegAllocPref().			// Compare VirtRegMap::getRegAllocPref().
	AllocationOrder::AllocationOrder(unsigned VirtReg,			AllocationOrder::AllocationOrder(unsigned VirtReg,
	const VirtRegMap &VRM,			const VirtRegMap &VRM,
	const RegisterClassInfo &RegClassInfo,			const RegisterClassInfo &RegClassInfo,
	const LiveRegMatrix *Matrix)			const LiveRegMatrix *Matrix)
	: Pos(0) {			: Pos(0), HardHints(false) {
	const MachineFunction &MF = VRM.getMachineFunction();			const MachineFunction &MF = VRM.getMachineFunction();
	const TargetRegisterInfo *TRI = &VRM.getTargetRegInfo();			const TargetRegisterInfo *TRI = &VRM.getTargetRegInfo();
	Order = RegClassInfo.getOrder(MF.getRegInfo().getRegClass(VirtReg));			Order = RegClassInfo.getOrder(MF.getRegInfo().getRegClass(VirtReg));
	TRI->getRegAllocationHints(VirtReg, Order, Hints, MF, &VRM, Matrix);			if (TRI->getRegAllocationHints(VirtReg, Order, Hints, MF, &VRM, Matrix))
				HardHints = true;
	rewind();			rewind();

	DEBUG({			DEBUG({
	if (!Hints.empty()) {			if (!Hints.empty()) {
	dbgs() << "hints:";			dbgs() << "hints:";
	for (unsigned I = 0, E = Hints.size(); I != E; ++I)			for (unsigned I = 0, E = Hints.size(); I != E; ++I)
	dbgs() << ' ' << PrintReg(Hints[I], TRI);			dbgs() << ' ' << PrintReg(Hints[I], TRI);
	dbgs() << '\n';			dbgs() << '\n';
	}			}
	});			});
	#ifndef NDEBUG			#ifndef NDEBUG
	for (unsigned I = 0, E = Hints.size(); I != E; ++I)			for (unsigned I = 0, E = Hints.size(); I != E; ++I)
	assert(is_contained(Order, Hints[I]) &&			assert(is_contained(Order, Hints[I]) &&
	"Target hint is outside allocation order.");			"Target hint is outside allocation order.");
	#endif			#endif
	}			}
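
To illustrate the effect of the new HardHints flag on iteration, here is a minimal, self-contained sketch. It is not the LLVM class: AllocationOrderSketch and drain are hypothetical names, registers are modeled as plain unsigned ids with 0 meaning "no register", and the Limit parameter is left out. It only mirrors the next()/rewind() logic shown in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for AllocationOrder (not the LLVM class).
class AllocationOrderSketch {
  std::vector<unsigned> Hints;
  std::vector<unsigned> Order;
  int Pos;
  bool HardHints; // If true, only Hints will be returned.

  bool isHint(unsigned Reg) const {
    return std::find(Hints.begin(), Hints.end(), Reg) != Hints.end();
  }

public:
  AllocationOrderSketch(std::vector<unsigned> H, std::vector<unsigned> O,
                        bool Hard)
      : Hints(std::move(H)), Order(std::move(O)), Pos(0), HardHints(Hard) {
    rewind();
  }

  // Start over from the beginning; negative positions index the hints.
  void rewind() { Pos = -static_cast<int>(Hints.size()); }

  // Return the next register, or 0 when the order is exhausted.
  unsigned next() {
    if (Pos < 0)
      return Hints.end()[Pos++];
    if (HardHints)
      return 0; // Hard hints: never fall back to the rest of Order.
    while (Pos < static_cast<int>(Order.size())) {
      unsigned Reg = Order[Pos++];
      if (!isHint(Reg))
        return Reg;
    }
    return 0;
  }
};

// Hypothetical helper: collect everything next() yields until it returns 0.
std::vector<unsigned> drain(AllocationOrderSketch &AO) {
  std::vector<unsigned> Out;
  while (unsigned Reg = AO.next())
    Out.push_back(Reg);
  return Out;
}
```

With soft hints the remaining registers of Order follow the hints; with hard hints the sequence ends after the last hint, which is what may force the allocator to spill rather than pick a non-hinted register.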

lib/CodeGen/TargetRegisterInfo.cpp

bool TargetRegisterInfo::shouldRewriteCopySrc(const TargetRegisterClass *DefRC,
unsigned DefSubReg,		unsigned DefSubReg,
const TargetRegisterClass *SrcRC,		const TargetRegisterClass *SrcRC,
unsigned SrcSubReg) const {		unsigned SrcSubReg) const {
// If this source does not incur a cross register bank copy, use it.		// If this source does not incur a cross register bank copy, use it.
return shareSameRegisterFile(*this, DefRC, DefSubReg, SrcRC, SrcSubReg);		return shareSameRegisterFile(*this, DefRC, DefSubReg, SrcRC, SrcSubReg);
}		}

// Compute target-independent register allocator hints to help eliminate copies.		// Compute target-independent register allocator hints to help eliminate copies.
void		bool
TargetRegisterInfo::getRegAllocationHints(unsigned VirtReg,		TargetRegisterInfo::getRegAllocationHints(unsigned VirtReg,
ArrayRef<MCPhysReg> Order,		ArrayRef<MCPhysReg> Order,
SmallVectorImpl<MCPhysReg> &Hints,		SmallVectorImpl<MCPhysReg> &Hints,
const MachineFunction &MF,		const MachineFunction &MF,
const VirtRegMap *VRM,		const VirtRegMap *VRM,
const LiveRegMatrix *Matrix) const {		const LiveRegMatrix *Matrix) const {
const MachineRegisterInfo &MRI = MF.getRegInfo();		const MachineRegisterInfo &MRI = MF.getRegInfo();
std::pair<unsigned, unsigned> Hint = MRI.getRegAllocationHint(VirtReg);		std::pair<unsigned, unsigned> Hint = MRI.getRegAllocationHint(VirtReg);

// Hints with HintType != 0 were set by target-dependent code.		// Hints with HintType != 0 were set by target-dependent code.
// Such targets must provide their own implementation of		// Such targets must provide their own implementation of
// TRI::getRegAllocationHints to interpret those hint types.		// TRI::getRegAllocationHints to interpret those hint types.
assert(Hint.first == 0 && "Target must implement TRI::getRegAllocationHints");		assert(Hint.first == 0 && "Target must implement TRI::getRegAllocationHints");

// Target-independent hints are either a physical or a virtual register.		// Target-independent hints are either a physical or a virtual register.
unsigned Phys = Hint.second;		unsigned Phys = Hint.second;
if (VRM && isVirtualRegister(Phys))		if (VRM && isVirtualRegister(Phys))
Phys = VRM->getPhys(Phys);		Phys = VRM->getPhys(Phys);

// Check that Phys is a valid hint in VirtReg's register class.		// Check that Phys is a valid hint in VirtReg's register class.
if (!isPhysicalRegister(Phys))		if (!isPhysicalRegister(Phys))
return;		return false;
if (MRI.isReserved(Phys))		if (MRI.isReserved(Phys))
return;		return false;
// Check that Phys is in the allocation order. We shouldn't heed hints		// Check that Phys is in the allocation order. We shouldn't heed hints
// from VirtReg's register class if they aren't in the allocation order. The		// from VirtReg's register class if they aren't in the allocation order. The
// target probably has a reason for removing the register.		// target probably has a reason for removing the register.
if (!is_contained(Order, Phys))		if (!is_contained(Order, Phys))
return;		return false;

// All clear, tell the register allocator to prefer this register.		// All clear, tell the register allocator to prefer this register.
Hints.push_back(Phys);		Hints.push_back(Phys);
		return false;
}		}

bool TargetRegisterInfo::canRealignStack(const MachineFunction &MF) const {		bool TargetRegisterInfo::canRealignStack(const MachineFunction &MF) const {
return !MF.getFunction()->hasFnAttribute("no-realign-stack");		return !MF.getFunction()->hasFnAttribute("no-realign-stack");
}		}

bool TargetRegisterInfo::needsStackRealignment(		bool TargetRegisterInfo::needsStackRealignment(
const MachineFunction &MF) const {		const MachineFunction &MF) const {

lib/Target/ARM/ARMBaseRegisterInfo.h

public:

const TargetRegisterClass *		const TargetRegisterClass *
getLargestLegalSuperClass(const TargetRegisterClass *RC,		getLargestLegalSuperClass(const TargetRegisterClass *RC,
const MachineFunction &MF) const override;		const MachineFunction &MF) const override;

unsigned getRegPressureLimit(const TargetRegisterClass *RC,		unsigned getRegPressureLimit(const TargetRegisterClass *RC,
MachineFunction &MF) const override;		MachineFunction &MF) const override;

void getRegAllocationHints(unsigned VirtReg,		bool getRegAllocationHints(unsigned VirtReg,
ArrayRef<MCPhysReg> Order,		ArrayRef<MCPhysReg> Order,
SmallVectorImpl<MCPhysReg> &Hints,		SmallVectorImpl<MCPhysReg> &Hints,
const MachineFunction &MF,		const MachineFunction &MF,
const VirtRegMap *VRM,		const VirtRegMap *VRM,
const LiveRegMatrix *Matrix) const override;		const LiveRegMatrix *Matrix) const override;

void updateRegAllocHint(unsigned Reg, unsigned NewReg,		void updateRegAllocHint(unsigned Reg, unsigned NewReg,
MachineFunction &MF) const override;		MachineFunction &MF) const override;

lib/Target/ARM/ARMBaseRegisterInfo.cpp

	static unsigned getPairedGPR(unsigned Reg, bool Odd, const MCRegisterInfo *RI) {			static unsigned getPairedGPR(unsigned Reg, bool Odd, const MCRegisterInfo *RI) {
	for (MCSuperRegIterator Supers(Reg, RI); Supers.isValid(); ++Supers)			for (MCSuperRegIterator Supers(Reg, RI); Supers.isValid(); ++Supers)
	if (ARM::GPRPairRegClass.contains(*Supers))			if (ARM::GPRPairRegClass.contains(*Supers))
	return RI->getSubReg(*Supers, Odd ? ARM::gsub_1 : ARM::gsub_0);			return RI->getSubReg(*Supers, Odd ? ARM::gsub_1 : ARM::gsub_0);
	return 0;			return 0;
	}			}

	// Resolve the RegPairEven / RegPairOdd register allocator hints.			// Resolve the RegPairEven / RegPairOdd register allocator hints.
	void			bool
	ARMBaseRegisterInfo::getRegAllocationHints(unsigned VirtReg,			ARMBaseRegisterInfo::getRegAllocationHints(unsigned VirtReg,
	ArrayRef<MCPhysReg> Order,			ArrayRef<MCPhysReg> Order,
	SmallVectorImpl<MCPhysReg> &Hints,			SmallVectorImpl<MCPhysReg> &Hints,
	const MachineFunction &MF,			const MachineFunction &MF,
	const VirtRegMap *VRM,			const VirtRegMap *VRM,
	const LiveRegMatrix *Matrix) const {			const LiveRegMatrix *Matrix) const {
	const MachineRegisterInfo &MRI = MF.getRegInfo();			const MachineRegisterInfo &MRI = MF.getRegInfo();
	std::pair<unsigned, unsigned> Hint = MRI.getRegAllocationHint(VirtReg);			std::pair<unsigned, unsigned> Hint = MRI.getRegAllocationHint(VirtReg);

	unsigned Odd;			unsigned Odd;
	switch (Hint.first) {			switch (Hint.first) {
	case ARMRI::RegPairEven:			case ARMRI::RegPairEven:
	Odd = 0;			Odd = 0;
	break;			break;
	case ARMRI::RegPairOdd:			case ARMRI::RegPairOdd:
	Odd = 1;			Odd = 1;
	break;			break;
	default:			default:
	TargetRegisterInfo::getRegAllocationHints(VirtReg, Order, Hints, MF, VRM);			TargetRegisterInfo::getRegAllocationHints(VirtReg, Order, Hints, MF, VRM);
	return;			return false;
	}			}

	// This register should preferably be even (Odd == 0) or odd (Odd == 1).			// This register should preferably be even (Odd == 0) or odd (Odd == 1).
	// Check if the other part of the pair has already been assigned, and provide			// Check if the other part of the pair has already been assigned, and provide
	// the paired register as the first hint.			// the paired register as the first hint.
	unsigned Paired = Hint.second;			unsigned Paired = Hint.second;
	if (Paired == 0)			if (Paired == 0)
	return;			return false;

	unsigned PairedPhys = 0;			unsigned PairedPhys = 0;
	if (TargetRegisterInfo::isPhysicalRegister(Paired)) {			if (TargetRegisterInfo::isPhysicalRegister(Paired)) {
	PairedPhys = Paired;			PairedPhys = Paired;
	} else if (VRM && VRM->hasPhys(Paired)) {			} else if (VRM && VRM->hasPhys(Paired)) {
	PairedPhys = getPairedGPR(VRM->getPhys(Paired), Odd, this);			PairedPhys = getPairedGPR(VRM->getPhys(Paired), Odd, this);
	}			}

	// First prefer the paired physreg.			// First prefer the paired physreg.
	if (PairedPhys && is_contained(Order, PairedPhys))			if (PairedPhys && is_contained(Order, PairedPhys))
	Hints.push_back(PairedPhys);			Hints.push_back(PairedPhys);

	// Then prefer even or odd registers.			// Then prefer even or odd registers.
	for (unsigned Reg : Order) {			for (unsigned Reg : Order) {
	if (Reg == PairedPhys || (getEncodingValue(Reg) & 1) != Odd)			if (Reg == PairedPhys || (getEncodingValue(Reg) & 1) != Odd)
	continue;			continue;
	// Don't provide hints that are paired to a reserved register.			// Don't provide hints that are paired to a reserved register.
	unsigned Paired = getPairedGPR(Reg, !Odd, this);			unsigned Paired = getPairedGPR(Reg, !Odd, this);
	if (!Paired || MRI.isReserved(Paired))			if (!Paired || MRI.isReserved(Paired))
	continue;			continue;
	Hints.push_back(Reg);			Hints.push_back(Reg);
	}			}
				return false;
	}			}

	void			void
	ARMBaseRegisterInfo::updateRegAllocHint(unsigned Reg, unsigned NewReg,			ARMBaseRegisterInfo::updateRegAllocHint(unsigned Reg, unsigned NewReg,
	MachineFunction &MF) const {			MachineFunction &MF) const {
	MachineRegisterInfo *MRI = &MF.getRegInfo();			MachineRegisterInfo *MRI = &MF.getRegInfo();
	std::pair<unsigned, unsigned> Hint = MRI->getRegAllocationHint(Reg);			std::pair<unsigned, unsigned> Hint = MRI->getRegAllocationHint(Reg);
	if ((Hint.first == (unsigned)ARMRI::RegPairOdd ||			if ((Hint.first == (unsigned)ARMRI::RegPairOdd ||

lib/Target/SystemZ/SystemZInstrInfo.cpp

//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "SystemZInstrInfo.h"		#include "SystemZInstrInfo.h"
#include "MCTargetDesc/SystemZMCTargetDesc.h"		#include "MCTargetDesc/SystemZMCTargetDesc.h"
#include "SystemZ.h"		#include "SystemZ.h"
#include "SystemZInstrBuilder.h"		#include "SystemZInstrBuilder.h"
#include "SystemZSubtarget.h"		#include "SystemZSubtarget.h"
		#include "llvm/ADT/Statistic.h"
#include "llvm/CodeGen/LiveInterval.h"		#include "llvm/CodeGen/LiveInterval.h"
#include "llvm/CodeGen/LiveIntervalAnalysis.h"		#include "llvm/CodeGen/LiveIntervalAnalysis.h"
#include "llvm/CodeGen/LiveVariables.h"		#include "llvm/CodeGen/LiveVariables.h"
#include "llvm/CodeGen/MachineBasicBlock.h"		#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFrameInfo.h"		#include "llvm/CodeGen/MachineFrameInfo.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstr.h"		#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineMemOperand.h"		#include "llvm/CodeGen/MachineMemOperand.h"
#include <iterator>		#include <iterator>

using namespace llvm;		using namespace llvm;

#define GET_INSTRINFO_CTOR_DTOR		#define GET_INSTRINFO_CTOR_DTOR
#define GET_INSTRMAP_INFO		#define GET_INSTRMAP_INFO
#include "SystemZGenInstrInfo.inc"		#include "SystemZGenInstrInfo.inc"

		#define DEBUG_TYPE "systemz-II"
		STATISTIC(LOCRMuxJumps, "Number of LOCRMux jump-sequences (lower is better)");

// Return a mask with Count low bits set.		// Return a mask with Count low bits set.
static uint64_t allOnes(unsigned int Count) {		static uint64_t allOnes(unsigned int Count) {
return Count == 0 ? 0 : (uint64_t(1) << (Count - 1) << 1) - 1;		return Count == 0 ? 0 : (uint64_t(1) << (Count - 1) << 1) - 1;
}		}

// Reg should be a 32-bit GPR. Return true if it is a high register rather		// Reg should be a 32-bit GPR. Return true if it is a high register rather
// than a low register.		// than a low register.
static bool isHighReg(unsigned int Reg) {		static bool isHighReg(unsigned int Reg) {
void SystemZInstrInfo::expandLOCRPseudo(MachineInstr &MI, unsigned LowOpcode,
unsigned SrcReg = MI.getOperand(2).getReg();		unsigned SrcReg = MI.getOperand(2).getReg();
bool DestIsHigh = isHighReg(DestReg);		bool DestIsHigh = isHighReg(DestReg);
bool SrcIsHigh = isHighReg(SrcReg);		bool SrcIsHigh = isHighReg(SrcReg);

if (!DestIsHigh && !SrcIsHigh)		if (!DestIsHigh && !SrcIsHigh)
MI.setDesc(get(LowOpcode));		MI.setDesc(get(LowOpcode));
else if (DestIsHigh && SrcIsHigh)		else if (DestIsHigh && SrcIsHigh)
MI.setDesc(get(HighOpcode));		MI.setDesc(get(HighOpcode));
		else
		LOCRMuxJumps++;

// If we were unable to implement the pseudo with a single instruction, we		// If we were unable to implement the pseudo with a single instruction, we
// need to convert it back into a branch sequence. This cannot be done here		// need to convert it back into a branch sequence. This cannot be done here
// since the caller of expandPostRAPseudo does not handle changes to the CFG		// since the caller of expandPostRAPseudo does not handle changes to the CFG
// correctly. This change is deferred to the SystemZExpandPseudo pass.		// correctly. This change is deferred to the SystemZExpandPseudo pass.
}		}

// MI is an RR-style pseudo instruction that zero-extends the low Size bits		// MI is an RR-style pseudo instruction that zero-extends the low Size bits

lib/Target/SystemZ/SystemZRegisterInfo.h

public:
/// This is currently only used by LOAD_STACK_GUARD, which requires a non-%r0		/// This is currently only used by LOAD_STACK_GUARD, which requires a non-%r0
/// register, hence ADDR64.		/// register, hence ADDR64.
const TargetRegisterClass *		const TargetRegisterClass *
getPointerRegClass(const MachineFunction &MF,		getPointerRegClass(const MachineFunction &MF,
unsigned Kind=0) const override {		unsigned Kind=0) const override {
return &SystemZ::ADDR64BitRegClass;		return &SystemZ::ADDR64BitRegClass;
}		}

		bool getRegAllocationHints(unsigned VirtReg,
		ArrayRef<MCPhysReg> Order,
		SmallVectorImpl<MCPhysReg> &Hints,
		const MachineFunction &MF,
		const VirtRegMap *VRM,
		const LiveRegMatrix *Matrix) const override;

// Override TargetRegisterInfo.h.		// Override TargetRegisterInfo.h.
bool requiresRegisterScavenging(const MachineFunction &MF) const override {		bool requiresRegisterScavenging(const MachineFunction &MF) const override {
return true;		return true;
}		}
bool requiresFrameIndexScavenging(const MachineFunction &MF) const override {		bool requiresFrameIndexScavenging(const MachineFunction &MF) const override {
return true;		return true;
}		}
bool trackLivenessAfterRegAlloc(const MachineFunction &MF) const override {		bool trackLivenessAfterRegAlloc(const MachineFunction &MF) const override {

lib/Target/SystemZ/SystemZRegisterInfo.cpp

	//===-- SystemZRegisterInfo.cpp - SystemZ register information ------------===//			//===-- SystemZRegisterInfo.cpp - SystemZ register information ------------===//
	//			//
	// The LLVM Compiler Infrastructure			// The LLVM Compiler Infrastructure
	//			//
	// This file is distributed under the University of Illinois Open Source			// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.			// License. See LICENSE.TXT for details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "SystemZRegisterInfo.h"			#include "SystemZRegisterInfo.h"
	#include "SystemZInstrInfo.h"			#include "SystemZInstrInfo.h"
	#include "SystemZSubtarget.h"			#include "SystemZSubtarget.h"
	#include "llvm/CodeGen/LiveIntervalAnalysis.h"			#include "llvm/CodeGen/LiveIntervalAnalysis.h"
				#include "llvm/ADT/SmallSet.h"
	#include "llvm/CodeGen/MachineInstrBuilder.h"			#include "llvm/CodeGen/MachineInstrBuilder.h"
	#include "llvm/CodeGen/MachineRegisterInfo.h"			#include "llvm/CodeGen/MachineRegisterInfo.h"
	#include "llvm/CodeGen/TargetFrameLowering.h"			#include "llvm/CodeGen/TargetFrameLowering.h"
				#include "llvm/CodeGen/VirtRegMap.h"

	using namespace llvm;			using namespace llvm;

	#define GET_REGINFO_TARGET_DESC			#define GET_REGINFO_TARGET_DESC
	#include "SystemZGenRegisterInfo.inc"			#include "SystemZGenRegisterInfo.inc"

	SystemZRegisterInfo::SystemZRegisterInfo()			SystemZRegisterInfo::SystemZRegisterInfo()
	: SystemZGenRegisterInfo(SystemZ::R14D) {}			: SystemZGenRegisterInfo(SystemZ::R14D) {}

				// Given that MO is a GRX32 operand, return either GR32 or GRH32 if MO
				// somehow belongs in it. Otherwise, return GRX32.
				static const TargetRegisterClass *getRC32(MachineOperand &MO,
				uweigand: No, it isn't :-) Either the comment or the code is wrong.
				jonpa: Aah, the comment was rotten.
				const VirtRegMap *VRM,
				const MachineRegisterInfo *MRI) {
				const TargetRegisterClass *RC = MRI->getRegClass(MO.getReg());

				if (SystemZ::GR32BitRegClass.hasSubClassEq(RC) ||
				MO.getSubReg() == SystemZ::subreg_l32)
				return &SystemZ::GR32BitRegClass;
				if (SystemZ::GRH32BitRegClass.hasSubClassEq(RC) ||
				uweigand: Should this also use hasSubClassEq, just for symmetry with the case above?
				jonpa: I take it that while GR32BitRegClass has the ADDR32Bit sub class, which GRH32BitRegClass does not, you prefer to keep this general for the future and so on?
				uweigand: Yes, exactly ... just in case there will be subclasses later.
				jonpa: Right. Fixed.
				MO.getSubReg() == SystemZ::subreg_h32)
				return &SystemZ::GRH32BitRegClass;

				if (VRM && VRM->hasPhys(MO.getReg())) {
				unsigned PhysReg = VRM->getPhys(MO.getReg());
				if (SystemZ::GR32BitRegClass.contains(PhysReg))
				return &SystemZ::GR32BitRegClass;
				assert (SystemZ::GRH32BitRegClass.contains(PhysReg) &&
				"Phys reg not in GR32 or GRH32?");
				return &SystemZ::GRH32BitRegClass;
				}

				assert (RC == &SystemZ::GRX32BitRegClass);
				return RC;
				}

				bool
				SystemZRegisterInfo::getRegAllocationHints(unsigned VirtReg,
				ArrayRef<MCPhysReg> Order,
				SmallVectorImpl<MCPhysReg> &Hints,
				const MachineFunction &MF,
				const VirtRegMap *VRM,
				const LiveRegMatrix *Matrix) const {
				const MachineRegisterInfo *MRI = &MF.getRegInfo();
				const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
				if (MRI->getRegClass(VirtReg) == &SystemZ::GRX32BitRegClass) {
				SmallVector<unsigned, 8> Worklist;
				SmallSet<unsigned, 4> DoneRegs;
				Worklist.push_back(VirtReg);
				while (Worklist.size()) {
				unsigned Reg = Worklist.pop_back_val();
				if (!DoneRegs.insert(Reg).second)
				continue;

				for (auto &Use : MRI->use_instructions(Reg))
				// For LOCRMux, see if the other operand is already a high or low
				// register, and in that case give the corresponding hints for
				// VirtReg. LOCR instructions need both operands in either high or
				// low parts.
				if (Use.getOpcode() == SystemZ::LOCRMux) {
				MachineOperand &TrueMO = Use.getOperand(1);
				MachineOperand &FalseMO = Use.getOperand(2);
				const TargetRegisterClass *RC =
				TRI->getCommonSubClass(getRC32(FalseMO, VRM, MRI),
				getRC32(TrueMO, VRM, MRI));
				if (RC && RC != &SystemZ::GRX32BitRegClass) {
				for (MCPhysReg Reg : Order)
				if (RC->contains(Reg) && !MRI->isReserved(Reg))
				Hints.push_back(Reg);
				// Return true to make these hints the only regs available to
				// RA. This may mean extra spilling but since the alternative is
				// a jump sequence expansion of the LOCRMux, it is preferred.
				return true;
				}

				// Add the other operand of the LOCRMux to the worklist.
				unsigned OtherReg =
				(TrueMO.getReg() == Reg ? FalseMO.getReg() : TrueMO.getReg());
				if (MRI->getRegClass(OtherReg) == &SystemZ::GRX32BitRegClass)
				Worklist.push_back(OtherReg);
				}
				}
				}

				return TargetRegisterInfo::getRegAllocationHints(VirtReg, Order, Hints, MF,
				uweigand: Should we now pass Matrix also back to the default implementation?
				jonpa: Oops. Fixed with NFC.
				VRM, Matrix);
				}

	const MCPhysReg *			const MCPhysReg *
	SystemZRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {			SystemZRegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
	if (MF->getSubtarget().getTargetLowering()->supportSwiftError() &&			if (MF->getSubtarget().getTargetLowering()->supportSwiftError() &&
	MF->getFunction()->getAttributes().hasAttrSomewhere(			MF->getFunction()->getAttributes().hasAttrSomewhere(
	Attribute::SwiftError))			Attribute::SwiftError))
	return CSR_SystemZ_SwiftError_SaveList;			return CSR_SystemZ_SwiftError_SaveList;
	return CSR_SystemZ_SaveList;			return CSR_SystemZ_SaveList;
	}			}

	const uint32_t *			const uint32_t *
	SystemZRegisterInfo::getCallPreservedMask(const MachineFunction &MF,			SystemZRegisterInfo::getCallPreservedMask(const MachineFunction &MF,
	CallingConv::ID CC) const {			CallingConv::ID CC) const {
	if (MF.getSubtarget().getTargetLowering()->supportSwiftError() &&			if (MF.getSubtarget().getTargetLowering()->supportSwiftError() &&
	MF.getFunction()->getAttributes().hasAttrSomewhere(			MF.getFunction()->getAttributes().hasAttrSomewhere(
				uweigand: I think it might simplify the code to just inline getHintRC_LOCRMux here ... it uses so many of the local variables defined here, and doesn't really serve any separate purpose, so it doesn't make much sense to have it as a separate function. In fact, maybe even getConstrainedRC32_LOCRMux should be inlined as well.
				jonpa: The reason for having the LOCRMux handling separate, is that I had in mind the other Mux cases as well, that we might want to investigate. So, it is better to have it inlined until we actually use more Mux instructions as opposed to have the code somewhat ready? Or is it better to inline even with the other instructions handlings added?
	Attribute::SwiftError))			Attribute::SwiftError))
	return CSR_SystemZ_SwiftError_RegMask;			return CSR_SystemZ_SwiftError_RegMask;
	return CSR_SystemZ_RegMask;			return CSR_SystemZ_RegMask;
	}			}

	BitVector			BitVector
	SystemZRegisterInfo::getReservedRegs(const MachineFunction &MF) const {			SystemZRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
	BitVector Reserved(getNumRegs());			BitVector Reserved(getNumRegs());
	const SystemZFrameLowering *TFI = getFrameLowering(MF);			const SystemZFrameLowering *TFI = getFrameLowering(MF);

	if (TFI->hasFP(MF)) {			if (TFI->hasFP(MF)) {
	// R11D is the frame pointer. Reserve all aliases.			// R11D is the frame pointer. Reserve all aliases.
	Reserved.set(SystemZ::R11D);			Reserved.set(SystemZ::R11D);
	Reserved.set(SystemZ::R11L);			Reserved.set(SystemZ::R11L);
				uweigand: Maybe better "return TargetRegisterInfo:: ..." in case the default is ever changed to return hard hints in some cases.
				jonpa: aah, yes.
	Reserved.set(SystemZ::R11H);			Reserved.set(SystemZ::R11H);
	Reserved.set(SystemZ::R10Q);			Reserved.set(SystemZ::R10Q);
	}			}

	// R15D is the stack pointer. Reserve all aliases.			// R15D is the stack pointer. Reserve all aliases.
	Reserved.set(SystemZ::R15D);			Reserved.set(SystemZ::R15D);
	Reserved.set(SystemZ::R15L);			Reserved.set(SystemZ::R15L);
	Reserved.set(SystemZ::R15H);			Reserved.set(SystemZ::R15H);

test/CodeGen/SystemZ/cond-move-04.mir

This file was added.

				# RUN: llc -mtriple=s390x-linux-gnu -mcpu=z13 -start-before=greedy %s -o - \
				# RUN: | FileCheck %s
				#
				# Test that regalloc manages (via regalloc hints) to avoid a LOCRMux jump
				# sequence expansion.

				--- |

				declare i8* @foo(i8*, i32 signext, i32 signext) local_unnamed_addr

				define i8* @fun(i8* returned) {
				br label %2

				; <label>:2: ; preds = %6, %1
				%3 = zext i16 undef to i32
				switch i32 %3, label %4 [
				i32 15, label %6
				i32 125, label %5
				]

				; <label>:4: ; preds = %2
				br label %6

				; <label>:5: ; preds = %2
				br label %6

				; <label>:6: ; preds = %5, %4, %2
				%7 = phi i32 [ 4, %2 ], [ undef, %4 ], [ 10, %5 ]
				%8 = call i8* @foo(i8* undef, i32 signext undef, i32 signext %7)
				br label %2
				}

				...

				# CHECK: locr
				# CHECK-NOT: risblg

				---
				name: fun
				alignment: 2
				tracksRegLiveness: true
				registers:
				- { id: 0, class: gr32bit }
				- { id: 1, class: gr64bit }
				- { id: 2, class: grx32bit }
				- { id: 3, class: grx32bit }
				- { id: 4, class: grx32bit }
				- { id: 5, class: grx32bit }
				- { id: 6, class: grx32bit }
				- { id: 7, class: gr64bit }
				- { id: 8, class: gr64bit }
				- { id: 9, class: gr64bit }
				- { id: 10, class: gr64bit }
				- { id: 11, class: gr32bit }
				frameInfo:
				hasCalls: true
				body: |
				bb.0 (%ir-block.1):
				%3 = LHIMux 0
				%2 = LHIMux 4
				%5 = LHIMux 10

				bb.1 (%ir-block.2):
				CHIMux %3, 0, implicit-def %cc
				%0 = LOCRMux undef %0, %5, 14, 6, implicit %cc
				%0 = LOCRMux %0, %2, 14, 6, implicit killed %cc
				ADJCALLSTACKDOWN 0, 0
				%7 = LGFR %0
				%r3d = LGHI 0
				%r4d = COPY %7
				CallBRASL @foo, undef %r2d, killed %r3d, killed %r4d, csr_systemz, implicit-def dead %r14d, implicit-def dead %cc, implicit-def dead %r2d
				ADJCALLSTACKUP 0, 0
				J %bb.1

				...
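
The worklist walk in SystemZRegisterInfo::getRegAllocationHints can be hard to follow from the diff alone, so here is a rough, self-contained model of it. Everything in it is an assumption made for illustration: Half, Select, and findHalf are hypothetical names, a LOCRMux is reduced to its two register operands, and the Known map stands in for what getRC32 derives from register classes, subregister indices, and the VirtRegMap. It is a sketch of the idea, not the patch's code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Which 32-bit half a register is known to live in (hypothetical model).
enum class Half { Either, Low, High };

// Hypothetical model of a LOCRMux: just its two register operands.
struct Select {
  unsigned TrueReg;
  unsigned FalseReg;
};

// Follow chains of selects starting at VirtReg. Return the half that some
// reachable select is already constrained to, or Half::Either if none is.
Half findHalf(unsigned VirtReg, const std::vector<Select> &Selects,
              const std::map<unsigned, Half> &Known) {
  auto halfOf = [&Known](unsigned R) {
    auto It = Known.find(R);
    return It == Known.end() ? Half::Either : It->second;
  };

  std::vector<unsigned> Worklist{VirtReg};
  std::set<unsigned> Done;
  while (!Worklist.empty()) {
    unsigned Reg = Worklist.back();
    Worklist.pop_back();
    if (!Done.insert(Reg).second)
      continue; // Already visited.

    for (const Select &S : Selects) {
      if (S.TrueReg != Reg && S.FalseReg != Reg)
        continue;
      Half T = halfOf(S.TrueReg), F = halfOf(S.FalseReg);
      // Both operands must end up in the same half, so one constrained
      // operand (or two that agree) decides the hint.
      if (T == F && T != Half::Either)
        return T;
      if (T == Half::Either && F != Half::Either)
        return F;
      if (F == Half::Either && T != Half::Either)
        return T;
      // No decision here: keep searching through the other operand if it
      // is itself still unconstrained.
      unsigned Other = (S.TrueReg == Reg) ? S.FalseReg : S.TrueReg;
      if (halfOf(Other) == Half::Either)
        Worklist.push_back(Other);
    }
  }
  return Half::Either;
}
```

In the patch, a decided class is turned into hints by pushing every non-reserved register of that class from Order, and the function then returns true so that those hints become hard hints.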