This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Introduce a simple instruction selector.
ClosedPublic

Authored by ab on Jul 14 2016, 11:19 AM.

Details

Summary
This is hopefully mostly straightforward, with various open questions
peppered around as FIXMEs.

This also includes a couple changes to MRI (around constrainRegClass and
gvreg sizes).  I'm not sure any of them is a good idea, but beyond that,
they cause anything that currently links in CodeGen to fail to link
(because GlobalISel is a separate library).
I'm not sure if this is a layering violation, or whether we should force
users to also link in GlobalISel (if they want to use GlobalISel).
Probably the former, in which case: any ideas where to put the changes?

Also, I don't think it's currently possibly to test the emission of
copies of defs without doing the fixup on non-generic instructions
(see FIXME).  Ideas welcome, otherwise I can just remove the code
for now.

Diff Detail

Repository
rL LLVM

Event Timeline

ab updated this revision to Diff 64013.Jul 14 2016, 11:19 AM
ab retitled this revision from to [GlobalISel] Introduce a simple instruction selector..
ab updated this object.
ab added reviewers: qcolombet, t.p.northover.
ab added subscribers: arsenm, eli.friedman, MatzeB, llvm-commits.

I just looked at the MachineRegisterInfo parts. This is also a highlevel comment, not necessarily about this patch (which just continues the current direction): It seems to me like generic regs and "normal" vregs don't have that much in common, they mostly share the same kind of MachineOperand and the same def/use lists. Apart from that they contain different information (register bank + size vs. register class). Wouldn't it make sense to introduce a new MachineOperand kind for them and separate them from "normal" vregs?

lib/CodeGen/GlobalISel/InstructionSelect.cpp
38 ↗(On Diff #64013)

Twine Message?

arsenm added inline comments.Jul 14 2016, 11:48 AM
lib/CodeGen/GlobalISel/InstructionSelect.cpp
22 ↗(On Diff #64013)

instruction-select

27 ↗(On Diff #64013)

Use DEBUG_TYPE macro

27–32 ↗(On Diff #64013)

You can use INITIALIZE_PASS instead of repeating BEGIN/END

43–44 ↗(On Diff #64013)

Single quotes around '\n'

71–72 ↗(On Diff #64013)

Braces

lib/CodeGen/MachineRegisterInfo.cpp
47 ↗(On Diff #64013)

1 bit condition registers?

lib/Target/AArch64/AArch64InstructionSelector.cpp
88 ↗(On Diff #64013)

I don't think targets should need to have their own DEBUG here

157 ↗(On Diff #64013)

Single quotes around '\n'

echristo added inline comments.
include/llvm/Target/TargetSubtargetInfo.h
93–95 ↗(On Diff #64013)

Do you really intend for this to be a subtarget feature?

lib/CodeGen/MachineRegisterInfo.cpp
42 ↗(On Diff #64013)

These ifdefs look like they should be made a bit more generic in some way?

142–145 ↗(On Diff #64013)

Sadly, don't do this with commented out code. Perhaps something to fix first?

lib/Target/AArch64/AArch64InstructionSelector.cpp
103 ↗(On Diff #64013)

These seem a bit ... overly assert-y.

test/CodeGen/AArch64/GlobalISel/arm64-instructionselect.mir
3 ↗(On Diff #64013)

Comment this a bit more please, what you're checking for etc.

t.p.northover added inline comments.Jul 14 2016, 3:33 PM
lib/Target/AArch64/AArch64InstructionSelector.cpp
135 ↗(On Diff #64013)

This can probably be based on I.getType() rather than singling out some particular register.

143 ↗(On Diff #64013)

From here on seems like it would be better off outside the switch. It applies to the vast majority of instructions.

155 ↗(On Diff #64013)

Do we have a policy on implicit ops on generic MachineInstrs? I'm guessing we don't expect there to be any, but if not we might want getNumExplicitOperands instead.

dberris added inline comments.Jul 15 2016, 8:37 AM
include/llvm/CodeGen/GlobalISel/InstructionSelector.h
32–35 ↗(On Diff #64013)

Is a boolean sufficient for conveying the failure scenario for selection?

How do specific instruction selections convey the nuance of the failure? Could there be multiple passes on the same function caused by a failure to select -- maybe because there's not enough information yet?

For example, if an instruction is ignored by the selector, or replaced by a no-op, does it have to return failure? What will that cause the InstructionSelect pass to do? I'm not sure if that documentation has to happen in this interface or in the InstructionSelect pass instead.

lib/CodeGen/GlobalISel/InstructionSelect.cpp
62–63 ↗(On Diff #64013)

Somewhat related to my question above on the InstructionSelector interface -- maybe we should delegate the selection error messaging to the implementation?

lib/CodeGen/GlobalISel/InstructionSelect.cpp
61 ↗(On Diff #64013)

Won't this will fail if the target decides to delete the instruction? I think targets should be allowed to delete the instruction, because updating MachineIntr in place is very tedious especially if you need to re-order the operands.

lib/CodeGen/GlobalISel/InstructionSelect.cpp
61 ↗(On Diff #64013)

Nevermind, this looks OK.

ab updated this revision to Diff 64541.Jul 19 2016, 12:22 PM
ab edited edge metadata.
ab marked 13 inline comments as done.
  • Extract gvreg parts of constrainRegClass into a wrapper in RegisterBankInfo (this also avoids the CodeGen->GISel layering problem)
  • Add a short explanation for the added tests
  • Various cleanups
ab added a comment.Jul 19 2016, 12:25 PM

I just looked at the MachineRegisterInfo parts. This is also a highlevel comment, not necessarily about this patch (which just continues the current direction): It seems to me like generic regs and "normal" vregs don't have that much in common, they mostly share the same kind of MachineOperand and the same def/use lists. Apart from that they contain different information (register bank + size vs. register class). Wouldn't it make sense to introduce a new MachineOperand kind for them and separate them from "normal" vregs?

I agree; I had a few ideas that I wrote down in the MRI FIXME. The one that I currently prefer is to have a type for gvregs, a different one for vregs (so, no 'unsigned' anymore), and overload the relevant parts of MRI/... based on that. WDYT?

include/llvm/CodeGen/GlobalISel/InstructionSelector.h
32–35 ↗(On Diff #64013)

The way I see it, the contract is very simple: after select(I) succeeds, 'I' cannot be generic anymore, and we cannot have introduced other generic instructions. Anything else is a failure from which we cannot recover.

So, specifically, if a non-generic instruction is ignored or deleted, all is well. But a generic instruction cannot be ignored, only deleted. Does that make sense?

If you're imagining things like:

%a = foo
%b = bar %a

and (ignoring ordering) failing to select %a, but being able to select %b: I don't think that's acceptable, the assumption being that every instruction at selection time is selectable.

We talked about introducing some sort of verifier for this, but I don't have a full answer to that yet.

include/llvm/Target/TargetSubtargetInfo.h
93–95 ↗(On Diff #64013)

No, and I don't think the other GISel passes should be either, only the *Info/Selector classes should. Do you agree?

lib/CodeGen/GlobalISel/InstructionSelect.cpp
62–63 ↗(On Diff #64013)

Hmm, I figured that would pretty much duplicate the exact same error message in each implementation. Do you envision useful differences?

lib/Target/AArch64/AArch64InstructionSelector.cpp
103 ↗(On Diff #64013)

Isn't it? I changed it to pass those when constructing the selector in the subtarget; there's more awkwardness involved there (RBI vs Subtarget), would appreciate another look!

143 ↗(On Diff #64013)

My gut feeling was to avoid generalizing for now, and make sure that better abstractions emerge as we support more instructions.

But for this specific piece of logic, I agree it's probably going to be very common. I extract it out into a "fixup operands" helper, let me know what you think.

155 ↗(On Diff #64013)

I assumed we don't get implicit ops, but it's good that you mention that: I added an assert for now; we can revisit if we end up with a good reason to use them.

ab marked an inline comment as done.Jul 19 2016, 12:58 PM
ab added inline comments.
lib/CodeGen/MachineRegisterInfo.cpp
138–141 ↗(On Diff #64541)

Got around to investigating; fixed (and added the assert) in r276012.

dberris added inline comments.Jul 19 2016, 9:50 PM
include/llvm/CodeGen/GlobalISel/InstructionSelector.h
33–36 ↗(On Diff #64541)

I'm thinking about this from the xray point of view, so let me expound a little.

Consider when for example we see a pseudo instruction like: PATCHABLE_FUNCTION_ENTER or PATCHABLE_RET -- right now it's selected on a per-platform basis when lowering. If we were to change this to become generic instructions instead (i.e. not pseudo instructions), we can then select the appropriate lowering by default (say, for PATCHABLE_FUNCTION_ENTER it would just disappear, and PATCHABLE_RET will just unpack the machine opcodes it wraps), and specialise on a per-platform basis.

We're thinking of cases where say we have other instructions that we'd like to conditionally prepend or append with the appropriate sleds, without having to explicitly mark them at a higher level too (for example, looking at tail exits, sibling exits, or even before/after calls to a whitelist of functions).

Maybe the question is whether XRay sled emission should happen at selection time, or stay in lowering as it is today?

lib/CodeGen/GlobalISel/InstructionSelect.cpp
63–64 ↗(On Diff #64541)

I think adding more nuance to it might be useful -- say, instead of just 'cannot select an instruction' it could be that there might be cases that the instruction was malformed (some operands were outside some range, or some invariant was violated -- i.e. this should only appear once in a function but it appears multiple times). Maybe that's a legalisation issue though and the error might be better there. Maybe it'd be fine to extend later when we have more concrete cases where it matters.

ab updated this revision to Diff 64915.Jul 21 2016, 9:50 AM
ab marked an inline comment as done.
  • Rebase onto LLT
  • Fail instead of asserting on currently unsupported constructs
  • Enforce a few additional constraints (in particular: generic ops can only have vreg operands, that have to be on the same bank)
  • Don't constrain with COPYs (as we can't test this right now)
ab added inline comments.Jul 21 2016, 10:08 AM
include/llvm/CodeGen/GlobalISel/InstructionSelector.h
34–37 ↗(On Diff #64915)

The XRay perspective is an interesting one. Have you considered treating this as a legalization problem? So, you insert your generic instructions at IR->MI translation time; the target declares how it wants to handle them. If they're not legal, do your common, target-independent, lowering. Would that make sense?

In general, I'd rather not have selection be "tentative", with fallbacks and whatnot. I think one shared goal is to be able to definitively tell whether a given instruction will be selectable or not, without actually trying to select it. The hope is to then be able to statically figure out the constructs that "cannot select".

include/llvm/Target/TargetSubtargetInfo.h
93–95 ↗(On Diff #64915)

Sorry, I misread when I first answered. Let me try again ;)

Yes, the InstructionSelector is intentionally specific to a subtarget. Right now (especially for AArch64), there's not much to specialize, but I'm hoping that'll let us separate out feature-specific code (so instead of a monster TLI, we could maybe have a hierarchy, say avx512>avx>sse, if that makes sense?)

lib/CodeGen/GlobalISel/InstructionSelect.cpp
64–65 ↗(On Diff #64915)

That's an interesting point; first, do you agree that any of these invariants being violated is a bug? (be it in the legalizer, or in the IR translator, or ...)

The general idea is: an extended machine verifier - specialized for each phase, via MF properties - would enforce these invariant. Anything coming into the selector has to be selectable. It's the responsibility of the legalizer (and target-specific transforms, if applicable) to create well-formed instructions by then.

qcolombet edited edge metadata.Jul 22 2016, 9:13 AM

Hi Ahmed,

Looks mostly good to me. Couple of inlined comment. Mainly, I believe we can avoid the ifdef in MRI::setRegClass and the getInstructionSelect may live somewhere else and take an additional parameter.

Cheers,
-Quentin

include/llvm/CodeGen/GlobalISel/InstructionSelector.h
35 ↗(On Diff #64915)

More accurately, I has been replaced by a sequence of instructions where this condition applies.
But I can live with that shortcut :).

include/llvm/CodeGen/TargetPassConfig.h
229 ↗(On Diff #64915)

Add a comment for that method.
Also add an additional hook for inject PreGlobalInstructionSelect passes. (Could be a follow-up patch.)

include/llvm/Target/TargetSubtargetInfo.h
93–95 ↗(On Diff #64915)

FWIW, that’s how I envisioned the thing, that being said, maybe we should have a different interface elsewhere, like getInstructionSelector(MF)?
The idea would be to have the avx512>avx etc. distinction but also things that depends on the function attribute like optimize for code size. The MF allows us to access the function attributes and the sub target.

lib/CodeGen/GlobalISel/InstructionSelect.cpp
51 ↗(On Diff #64915)

Should this fixme appear in the IRTranslator instead?

52 ↗(On Diff #64915)

Could you assert on an unchanged number of block within the loop or something, so that we know when we have to properly handle the case?

69 ↗(On Diff #64915)

Agreed on all the FIXMEs, plus we should set a property after-isel on the function. That’s how we will selectively enable the new checks.

lib/CodeGen/MachineRegisterInfo.cpp
60 ↗(On Diff #64915)

I would just clear the whole map at the end of the select pass. Will avoid the ifdef and the cost for the extract logic.

ab updated this revision to Diff 65370.Jul 25 2016, 10:22 AM
ab edited edge metadata.
ab marked 6 inline comments as done.

Rebase and address Quentin's review comments:

  • clear all vreg sizes once, at the end of isel; assert that the class agrees with that size
  • expand on some comments
  • add TargetPassConfig::addPreGlobalInstructionSelect callback
include/llvm/Target/TargetSubtargetInfo.h
93–95 ↗(On Diff #64915)

You're right, I had not considered specializing on optsize. I'll have to think about the right solution for this. In the meantime, is yet another FIXME palatable? ;)

lib/CodeGen/GlobalISel/InstructionSelect.cpp
51 ↗(On Diff #64915)

I'm not sure; I figured, select being the final step, anything that doesn't belong elsewhere ought to be set here.

52 ↗(On Diff #64915)

Good idea; added an assert.

lib/CodeGen/MachineRegisterInfo.cpp
60 ↗(On Diff #64915)

Eh, I'm no fan of either approach, but clearing once is certainly more efficient.

But really, I'm not sure the separate map is the best solution to begin with; it seems redundant and brittle. What do you think of querying the MI for the def size? (or maybe invert that relationship, and keep the type with the vreg info in the first place?).

qcolombet added inline comments.Jul 25 2016, 1:41 PM
lib/CodeGen/MachineRegisterInfo.cpp
60 ↗(On Diff #65370)

Actually what I had in mind was to get the size from the type with, like you said, querying the MI for the def size (via getUniqueDef). However, we would still need to access this information from the vreg.
As having the type on the vreg, no, that does not work. Vregs can change type based on their uses (like and s64, add <2 x s32>).

But yeah, long term, this code shouldn't be needed. Add a comment about that?

ab added inline comments.Jul 25 2016, 1:50 PM
lib/CodeGen/MachineRegisterInfo.cpp
60 ↗(On Diff #65370)

Actually what I had in mind was to get the size from the type with, like you said, querying the MI for the def size (via getUniqueDef). However, we would still need to access this information from the vreg.
As having the type on the vreg, no, that does not work. Vregs can change type based on their uses (like and s64, add <2 x s32>).

Right; I suppose you could see a vreg as having an "initial" type, the one on the def. The type(s) of an MI is then just the type(s) of its def(s). I admit it's also somewhat awkward, but at least avoids duplicating the info in a separate, seldom accessed, data structure. Thoughts?

But yeah, long term, this code shouldn't be needed. Add a comment about that?

SGTM, will do. While I'm there, anything else?

SGTM, will do. While I'm there, anything else?

I'd move constrainSelectedInstRegOperands in the based class (as a static) to have it available for all the targets.

Other than that LGTM, but please address the other comments first.

Cheers,
-Quentin

lib/CodeGen/MachineRegisterInfo.cpp
60 ↗(On Diff #65370)

at least avoids duplicating the info in a separate [...] data structure.

Right now, yes, this is duplicated, but with what I had in mind, this is an indirect access: vreg -> getDef -> getType -> getSize. That's not pretty, but that works.

The type(s) of an MI is then just the type(s) of its def(s)

Honestly, the main reason why I discarded this design was because I thought it would have been confusing to ask for the type of some operand and get something not related to the operand. More over, the concept of having typed register is strange to me.
The bottom line is that it may indeed, be more efficient (fewer indirections), but in terms of concept I don't like it. If we manage to spin that in a reasonable manner, that would work.

Anyhow, this is something we need to clean-up at some point (do we actually need the size in the end?), but for now, let us stick on that.

This revision was automatically updated to reflect the committed changes.
ab added a comment.Jul 27 2016, 7:55 AM

Thanks all for the reviews! Committed r276875.

I'd move constrainSelectedInstRegOperands in the based class (as a static) to have it available for all the targets.

Done.

Other than that LGTM, but please address the other comments first.

Thanks, let me know if I missed anything. We'll still need to discuss the vreg size/type and subtarget/MF issues, but I don't think those are specific to any individual pass.