Page MenuHomePhabricator

TargetMachine: Use LLVMTargetMachine in CodeGen
AbandonedPublic

Authored by MatzeB on Oct 2 2017, 4:01 PM.

Details

Summary

While preparing a talk about CodeGen I looked at the TargetMachine/LLVMTargetMachine classes; reading the doxygen comments it seems that the intention is that TargetMachine is a general interface to be used by all targets, while LLVMTargetMachine is a specialization for targets using code from lib/CodeGen.

In practice some methods that were only useful for lib/CodeGen based targets sneaked into TargetMachine anyway. The problem as it turns out is that most of the time we only had a reference to TargetMachine instead of LLVMTargetMachine at hand.

This commit:

  • Changes most reference in CodeGen from TargetMachine to LLVMTargetMachine
  • Moves some methods from TargetMachine to LLVMTargetMachine: getSubtarget, targetSchedulesPostRAScheduling, usesPhysRegsForPEI, useIPRA
  • Add two new methods into TargetMachine for the two cases where non-CodeGen code was directly accessing CodeGen interfaces: useAA(), getTargetLowering()

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB created this revision.Oct 2 2017, 4:01 PM
chandlerc edited edge metadata.Oct 2 2017, 5:57 PM

I generally like the direction of the cleanup. A few high level questions:

  1. You say that this is to distinguish targets that *do* use LLVM's lib/CodeGen from those that do not. Do you mean entire targets that do not? Or do you mean just *components* of targets that do not depend on it like the target's MC layer?
  1. If in #1 you mean actual targets -- are there any left now that the C backend is gone? What are these targets? I wonder if we could simplify things by collapsing a no longer necessary layer.
  1. If we can't eliminate the distinction here completely, are there better names than TargetMachine and LLVMTargetMachine that we could give these? I'm guessing a good name will be informed by the answers to the above two questions and the reason we're keeping the distinction in place.
  1. Any ideas how we could improve comments (or the design) to avoid regressing this (again) in the future?
echristo edited edge metadata.Oct 2 2017, 7:20 PM

Hi Matthias,

Similar to Chandler's comments I think we should look at folding these two together rather than keeping them split apart. I started working a little bit in this direction at one point, but admittedly got distracted by other things. A more concrete thought here is that we could either move all of the bits over into the TargetMachine and for something that doesn't need it then that target could simply just not initialize or define those initialization routines (I don't recall anything that doesn't currently need). It might end up being a bit more verbose in a couple of targets, but seems like a reasonable start.

Thoughts?

-eric

MatzeB added a comment.Oct 2 2017, 7:31 PM

I generally like the direction of the cleanup. A few high level questions:

  1. You say that this is to distinguish targets that *do* use LLVM's lib/CodeGen from those that do not. Do you mean entire targets that do not? Or do you mean just *components* of targets that do not depend on it like the target's MC layer?

Yes I mean entire targets.

  1. If in #1 you mean actual targets -- are there any left now that the C backend is gone? What are these targets? I wonder if we could simplify things by collapsing a no longer necessary layer.
  • There is none in open source, and I don't know of any internal ones at my company.
  • According to the mailing list this may be good for SPIR-V (http://lists.llvm.org/pipermail/llvm-dev/2017-June/114593.html) which doesn't really need any of the CodeGen infrastructure.
  • Indeed collapsing the two classes would slightly simplify the code, and I'd be perfectly fine with that.
  • On the other hand it's kinda neat to have a CodeGen internal interface with LLVMTargetMachine and an outside interface with TargetMachine
  1. If we can't eliminate the distinction here completely, are there better names than TargetMachine and LLVMTargetMachine that we could give these? I'm guessing a good name will be informed by the answers to the above two questions and the reason we're keeping the distinction in place.

I guess CodeGenTarget instead of LLVMTarget would be in line with this being for lib/CodeGen.

  1. Any ideas how we could improve comments (or the design) to avoid regressing this (again) in the future?

I think if we think about this as CodeGen internal and external interfaces or if people actually start using it for SPIR-V it may be fine... But again if you rather feel to merge the two and get rid of the distinction that's fine with me too.

MatzeB added a comment.Oct 2 2017, 7:34 PM

Hi Matthias,

Similar to Chandler's comments I think we should look at folding these two together rather than keeping them split apart. I started working a little bit in this direction at one point, but admittedly got distracted by other things. A more concrete thought here is that we could either move all of the bits over into the TargetMachine and for something that doesn't need it then that target could simply just not initialize or define those initialization routines (I don't recall anything that doesn't currently need). It might end up being a bit more verbose in a couple of targets, but seems like a reasonable start.

Thoughts?

See my response to Chandler. Right now it feels neat to have an internal/external interface. But that's probably because I spent the time preparing the patch :) Merging the two together to have one less concept/class doesn't sound too bad either (and needs less changes). Let me sleep over it.

MatzeB added a comment.Oct 2 2017, 7:36 PM

And for what it's worth for some of the methods I moved around here:

  • usesPhysRegsForPEI() this one should be possible to eliminate with some refactoring where we factor out register scavenging into a separate pass from PEI. (That's somewhere on my TODO list not very high up though).
  • useIPRA() this one could just as well be a know in TargetPassConfig.
  1. If in #1 you mean actual targets -- are there any left now that the C backend is gone? What are these targets? I wonder if we could simplify things by collapsing a no longer necessary layer.
  • There is none in open source, and I don't know of any internal ones at my company.
  • According to the mailing list this may be good for SPIR-V (http://lists.llvm.org/pipermail/llvm-dev/2017-June/114593.html) which doesn't really need any of the CodeGen infrastructure.
  • Indeed collapsing the two classes would slightly simplify the code, and I'd be perfectly fine with that.
  • On the other hand it's kinda neat to have a CodeGen internal interface with LLVMTargetMachine and an outside interface with TargetMachine

I think we should probably just merge these.

My understanding is that the consensus from the entire community was that SPIR-V really should be leveraging the same common legalization infrastructure as the rest of the targets. Notably, other highly abstract targets like WebAssembly seem to be succeeding with that model.

If we want to introduce a target that doesn't share the core LLVM code generation layer, I'm actually still open to that, but I don't think we should maintain unnecessary abstractions waiting for that day to arrive. I think that sets up the right incentives: if we're going to go down the path of splitting out interfaces in this way, that comes with a cost and we should evaluate that cost against the benefit it provides to those actual use cases. Otherwise it seems too easy to endlessly invent arguments on one side or the other because they are all hypothetical.

But this is always a judgement call, so of course I'm interested in where others see this. It seems like Eric is largely agreeing. Quentin? Craig? Hal? Others?

hfinkel edited edge metadata.Oct 2 2017, 8:05 PM
  1. If in #1 you mean actual targets -- are there any left now that the C backend is gone? What are these targets? I wonder if we could simplify things by collapsing a no longer necessary layer.
  • There is none in open source, and I don't know of any internal ones at my company.
  • According to the mailing list this may be good for SPIR-V (http://lists.llvm.org/pipermail/llvm-dev/2017-June/114593.html) which doesn't really need any of the CodeGen infrastructure.
  • Indeed collapsing the two classes would slightly simplify the code, and I'd be perfectly fine with that.
  • On the other hand it's kinda neat to have a CodeGen internal interface with LLVMTargetMachine and an outside interface with TargetMachine

I think we should probably just merge these.

My understanding is that the consensus from the entire community was that SPIR-V really should be leveraging the same common legalization infrastructure as the rest of the targets. Notably, other highly abstract targets like WebAssembly seem to be succeeding with that model.

If we want to introduce a target that doesn't share the core LLVM code generation layer, I'm actually still open to that, but I don't think we should maintain unnecessary abstractions waiting for that day to arrive. I think that sets up the right incentives: if we're going to go down the path of splitting out interfaces in this way, that comes with a cost and we should evaluate that cost against the benefit it provides to those actual use cases. Otherwise it seems too easy to endlessly invent arguments on one side or the other because they are all hypothetical.

But this is always a judgement call, so of course I'm interested in where others see this. It seems like Eric is largely agreeing. Quentin? Craig? Hal? Others?

I think we should ask about this on llvm-dev. Are there other out-of-tree targets where it's still useful to have direct lowering of the IR? Does Emscripten use this? We might check with the folks from Intel/Xilinx re: their FPGA backends. Another aspect of this is that the motivation for doing this may be going away: You used to need to do this in order to get lowering without legalization (or, perhaps register allocation). We now have virtual-register-only backends. With Global ISel, can we get lowering without legalization at all?

MatzeB added a comment.Oct 2 2017, 8:38 PM

Merging the two isn't too hard either: D38489

qcolombet edited edge metadata.Oct 3 2017, 3:24 PM

I think we should probably just merge these.

Agreed.

MatzeB abandoned this revision.Oct 12 2017, 5:23 PM