- Teach mlir-translate to use a custom triple and data layout (see the sketch below).
- Change the convert-to-rocdlir pass to pass an AMDGPU-specific triple and data layout string.
- Amend the test case to check alloca on a non-zero address space.
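For context, a minimal sketch (assumptions only, not the patch itself) of what "use a custom triple and data layout" amounts to on the LLVM side: the translation stamps both strings onto the generated llvm::Module. The helper name is hypothetical, and the actual AMDGPU data layout string is elided here because the authoritative one lives in LLVM's AMDGPU backend (its "-A5" component is what places allocas in address space 5).

```cpp
// Hypothetical sketch: stamp a triple and data layout onto the llvm::Module
// produced by the ROCDL translation. Not the code from this patch.
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Module.h"

static void setROCDLTargetInfo(llvm::Module &m, llvm::StringRef triple,
                               llvm::StringRef dataLayout) {
  m.setTargetTriple(triple);   // e.g. "amdgcn-amd-amdhsa"
  m.setDataLayout(dataLayout); // AMDGPU layout string; "-A5" => allocas in addrspace(5)
}
```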
Repository: rG LLVM Github Monorepo
Event Timeline
Thanks, I have been discussing a similar change with @aartbik just today. Please add a test for the translation and feel free to land.
As an improvement (fine for a follow-up), could you please expose the triple and data layout as mlir-translate command line options?
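Such options could plausibly be plain llvm::cl options registered in the mlir-translate driver. A rough sketch follows; the option names are invented for illustration and nothing here is from an actual follow-up patch.

```cpp
// Hypothetical mlir-translate options (names invented); non-empty values would
// be forwarded onto the output llvm::Module.
#include <string>
#include "llvm/Support/CommandLine.h"

static llvm::cl::opt<std::string> targetTriple(
    "target-triple",
    llvm::cl::desc("Target triple to set on the output LLVM module"),
    llvm::cl::init(""));

static llvm::cl::opt<std::string> targetDataLayout(
    "data-layout",
    llvm::cl::desc("Data layout string to set on the output LLVM module"),
    llvm::cl::init(""));
```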
mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp:81
Could you add a test that the LLVM module indeed has the specified triple and target, and put it under test/Target?
@ftynse I just revised the test. I will submit a follow-up patch for the mlir-translate command-line options. I don't have commit access to land the patch though.
@ftynse I used to submit PRs to MLIR on GitHub. This is actually my first patch on Phabricator, so I don't have push access to master. Would you mind helping me land the commit? Thanks a lot.
I've got several other patches under review right now. Once I'm more acquainted with the process I'll request write access.
mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h:54
I am not convinced by this API: the datalayout is not something we should inject blindly after the fact. This is something that should be defined in MLIR in the first place, otherwise anything done before will make assumptions about it.
mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp:83
Please don't use auto here, use StringRef instead. Also, having these hard-coded here is a bit strange: can we get this from LLVM instead?
mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h:54
@mehdi_amini I'm not quite sure introducing the concept of a triple and data layout in MLIR is the right approach. Up until now, no dialect (including the llvm dialect) has required any special treatment for the target triple or data layout. It's perfectly fine to use std.alloca or llvm.alloca to model allocating on-stack variables in the std and llvm dialects. The only thing that is broken at this moment is that when we translate the llvm dialect to LLVM IR we cannot produce the proper instruction.
mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp:83
@mehdi_amini I just changed the patch to use StringRef. Most targets in LLVM have a computeDataLayout function that gives you the data layout string for a given triple, but none of those functions are exposed publicly. I want to restrict the scope of the patch to stay within MLIR, so I hard-coded the string.
mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h:54
Well, I'm not sure the assumption made here is always correct, for example regarding the alignment of everything.
mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp:83
I expect that a TargetMachine can be created from a Triple and in turn exposes access to a DataLayout? There has to be a public way to get this out of LLVM, otherwise every frontend would duplicate this information.
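For concreteness, a sketch of that route, assuming the relevant target (here AMDGPU) is linked in and registered via the usual LLVMInitialize*Target* calls; the helper name is made up, error handling is minimal, and this reflects the opt/llc-style idiom rather than anything in this patch. (In trees of this era the registry header is llvm/Support/TargetRegistry.h; it later moved to llvm/MC/TargetRegistry.h.)

```cpp
// Sketch: derive the data layout string from a TargetMachine instead of
// hard-coding it.
#include <memory>
#include <string>
#include "llvm/ADT/Optional.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/Target/TargetMachine.h"

static std::string dataLayoutForTriple(const std::string &tripleStr) {
  std::string error;
  const llvm::Target *target =
      llvm::TargetRegistry::lookupTarget(tripleStr, error);
  if (!target)
    return ""; // unknown or unregistered target
  std::unique_ptr<llvm::TargetMachine> tm(target->createTargetMachine(
      tripleStr, /*CPU=*/"", /*Features=*/"", llvm::TargetOptions(),
      /*RelocModel=*/llvm::None));
  return tm->createDataLayout().getStringRepresentation();
}
```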
mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h:54
@mehdi_amini Alignment is an optional attribute for std.alloca and llvm.alloca, so it's up to the IR builder to figure it out when an alignment is not specified. That brings us back to the topic of how to properly specify the target triple and data layout, so I think we can move this discussion to the next thread, about whether a TargetMachine should be instantiated.
mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp:83
@mehdi_amini It's definitely possible to create a TargetMachine so we can get the Triple and DataLayout. However, mlir-translate is designed to produce LLVM IR, not machine ISA, so currently no TargetMachine is constructed. Would it be feasible to set the target triple and data layout as command-line options for mlir-translate? This approach was also suggested by @ftynse.
mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp:83
Most of the LLVM applications I know follow the idiom of creating a TargetMachine and specifying the target triple and data layout, because machine instructions are expected. The implementation of mlir-translate, on the other hand, doesn't employ a TargetMachine: all of the IR coming out of mlir-translate right now doesn't carry a triple or data layout at all. This patch is the first attempt to allow an llvm::Module coming out of mlir-translate to carry a non-empty triple and data layout. Would it be feasible to set the target triple and data layout as command-line options for mlir-translate, just like what's there for opt? This approach was also suggested by @ftynse.
mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp:83
Correctness of transformations in MLIR wrt data layout is a separate topic that needs to be handled within MLIR and likely requires non-trivial design, so I feel quite strongly about not blocking this diff on it.

(1) We currently don't have any guarantee at all, so potentially _all_ transformations for _any_ layout are wrong. This patch does not make it any worse; if anything, it makes debugging easier because you see the layout in some cases.

(2) MLIR translation is a front-end from LLVM's point of view. The front-end should specify a layout; how is up to the front-end. This is no different from typing a data layout in the textual module, or from doing casts with vector-type builtins in C.

My thinking is that we ultimately need a "triple" and a "layout" associated with MLIR's incarnation of the LLVM module, e.g. as attributes on the ModuleOp that comes out of the *->LLVM dialect conversion. Then mlir-translate can pick them up transparently and work in both directions. This requires some untangling in the LLVM conversion passes, so in the meantime we can have mlir-translate set the triple and layout.

Certainly, we can think about having a target description with a layout in MLIR itself, but while we are in the process of thinking, progress is necessary, lest we get stuck in analysis paralysis.
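A sketch of that direction, assuming string attributes on the ModuleOp; the attribute names are hypothetical (the thread does not fix them), and the exact attribute-setting API and header locations have shifted across MLIR revisions.

```cpp
// Hypothetical: annotate the ModuleOp produced by the *->LLVM conversion so
// that mlir-translate can later copy these values onto the llvm::Module.
#include "llvm/ADT/StringRef.h"
#include "mlir/IR/Builders.h"
#include "mlir/IR/Module.h"

static void annotateWithTargetInfo(mlir::ModuleOp module,
                                   llvm::StringRef triple,
                                   llvm::StringRef dataLayout) {
  mlir::Builder builder(module.getContext());
  // Attribute names invented for illustration.
  module.setAttr("llvm.target_triple", builder.getStringAttr(triple));
  module.setAttr("llvm.data_layout", builder.getStringAttr(dataLayout));
}
```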
mlir/lib/Target/LLVMIR/ConvertToROCDLIR.cpp:83
Yes, but it does so by patching it after the fact, which as far as I know is not correct: the module has been generated with assumptions from a DataLayout (implicitly) and you're actually force-changing it.

The topic of the TargetMachine is independent, and I don't really get your point: if any LLVM frontend requires a TM to be able to set the DataLayout (I don't know if there is an alternative...) then I don't see why mlir-translate shouldn't do the same.

Right now I see this as implementing something deliberately incorrect. I am not opposed to taking shortcuts, but I'd like these to be clearly identified, for example there should be a tracking bug and a big TODO statement:

// TODO(pr1234): adding a datalayout after the fact is incorrect and we need to have it exposed
// from the beginning and propagated during the lowering to LLVM dialect instead.

I don't see any paralysis here, unless you include "no one is working on it"?
> Yes, but it does so by patching it after the fact, which as far as I know is not correct: the module has been generated with assumptions from a Datalayout (implicitly) and you're actually force-changing it.
> The topic of the TargetMachine is independent, and I don't really get your point: if any LLVM frontend requires a TM to be able to set the DataLayout (I don't know if there is an alternative...) then I don't see why mlir-translate shouldn't do the same.

That is why I recommended that mlir-translate take the triple and the data layout if necessary, so it can derive the necessary parts of the LLVM module. This is still dangerous, but will start the bottom-up progression of the data layout until we hit the right position for it in MLIR.

> Right now I see this as implementing something deliberately incorrect.

The flow we already have is also incorrect. I see this patch as less problematic as long as it does not claim it corrects the current flow.

> I am not opposed to taking shortcuts, but I'd like these to be clearly identified, for example there should be a tracking bug and big TODO statement:

Sure, we agree here, let's do this.

> I don't see any paralysis here, unless you include "no one is working on it"?

I abuse the term a bit, but you know how it works in MLIR: any substantial discussion will be repeatedly side-tracked into considering an ever-increasing number of cases (or deciding they are not worth considering) and paralyze the progress here for a significant amount of time because analysis is happening somewhere.
What do you see being "derived" on the LLVM module by mlir-translate with this information that wouldn't already be materialized in the LLVM dialect?
> > Right now I see this as implementing something deliberately incorrect.
>
> The flow we already have is also incorrect. I see this patch as less problematic as long as it does not claim it corrects the current flow.
>
> > I am not opposed to taking shortcuts, but I'd like these to be clearly identified, for example there should be a tracking bug and big TODO statement:
>
> Sure, we agree here, let's do this.

@whchung how does that sound?
> What do you see being "derived" on the LLVM module by mlir-translate with this information that wouldn't already be materialized in the LLVM dialect?

Data layout from the triple.

We should then move this earlier in the pipeline: data layout is actually necessary in std->llvm conversion to properly convert index, among others. So you are right that it should be materialized in the LLVM dialect.
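To make the index point concrete, a tiny illustration (assumed, not code from any patch in this thread) of the kind of query the std->llvm conversion would want to make once a layout is available:

```cpp
// Sketch: the bitwidth used to lower `index` should come from the target's
// pointer size, which is exactly what the data layout encodes.
#include "llvm/IR/DataLayout.h"

static unsigned indexBitwidthFor(const llvm::DataLayout &dl) {
  // Generic address space; e.g. 64 bits for amdgcn-amd-amdhsa.
  return dl.getPointerSizeInBits(/*AddrSpace=*/0);
}
```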
Address review comments from @mehdi_amini and @ftynse. Add TODO in the revised interface.
> @whchung how does that sound?
@mehdi_amini Agree. I've revised the patch so it has a TODO statement in the new interface. I'm applying for a new account on Bugzilla, so if you could create the tracking bug in the meantime that would be very helpful.
@ftynse I agree with your assessment that the data layout is best conveyed in the std->llvm conversion process. I myself am seeing sub-optimal code generated due to how index types get produced.