Details
Diff Detail
- Repository
- rL LLVM
- Build Status
- Buildable 17349 - Build 17349: arc lint + arc unit 
Event Timeline
| lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
|---|---|---|
| 440 | Should the default be selectImpl? | |
| lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
|---|---|---|
| 440 | Eventually yes, but enabling it for everything now could lead to crashes and miscompiles that will prevent the SelectionDAG fallback from activating. I would like to get a full piglit run working with GISel for simple shaders and SelectionDAG for everything else before we make selectImpl() the default. | |
Update TableGen patterns so that they will automatically fold immediates into
the instruction.
| lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
|---|---|---|
| 478 | We don't fold immediates in the DAG selector now, so why do this here? | |
| lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
|---|---|---|
| 478 | In fact, I remember there are some edge cases where folding immediates directly doesn't work due to moveToVALU. Basically, if you initially generate an instruction like S_ADD with an immediate and that instruction then gets moved to VALU it becomes a VOP3-encoded V_ADD which cannot have an immediate (because of the restriction that instructions can be at most 64 bits long), leading to machine instruction verifier errors. The question is whether the same issue may apply here when going through GlobalISel in corner cases, even if we try to be better at selection the correct ALU directly. I think it makes sense to be a bit conservative initially. | |
| lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
|---|---|---|
| 478 | 
 Because we can? Ideally doing folding in during ISel would allow us to eliminate or greatly simplify the SIFoldOperands pass, but admittedly, global-isel is so new I don't know if that would be possible. I can go either way on folding during ISel, but it would at least be nice to have a target feature to enable this. How do you feel about selecting directly to _e32 variants when we can instead of _e64, which also something that global-isel is doing? | |
| lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
|---|---|---|
| 478 | 
 Actually, thinking about this more I would prefer to have global-isel do immediate folding by default with a target option to disable it. global-isel is still very experimental for AMDGPU, and I would like to take advantage of as many new features as possible while it's still in the experimental state. This will also help us learn what's possible and what isn't. | |
| 478 | In theory global-isel won't just be better at selecting the correct ALU, it will *always* select the correct ALU, so we shouldn't have to worry about corner cases like this, but this is all in theory until we actually have a more complete global-isel implementation. | |
Hmm, you may have a point about trying to push the boundaries of what's possible by folding immediately. It would certainly be nice if it was possible.
| lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
|---|---|---|
| 478 | I don't particularly like either. I don't think doing these during selection eliminates the need to do these things later. Other passes will be introducing and changing instructions, which will introduce new places these need to happen. It's true most of the places that need this now are junk created by SIFixSGPRCopies hacks. I also think for good debug info constants are supposed to be materialized into a register before use. There are also multiple use and code size considerations when an immediate is used multiple times, and it's probably not appropriate for a single instruction pattern to be considering those, and a proper optimization pass would have to do it. There is some stuff that has become problematic we do now in SIFoldOperands (the way clamp folding is implemented for example starts to break down with packed operations) that would be better to move when we have more semantically pure pseudoinstructions. Similarly, SDWA would probably be easier do handle during isel (although it has some of the other code size consideration issues as immediates) Doing folding (especially for encoding shrinking) must be done later, since things like frame index elimination introduce new constants to deal with very late. | |
I've dropped the immediate folding for now.  This is blocking the rest of
my global isel patches, so I would rather just get this simple version
committed.  We can always bring back the immediate folding if we think
it will be useful.
LGTM. At this point I would be more interested in what source and dest modifier folding looks like in global isel
Should the default be selectImpl?