This avoids problems with glue when extracting the lo16 bits of a DS_READ. Note that there is a small improvement in DS grouping on pre-gfx9.
Some cleanup is left; it is better to do it in a separate patch.
Differential D81275
[AMDGPU] Move default initialization of M0 register after the instruction selection
Needs Revision, Public
Authored by vpykhtin on Jun 5 2020, 9:45 AM.
Unit Tests: Failed
Event Timeline
Overall this is probably fine to move. FYI, I was hoping to move in a different direction for m0 initialization, where we would stop reserving it and use ordinary copies to initialize it. This would allow us to move or eliminate the custom M0 init optimizations. Can this remove the SI_INIT_M0 instruction now?
SI_INIT_M0 is still used for the instructions that I mention in hasNonDefaultM0. The comments say it was introduced to produce S_MOV_B32 m0 so that CSE could join them. A few instructions, such as V_INTERP, set M0 from intrinsic arguments; they produce a COPY to M0.
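The point about joining identical M0 initializations can be shown with a toy model. This is a minimal sketch and not LLVM code: `ToyInst`, `insertM0Init`, and `removeRedundantM0Init` are hypothetical names invented for this illustration, and the real patch operates on MachineInstrs through the backend's own hooks. The sketch assumes straight-line code and a single default value for M0.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Toy instruction record. This is NOT an LLVM MachineInstr; every field
// here is an assumption made for the sake of the illustration.
struct ToyInst {
  std::string Opcode;
  bool ReadsM0 = false;            // e.g. a DS instruction that needs M0 set up
  std::optional<int64_t> WritesM0; // known immediate written to M0, if any
  bool ClobbersM0 = false;         // writes M0 with a value we cannot track
};

// Step 1: place an explicit M0 initialization in front of every instruction
// that reads M0, roughly what a post-selection hook could do.
std::vector<ToyInst> insertM0Init(const std::vector<ToyInst> &In,
                                  int64_t DefaultM0) {
  std::vector<ToyInst> Out;
  for (const ToyInst &I : In) {
    if (I.ReadsM0)
      Out.push_back({"S_MOV_B32 m0", false, DefaultM0, false});
    Out.push_back(I);
  }
  return Out;
}

// Step 2: drop an initialization when M0 is already known to hold the same
// value. This stands in for the effect of CSE on identical initializations.
std::vector<ToyInst> removeRedundantM0Init(const std::vector<ToyInst> &In) {
  std::vector<ToyInst> Out;
  std::optional<int64_t> Known; // value currently known to be in M0
  for (const ToyInst &I : In) {
    if (I.WritesM0) {
      if (Known && *Known == *I.WritesM0)
        continue; // same value is already in M0, so the write is redundant
      Known = I.WritesM0;
    } else if (I.ClobbersM0) {
      Known.reset(); // unknown write: the next reader needs a fresh init
    }
    Out.push_back(I);
  }
  return Out;
}
```

In this model, two consecutive DS reads end up sharing one initialization, while a read that follows an untracked write to M0 keeps its own.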
Updated patch. Everything is done, except that I decided to leave the readsM0 check for the no-ret atomics case.
I just had the realization that this may be more appropriate to place in EmitInstrWithCustomInserter rather than AdjustInstrPostInstrSelection. Since we have to specially treat most of the other special-case DS instructions, you'll avoid the need to blacklist more of them. There might be an issue with where the verifier runs between isel and finalize-isel, though.
This revision now requires changes to proceed. Aug 17 2023, 4:00 PM
Revision Contents
Diff 268850
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
llvm/lib/Target/AMDGPU/DSInstructions.td
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/test/CodeGen/AMDGPU/ds_read2.ll
llvm/test/CodeGen/AMDGPU/insert-subvector-unused-scratch.ll
Could you just set hasPostISelHook to 0 for the special cases?