Page MenuHomePhabricator

[AMDGPU] Move default initialization of M0 register after the instruction selection
Needs ReviewPublic

Authored by vpykhtin on Jun 5 2020, 9:45 AM.

Details

Summary

This allows to avoid problems with glue on extraction of lo16 bit of DS_READ. Note there is a small improvement in ds groupping on pre-gfx9.

There are some cleanup left, its better to made it in separate patch.

Diff Detail

Event Timeline

vpykhtin created this revision.Jun 5 2020, 9:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2020, 9:45 AM

Overall this is probably fine to move. FYI I was hoping to move in a different direction for m0 initialization, where we would stop reserving it and use ordinary copies to initialize it. This would allow us to move/eliminate the custom M0 init optimizations.

Can this remove the SI_INIT_M0 instruction now?

llvm/lib/Target/AMDGPU/DSInstructions.td
55

Could you just set hasPostISelHook to 0 for the special cases?

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
10790

Capitalize

10792

You could just use this in an assert if you disabled the post isel hook for the special cases

10821–10822

Should not rely on the memory operands. isDS() would be more reliable

10825

Make a member function of SITargetLowering and avoid the need to find Subtarget etc.

10828

Check the GDS bit

10853

TII->isDS()?

vpykhtin marked 2 inline comments as done.Jun 6 2020, 1:10 AM

The SI_INIT_M0 is still used for instructions that I mention in hasNonDefaultM0. Comments say it was introduced to produce S_MOV_B32 m0 so the CSE could join them.

There're a few instructions that set M0 out of intrinsic arguments like V_INTERP, they produce COPY to M0.

llvm/lib/Target/AMDGPU/DSInstructions.td
55

I thought to do it this way, the hook is also used for converting to no-ret atomics, I need to check if there is a case for atomic but not for default M0 init.

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
10792

need to check with intersection on no-ret atomics.

vpykhtin updated this revision to Diff 269538.Jun 9 2020, 7:26 AM

Updated patch. Everything is done except I decided to left readsM0 check for the no-ret atomics case.

vpykhtin marked 8 inline comments as done.Jun 9 2020, 7:28 AM
vpykhtin added inline comments.
llvm/lib/Target/AMDGPU/DSInstructions.td
55

I decided to left readM0 check to distinguish from no-ret atomics usage.

I just had the realization this may be more appropriate to place in EmitInstrWithCustomInserter, rather than AdjustInstrPostInstrSelection. Since we have to specially treat most of the other special case DS instructions, you'll avoid the need to blacklist more of them. There might be an issue with where the verifier runs between isel and finalize-isel though

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
10853

Shouldn't rely on getNumMemOperands. SelectionDAG has been known to occasionally drop memory operands