Page MenuHomePhabricator

[AMDGPU] Move default initialization of M0 register after the instruction selection
Needs ReviewPublic

Authored by vpykhtin on Jun 5 2020, 9:45 AM.



This allows to avoid problems with glue on extraction of lo16 bit of DS_READ. Note there is a small improvement in ds groupping on pre-gfx9.

There are some cleanup left, its better to made it in separate patch.

Diff Detail

Event Timeline

vpykhtin created this revision.Jun 5 2020, 9:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2020, 9:45 AM

Overall this is probably fine to move. FYI I was hoping to move in a different direction for m0 initialization, where we would stop reserving it and use ordinary copies to initialize it. This would allow us to move/eliminate the custom M0 init optimizations.

Can this remove the SI_INIT_M0 instruction now?


Could you just set hasPostISelHook to 0 for the special cases?




You could just use this in an assert if you disabled the post isel hook for the special cases


Should not rely on the memory operands. isDS() would be more reliable


Make a member function of SITargetLowering and avoid the need to find Subtarget etc.


Check the GDS bit



vpykhtin marked 2 inline comments as done.Jun 6 2020, 1:10 AM

The SI_INIT_M0 is still used for instructions that I mention in hasNonDefaultM0. Comments say it was introduced to produce S_MOV_B32 m0 so the CSE could join them.

There're a few instructions that set M0 out of intrinsic arguments like V_INTERP, they produce COPY to M0.


I thought to do it this way, the hook is also used for converting to no-ret atomics, I need to check if there is a case for atomic but not for default M0 init.


need to check with intersection on no-ret atomics.

vpykhtin updated this revision to Diff 269538.Jun 9 2020, 7:26 AM

Updated patch. Everything is done except I decided to left readsM0 check for the no-ret atomics case.

vpykhtin marked 8 inline comments as done.Jun 9 2020, 7:28 AM
vpykhtin added inline comments.

I decided to left readM0 check to distinguish from no-ret atomics usage.

I just had the realization this may be more appropriate to place in EmitInstrWithCustomInserter, rather than AdjustInstrPostInstrSelection. Since we have to specially treat most of the other special case DS instructions, you'll avoid the need to blacklist more of them. There might be an issue with where the verifier runs between isel and finalize-isel though


Shouldn't rely on getNumMemOperands. SelectionDAG has been known to occasionally drop memory operands