This is an archive of the discontinued LLVM Phabricator instance.

[Spill2Reg][5/9] Code generation part 2.
Needs ReviewPublic

Authored by vporpo on Jan 26 2022, 5:54 PM.



Spill2Reg can now emit spill and reload instructions.
This will not generate correct code, as it does not keep track of live regs.

Diff Detail

Event Timeline

vporpo created this revision.Jan 26 2022, 5:54 PM
vporpo requested review of this revision.Jan 26 2022, 5:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 26 2022, 5:54 PM
wxiao3 added a subscriber: wxiao3.Jan 26 2022, 7:19 PM
vporpo updated this revision to Diff 403833.Jan 27 2022, 4:33 PM

Added tests for NoImplicitFloat and for no sse.

vporpo updated this revision to Diff 405810.Feb 3 2022, 3:08 PM

Updated tests.

arsenm added a subscriber: arsenm.Feb 3 2022, 3:30 PM
arsenm added inline comments.

Is this assuming you can only spill one register to one vector register? What if you can place multiple values in different subregisters?

vporpo added inline comments.Feb 3 2022, 3:56 PM

Yes, for now we can only spill one register to the first lane of one vector register. The reason is that if we want to spill to another lane other than the first one in x86 then we need to use the PINSR/PEXTR instructions rather than MOVD which have a higher latency and use more uops. But yeah, I think it is still worth extending it to spill to more lanes in the future.

Here is the relevant data from Agner Fog's instruction tables:

                  uops   uops      uops
                  fused  unfused   each   latency  throughput
                  domain domain    port
MOVD mm/x r32/64    1     1        p5       2       1
MOVD r32/64 mm/x    1     1        p0       2       1

PINSRD/Q x,r,i      2     2        2p5      3       2
PEXTRB/W/D/Q r,x,i  2     2       p0 p5     3       1
arsenm added inline comments.Feb 4 2022, 7:10 AM

The reason I ask is because I'm very interested in using something more like this for AMDGPU. We currently have 2 custom, convoluted mechanisms for handling "spills" to registers. I'm wondering if we could adapt this pass to one of them, but it would require a broader notion of how/where the registers can be spilled (and there might be some additional liveness hazards)

vporpo retitled this revision from [Spill2Reg] Code generation part 2. to [Spill2Reg][5/9] Code generation part 2..Feb 4 2022, 9:49 AM
vporpo added inline comments.Feb 4 2022, 10:49 AM

Yeah that would make sense. I think we can make code generation a bit more sophisticated than it is now and have some target specific components decide which register and lane we should use on each spill.
Could you point me to the two mechanisms so I can take a look?

arsenm added inline comments.Feb 7 2022, 6:42 AM

I don't think either will be much use to you. One of the mechanisms doesn't really use vector registers in the same sense as a subregister, and also relies on reserved registers

vporpo updated this revision to Diff 439160.Jun 22 2022, 2:33 PM


Herald added a project: Restricted Project. · View Herald TranscriptJun 22 2022, 2:33 PM
Herald added a subscriber: jsji. · View Herald Transcript
Carrot added inline comments.Jul 1 2022, 2:31 PM

All test cases have the option -spill2reg-mem-instrs=0. It looks to me more like a debug purpose option. A more practical and positive performance impact value should be larger than 0. Could you add a test case for it.

vporpo added inline comments.Jul 1 2022, 3:38 PM

Yeah this is basically disabling the heuristic so that we can check the functionality even in small tests.
I am actually including some end-to-end tests with -spill2reg-mem-instrs set to default in the followup patch ( But yes, I agree, I should add one more test here to exercise this option.

vporpo updated this revision to Diff 441839.Jul 1 2022, 6:29 PM

Added spill2reg_simple_3.mir test to check -spill2reg-mem-instrs.

vporpo updated this revision to Diff 449359.Aug 2 2022, 11:27 AM

Updated checks in spill2reg_simple_2.mir