This is an archive of the discontinued LLVM Phabricator instance.

[Spill2Reg][8/9] Added code generation support for 8/16bit spills/reloads in x86
Needs ReviewPublic

Authored by vporpo on Jan 26 2022, 6:08 PM.

Details

Reviewers
Carrot
arsenm

Diff Detail

Event Timeline

vporpo created this revision.Jan 26 2022, 6:08 PM
vporpo requested review of this revision.Jan 26 2022, 6:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 26 2022, 6:08 PM
craig.topper added inline comments.
llvm/test/CodeGen/X86/spill2reg_end_to_end_8bit.ll
71

This code looks invalid. movb can't have a %eax destination

wxiao3 added a subscriber: wxiao3.Jan 26 2022, 7:18 PM
vporpo added inline comments.Jan 26 2022, 10:13 PM
llvm/test/CodeGen/X86/spill2reg_end_to_end_8bit.ll
71

Yes, this patch needs more work. I need to update it.

vporpo updated this revision to Diff 403890.Jan 27 2022, 10:33 PM

Refactored the patch and fixed illegal instructions.

craig.topper added inline comments.Jan 27 2022, 10:59 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
9800

Should use the AVX or AVX512 opcodes when available?

Are you limiting to VR128 regclass or do you allow VR128X with AVX512?

llvm/test/CodeGen/X86/spill2reg_end_to_end_16bit.ll
71

Why is X86FixupBWInsts.cpp failing to convert this load to movzwl?

vporpo added inline comments.Feb 1 2022, 10:25 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
9800

I am trying to avoid the frequency throttling associated with using wide vector instructions. But I guess AVX should be safe ? Not sure about AVX512 though, won't it cause frequency drops if I use that instead?
I am currently using VR128.

llvm/test/CodeGen/X86/spill2reg_end_to_end_16bit.ll
71

XF86FixupBWInsts.cpp:208 checks whether the super reg %eax is live after movw D0(%rip), %ax. In the original code it is not live, but in our case it is because movd is reading from %eax. So it bails and does not replace the register.
Hmm not sure how we could fix this though.

vporpo updated this revision to Diff 405150.Feb 1 2022, 10:30 PM

Removed the subregs from the movd instructions.
Also added simple mir tests for 8/16 bits.

rcorcs added a subscriber: rcorcs.Feb 1 2022, 11:02 PM
vporpo updated this revision to Diff 405815.Feb 3 2022, 3:13 PM

Updated tests.

vporpo retitled this revision from [Spill2Reg] Added code generation support for 8/16bit spills/reloads in x86 to [Spill2Reg][8/9] Added code generation support for 8/16bit spills/reloads in x86.Feb 4 2022, 9:51 AM
vporpo updated this revision to Diff 450045.Aug 4 2022, 10:18 AM

Rebased.

Herald added a project: Restricted Project. · View Herald TranscriptAug 4 2022, 10:18 AM
Carrot added inline comments.Aug 15 2022, 2:27 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
9849

Does it cause partial write whole read hazard?

vporpo added inline comments.Aug 15 2022, 3:38 PM
llvm/lib/Target/X86/X86InstrInfo.cpp
9849

It might, though I don't know what the overhead is. Do you know how I could
measure it? This code shows up in the test CodeGen/X86/spill2reg_end_to_end_8bit.ll but it runs ~7% faster with spill2reg enabled on a skylake machine, but perhaps this is not a good test to evaluate it.

Do you think it is worth adding a flag that disables it by default?