This is an archive of the discontinued LLVM Phabricator instance.

[X86] When doing callee save/restore for k-registers make sure we don't use KMOVQ on non-BWI targets
ClosedPublic

Authored by craig.topper on Feb 6 2018, 3:45 PM.

Details

Summary

If we are saving/restoring k-registers, the default behavior of getMinimalRegisterClass will find the VK64 class with a spill size of 64 bits. This will cause the KMOVQ opcode to be used for save/restore. If we don't have have BWI instructions we need to constrain the class returned to give us VK16 with a 16-bit spill size. We can do this by passing the either v16i1 or v64i1 into getMinimalRegisterClass.

Also add asserts to make sure BWI is enabled anytime we use KMOVD/KMOVQ. These are what caught this bug.

Fixes PR36256

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Feb 6 2018, 3:45 PM
craig.topper edited the summary of this revision. (Show Details)

Does the intel_ocl_bicc calling convention vary depending on whether the CPU has BWI? If it does, we need more tests, to show code generation works correctly when BWI is enabled.

I can't find any public documentation for this calling convention. Adding a few more people. @zvi @delena @DavidKreitzer do you know how this calling convention is supposed to work?

For the interrupt calling convention it seems gcc errors if sse, mmx, and 80387 are not explicitly disabled.
delena added inline comments.Feb 6 2018, 10:46 PM
lib/Target/X86/X86InstrInfo.cpp
6922 ↗(On Diff #133094)

if you'll check STI and choose KMOVW here instead of KMOVD it will be enough.

craig.topper added inline comments.Feb 6 2018, 11:06 PM
lib/Target/X86/X86InstrInfo.cpp
6922 ↗(On Diff #133094)

I considered that but it would also hide PR36254 and maybe other future bugs. So I thought it was best to try to fix the issue closer to its source.

delena added inline comments.Feb 6 2018, 11:50 PM
lib/Target/X86/X86InstrInfo.cpp
6922 ↗(On Diff #133094)

But we do this for all other cases in this function - we check target and generate the right instruction.
IMO, accumulating logic in one place is always better than spreading it over the code.

craig.topper added inline comments.Feb 7 2018, 12:06 AM
lib/Target/X86/X86InstrInfo.cpp
6922 ↗(On Diff #133094)

None of the cases are changing the width of the save (except maybe BNDRRegClass). If we use KMOVW here we're saving 2 bytes into an 8 byte spill slot. With the fix in this patch we get the correct spill size.

The memory operand on the instruction would still show 8 bytes. It will print "8-byte spill" in the debug output. If we had any instructions that allowed load folding of k-registers(I know we don't) would it mess those up too?

Does the intel_ocl_bicc calling convention vary depending on whether the CPU has BWI? If it does, we need more tests, to show code generation works correctly when BWI is enabled.

I think this calling convention is used to compile code that accesses some opencl library methods. The library itself is not compiled with llvm, but rather icc. So this issue may not manifest in any real world code.

But you're right we should check both CPUs

delena accepted this revision.Feb 7 2018, 12:41 PM
This revision is now accepted and ready to land.Feb 7 2018, 12:41 PM
This revision was automatically updated to reflect the committed changes.