This is an archive of the discontinued LLVM Phabricator instance.

x86 interrupt calling convention: only save xmm registers if the target supports SSE
ClosedPublic

Authored by phil-opp on Feb 14 2017, 12:26 PM.

Details

Summary

The existing code always saves the xmm registers for 64-bit targets even if the target doesn't support SSE (which is common for kernels). Thus, the compiler inserts movaps instructions which lead to CPU exceptions when an interrupt handler is invoked.

This commit fixes this bug by returning a register set without xmm registers from getCalleeSavedRegs and getCallPreservedMask for such targets.

Diff Detail

Event Timeline

phil-opp created this revision.Feb 14 2017, 12:26 PM
tari edited edge metadata.Feb 14 2017, 12:44 PM

LGTM. I thought we already handled the no-SSE case, but looks like only for the 32-bit subtarget.

Please can you confirm if this fixes PR26413 (and if possible add a test)?

PR26413 is a different problem: The generated code for targets with SSE support uses movaps instructions instead of movups instructions, which leads to alignment exceptions because the stack is only 8-byte aligned.

This is my first time hacking on LLVM, so I don't know how to test this.

aaboud edited edge metadata.Feb 15 2017, 2:16 AM

Looks Good to me.
Can you add a LIT test?

See test\CodeGen\X86\x86-64-intrcc.ll for reference.

phil-opp updated this revision to Diff 88536.Feb 15 2017, 7:34 AM

I added a regression test.

(By the way, it seems like the test_isr_clobbers clobbers test of x86-64-intrcc.ll is broken, since the CHECK-SSE-NEXT commands are invalid.)

aaboud accepted this revision.Feb 15 2017, 8:06 AM

(By the way, it seems like the test_isr_clobbers clobbers test of x86-64-intrcc.ll is broken, since the CHECK-SSE-NEXT commands are invalid.)

I am aware of that, and I uploaded a fix as part of D22044.

This revision is now accepted and ready to land.Feb 15 2017, 8:06 AM

I don't have commit access. Could someone commit it for me?

I only have a minor comment about the test.

Given how simple is function @test_isr_sse_clobbers, I wouldn't be surprised if the codegen with/without -O0 is the same.
If so, then you should be able to get rid of CHECK0; at the moment, CHECK and CHECK0 are basically equivalent classes of checks.

I also suggest to automatically generate CHECK lines using 'update_llc_test_checks.py' (it is up to you).
Since the body of @test_isr_sse_clobbers is very small, I don't expect to see many automatically generated CHECK lines anyway.

phil-opp updated this revision to Diff 88590.Feb 15 2017, 12:17 PM

You were right: The generated code with and without -O0 is identical.

I removed the CHECK0 lines and regenerated the assertions using the python script.

andreadb accepted this revision.Feb 16 2017, 3:52 AM

Could someone commit this for me, please?

Could someone commit this for me, please?

Sure. I will commit this patch for you.

This revision was automatically updated to reflect the committed changes.