Page MenuHomePhabricator

[X86] Add FPCW as a register and start using it as an implicit use on floating point instructions.

Authored by craig.topper on Feb 4 2019, 9:22 PM.



FPCW contains the rounding mode control which we manipulate to implement fp to integer conversion by changing the roudning mode, storing the value to the stack, and then changing the rounding mode back. Because we didn't model FPCW and its dependency chain, other instructions could be scheduled into the middle of the sequence.

This patch introduces the register and adds it as an implciit def of FLDCW and implicit use of the FP binary arithmetic instructions and store instructions. There are more instructions that need to be updated, but this is a good start. I believe this fixes at least the reduced test case from PR40529.

Diff Detail


Event Timeline

craig.topper created this revision.Feb 4 2019, 9:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2019, 9:22 PM
dim added a subscriber: dim.Feb 5 2019, 2:49 AM

For what it's worth, this builds and tests correctly for me, and also appears to solve the test cases from bug 40206.

290 ↗(On Diff #185234)

It looks like you recently renamed this to fpsr, so this must be rebased.

Really rebase

Is it possible to get PR40206 + PR40529 test coverage?

Well I added the test for pr40529.ll already but I guess I didn't really reproduce the bug when doing so. Let me try again to see if I can find the right CPU name and triple to get it to really be wrong.

Rebase after fixing the test case for pr40529

craig.topper marked an inline comment as done.Feb 6 2019, 12:53 PM
craig.topper added inline comments.
73 ↗(On Diff #185608)

This looks like we were still miscompiling this test despite this bug being closed

dim added inline comments.Feb 6 2019, 12:57 PM
73 ↗(On Diff #185608)

Hm yes, in rL321424 @RKSimon added a pr34080-2.ll test case, but didn't touch this one, for some reason.

Add FPCW to more instructions.

RKSimon accepted this revision.Feb 7 2019, 4:37 AM

LGTM but a PR40206 test case as well would be useful

73 ↗(On Diff #185608)

I think it shows the limits of the fix I put in - this approach seems a lot more thorough

This revision is now accepted and ready to land.Feb 7 2019, 4:37 AM

LGTM but a PR40206 test case as well would be useful

Both PR40529 and PR40206 are caused by a multiply instruction being moved between the fldcw. They aren't really different. The case I put in for pr40529 is the same reduced test case that Dimitry put in both bugs.

This revision was automatically updated to reflect the committed changes.