This is an archive of the discontinued LLVM Phabricator instance.

[builtins] Implement rounding mode support for i386/x86_64
ClosedPublic

Authored by kongyi on Nov 5 2019, 3:54 PM.

Diff Detail

Event Timeline

kongyi created this revision.Nov 5 2019, 3:54 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptNov 5 2019, 3:54 PM
Herald added subscribers: llvm-commits, Restricted Project, mgorny. · View Herald Transcript
nickdesaulniers added inline comments.Nov 6 2019, 9:52 AM
compiler-rt/lib/builtins/i386/fp_mode.c
20

This could be a local const uint32_t (local to __fe_raise_inexact).

44–46

switch (cw & X87_RMODE_MASK). LLVM will optimize such expressions in a switch.

67

Should you use env.__mxcsr here? If so, then the preprocessor guard differs from the above.

70

Do *any* of these asm statements in this patch require volatile qualification?

craig.topper added inline comments.Nov 6 2019, 10:16 AM
compiler-rt/lib/builtins/i386/fp_mode.c
44–46

By the time its gets to IR and goes through mem2reg/SROA it should look the same right?

compiler-rt/lib/builtins/i386/fp_mode.c
44–46

Yep. Just slightly more concise.

kongyi marked 3 inline comments as done.Nov 6 2019, 11:16 AM
kongyi added inline comments.
compiler-rt/lib/builtins/i386/fp_mode.c
67

No, these are different. One is FPU environment and this one is for the SSE unit.

70

Yes, otherwise the entire code block may be optimised away.

kongyi updated this revision to Diff 228109.Nov 6 2019, 11:22 AM

gentle ping...

craig.topper added inline comments.Nov 11 2019, 7:40 PM
compiler-rt/lib/builtins/i386/fp_mode.c
62

Why do we need fnclex here?

63

Do we need an fwait after this to trigger the exception interrupt?

kongyi updated this revision to Diff 228961.Nov 12 2019, 2:29 PM
kongyi marked 3 inline comments as done.
kongyi added inline comments.
compiler-rt/lib/builtins/i386/fp_mode.c
62

Android's feraiseexecept has this, but it is indeed unnecessary. Removed.

63

Yes. Added.

compiler-rt/lib/builtins/i386/fp_mode.c
38

It seems that fenv_t would be defined in <cfenv>, can you include that instead of redefining it?
http://www.cplusplus.com/reference/cfenv/fenv_t/

Looks like all of the above constants are defined as enums there, too.

Would it be better to just do a dummy inexact division operation to trigger than inexact exception instead of manipulating the environment?

kongyi marked an inline comment as done.Nov 13 2019, 3:20 PM

Would it be better to just do a dummy inexact division operation to trigger than inexact exception instead of manipulating the environment?

I assumed this would have better performance than dummy inexact division operation. Do you want me to verify it?

Would it be better to just do a dummy inexact division operation to trigger than inexact exception instead of manipulating the environment?

I assumed this would have better performance than dummy inexact division operation. Do you want me to verify it?

Would it be better to just do a dummy inexact division operation to trigger than inexact exception instead of manipulating the environment?

I assumed this would have better performance than dummy inexact division operation. Do you want me to verify it?

I can't imagine fldenv/fstenv are very fast instructions. I wouldn't be surprised if they serialized the core and prevented speculation past them.

kongyi updated this revision to Diff 229640.Nov 15 2019, 2:10 PM
kongyi updated this revision to Diff 229641.Nov 15 2019, 2:13 PM

Remove unused include

nickdesaulniers accepted this revision.Nov 18 2019, 10:01 AM

No more nits from me. Please wait for LGTM from ctopper for functional review.

This revision is now accepted and ready to land.Nov 18 2019, 10:01 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Nov 18 2019, 3:32 PM
thakis added inline comments.
compiler-rt/lib/builtins/CMakeLists.txt
242

It looks a bit unusual that a file that works on both 32-bit and 64-bit x86 is in the i386 folder. All other files in the i386 folder are 32-bit only as far as I can tell.

kongyi marked an inline comment as done.Nov 18 2019, 3:36 PM
kongyi added inline comments.
compiler-rt/lib/builtins/CMakeLists.txt
242

I'm working on this, https://reviews.llvm.org/D69688

hans added a subscriber: hans.Nov 19 2019, 12:42 AM
hans added inline comments.
compiler-rt/lib/builtins/i386/fp_mode.c
20

This won't work with MSVC. See e.g. http://lab.llvm.org:8011/builders/sanitizer-windows/builds/54305

I've reverted in a19f0eec94e.