User Details
- User Since
- Jul 31 2019, 5:45 AM (227 w, 3 d)
May 30 2023
May 26 2023
Small formatting fix for the failed pre-merge check.
May 25 2023
Updated comment.
May 16 2023
More specifically, the cases where results differ are the tests that use a large exponent and can trigger an underflow/overflow when the rounding mode is changed to FE_UPWARD or FE_DOWNWARD. For example:
May 11 2023
This was built and tested using the LLVM embedded toolchain for Arm which uses picolibc.
May 10 2023
Mar 15 2023
Nov 2 2022
Sorry for the quiet on this. I'm going to abandon this for the moment, as what I eventually found was that there was some ambiguity in the ARM ABI regarding half-floats which would be better to address first, so that the attributes can map directly. There is currently only one ARM build attribute for denormals which reads as though it affects all precisions, but may not have been updated after half-float support was added. Since that maps to denormal-fp-math, which also controls all precisions, both may need splitting rather than just the function level attribute.
Jul 8 2022
There are currently no Arm-specific changes; this just makes it possible to describe the floating-point environment more accurately via attributes in the case where singles and doubles should be flushed, but not halves.
Jul 7 2022
Jun 30 2022
LGTM too, thanks.
Jun 29 2022
Jun 28 2022
Thanks, I'd started looking at how to include fcmp but you were quicker.
Jun 23 2022
That fneg gets handled differently came up during the review (https://reviews.llvm.org/D116952#3417339), and I found it similarly didn't get flushed on Arm.
That's interesting. Is test_float_fneg_pzero_f32_out() the only one to have a problem? All the fneg tests do the same thing because fneg should not be affected by the denormal-fp-math attribute, so the result should always be the input negated.
Jun 20 2022
Jun 17 2022
Jun 16 2022
Ping
May 17 2022
Tests moved out and pre-committed in https://reviews.llvm.org/D125807
May 6 2022
Sorry, the instcombine tests are from the previous version and shouldn't have been included in the diff.
May 3 2022
I've moved the tests to instsimplify, and expanded them out to cover more cases and additional instructions. While there may be some overlap, this structures it a bit better and ensures cases don't get conflated: depending on the instruction some zero results can be obtained from either the input getting zeroed or the output getting zeroed, so it's better to test both separately.
Apr 29 2022
Apr 8 2022
I've removed FNeg from the changes; indeed it wasn't affected by denormal mode wherever I tried. That did allow me to refactor slightly to wrap around just ConstantFoldBinaryOpOperands, and fall back to that when dealing with a constant expression or functionless instruction just as before. So for the moment the denormal mode information is only applied in situations where it is available, which is one step forward at least.
Mar 30 2022
The issue I found was trying to use getDefaultDenormalModeForType during constant folding to account for denormals (https://reviews.llvm.org/D116952). Setting denormal-fp-math to a non-IEEE mode without specifying denormal-fp-math-f32 still results in the denormal-fp-math-f32 attribute being present (even if unused elsewhere), which leads to the wrong result for targets that do not support denormal-fp-math-f32.
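To make the problem concrete, here is a small Python model of the lookup. It is only a sketch: `parse_mode` and `denormal_mode_for_type` are hypothetical helpers, not LLVM functions, and it assumes that the f32 attribute, when present, takes priority for single precision and that the unwanted attribute carries the default IEEE value.

```python
from typing import Dict, Optional, Tuple

# (output mode, input mode); IEEE means denormals are kept.
IEEE: Tuple[str, str] = ("ieee", "ieee")


def parse_mode(value: Optional[str]) -> Optional[Tuple[str, str]]:
    """Parse an "output,input" attribute string.

    Assumption: a single value with no comma applies to both halves.
    """
    if not value:
        return None
    output, _, input_ = value.partition(",")
    return (output, input_ or output)


def denormal_mode_for_type(attrs: Dict[str, str], is_f32: bool) -> Tuple[str, str]:
    """Hypothetical model: the f32 attribute wins for single precision."""
    if is_f32:
        mode = parse_mode(attrs.get("denormal-fp-math-f32"))
        if mode is not None:
            return mode
    return parse_mode(attrs.get("denormal-fp-math")) or IEEE


# Only the general mode was requested, but the f32 attribute is present anyway.
attrs = {
    "denormal-fp-math": "preserve-sign,preserve-sign",
    "denormal-fp-math-f32": "ieee,ieee",
}
f32_mode = denormal_mode_for_type(attrs, is_f32=True)
f64_mode = denormal_mode_for_type(attrs, is_f32=False)
```

In this model the f32 lookup reports IEEE even though the general attribute asked for flushing, which is the mismatch a target without denormal-fp-math-f32 support would hit.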
When I originally checked, the other calls to ConstantFoldBinaryOpOperands did not look like they would be handling floating-point instructions, although on a second look I had missed InstructionSimplify::foldOrCommuteConstant. The same approach should work there too, so I can expand the patch to cover that usage.
Mar 29 2022
Added a test that checks attributes based on the -fdenormal-fp-math and -fdenormal-fp-math-f32 flags.
Mar 28 2022
I've updated the patch with a new version which now takes the denormal handling mode from the function attribute, and adjusted the title/summary to reflect this. This supports different settings for the inputs and outputs of the instruction, as well as whether values get flushed to positive zero or the sign is preserved.
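As a rough illustration of the intended semantics, here is a Python model of folding a single-precision multiply under separate input and output modes. This is a sketch rather than the patch itself: `to_f32`, `flush` and `fold_fmul_f32` are hypothetical names, the mode strings follow the values of the denormal-fp-math attribute, and results that overflow binary32 are not handled.

```python
import math
import struct

# Smallest positive normal binary32 value; nonzero magnitudes below it are denormal.
F32_MIN_NORMAL = math.ldexp(1.0, -126)

MODES = ("ieee", "preserve-sign", "positive-zero")


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush(x: float, mode: str) -> float:
    """Apply one half (input or output) of a denormal mode to a binary32 value."""
    if mode not in MODES:
        raise ValueError(f"unknown denormal mode: {mode}")
    is_denormal = x != 0.0 and abs(x) < F32_MIN_NORMAL
    if mode == "ieee" or not is_denormal:
        return x
    if mode == "preserve-sign":
        return math.copysign(0.0, x)  # keep the sign of the flushed value
    return 0.0  # positive-zero


def fold_fmul_f32(a: float, b: float, output_mode: str, input_mode: str) -> float:
    """Model of folding fmul: flush the operands, multiply, then flush the result."""
    a = flush(to_f32(a), input_mode)
    b = flush(to_f32(b), input_mode)
    return flush(to_f32(a * b), output_mode)


denorm = math.ldexp(1.0, -130)  # a binary32 denormal

# Denormal input, normal result: only input flushing can produce a zero here.
from_input = fold_fmul_f32(-denorm, 16.0, "ieee", "preserve-sign")

# Normal inputs, denormal result: only output flushing can produce a zero here.
from_output = fold_fmul_f32(-F32_MIN_NORMAL, 0.5, "preserve-sign", "ieee")
```

Choosing operands so that only one of the two halves can produce the zero keeps the input and output cases distinguishable.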
Feb 8 2022
Thanks for taking a look, it does appear I misunderstood a few things.
Jan 27 2022
Ping for any other comments.
Jan 20 2022
I've updated the patch to use TargetTransformInfo to determine whether the target supports flushing to zero. I'm slightly wary, as there is a warning discouraging the use of TargetTransformInfo, but I am unsure of an alternative.
Jan 12 2022
Thanks for highlighting that. Producing the appropriate result for the hardware was what I meant, so based on that I would have to rework this to handle different targets.
Jan 10 2022
May 14 2021
May 13 2021
Apr 28 2021
Apr 16 2021
Removed one duplicated line.
I've updated the patch to fix the test failures, and slightly reworked the driver code to avoid the above iterator invalidation. I've also added a comment there to clarify what it is doing: individually determining whether the sha2 and aes features should be enabled and explicitly setting them, since they can be controlled both by crypto and by their specific feature. Using the last occurrence of either in the vector ensures whatever options are passed to -mcpu/-march are evaluated in the correct order.
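The last-occurrence rule can be sketched in Python roughly as follows. This is a simplified model, not the driver code: `resolve` and `normalize_crypto` are hypothetical names, and the feature vector is assumed to be a plain list of "+name"/"-name" strings.

```python
from typing import List, Optional


def resolve(features: List[str], specific: str) -> Optional[bool]:
    """Return the state set by the last +/-crypto or +/-specific entry, if any."""
    for feature in reversed(features):
        if feature in ("+crypto", "+" + specific):
            return True
        if feature in ("-crypto", "-" + specific):
            return False
    return None  # neither crypto nor the specific feature was mentioned


def normalize_crypto(features: List[str]) -> List[str]:
    """Append explicit sha2/aes entries based on the last relevant occurrence."""
    result = list(features)  # work on a copy instead of mutating while scanning
    for name in ("sha2", "aes"):
        enabled = resolve(features, name)
        if enabled is not None:
            result.append(("+" if enabled else "-") + name)
    return result


# crypto enables both, then sha2 is switched off again.
example = normalize_crypto(["+crypto", "-sha2"])
```

Scanning from the back means a later "+crypto" overrides an earlier "-aes", and a later "-sha2" overrides an earlier "+crypto", matching the order in which the options were given.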
Mar 22 2021
Oct 31 2019
Oct 23 2019
Updated with the new name for the option.
Oct 22 2019
I think -f[no-]force-dwarf-frame suitably describes the behavior, and looks in line with other options. I'll update the patch shortly unless anyone else has any other input.
Oct 17 2019
I added the negative option more as a way to disable the flag, since I'm currently looking at cases where it may want to be turned on by default (and a negative option would then allow you to only get .eh_frame in cases where you'd get both .debug_frame/.eh_frame).
I already spotted the line in CommandFlags.inc needs formatting with a couple of breaks. Also the help text in Options.td could be clearer. In particular, -gno-dwarf-frame shouldn't suggest .debug_frame won't be generated at all (since -g might still emit it). It should probably be more along the lines of:
Oct 11 2019
I've modified the patch so that the new flag will ensure the cfi instructions are actually present to be emitted as well. I went ahead and renamed the flag -gdwarf-frame too, to better reflect that it's dealing with the debug information you'd otherwise get with -g, and is meant to specifically put the information in a .debug_frame section and not .eh_frame.
Sep 6 2019
I was actually torn myself on whether to put the flag in the g group or not, so I'm happy to rename it. As far as I could find, no compiler has an existing option to control this: instead armcc always includes a debug_frame section by default to follow Arm's Dwarf specification. Having it as an option seems more flexible than forcing a different behavior.
Sep 5 2019
Aug 15 2019
Ping. @compnerd any other changes before this could be accepted?
Aug 8 2019
Adjusted the formatting on some comment lines, and added FIXMEs for all the constraints that require additional validation to clarify what is still needed and where.