Details
- Reviewers
andrew.w.kaylor cameron.mcinally spatel
Diff Detail
Event Timeline
Ah, I didn't realize there were test differences from this change. It should probably be combined with D69988...
Attn: @craig.topper @eli.friedman. The FMF behavior looks like it will take a performance hit with the switch to default IEEE conformance. Is this something that needs to be publicized on llvm-dev before merging? Users who don't care will want to add an explicit denormal-fp-math attribute to their code.
Nit: the IR tests may be overly brittle -- checking for specific registers. It might be good to use update_test_checks.py so that we get the regex'd register expressions.
I think this patch is dependent on D69979 (set the denorm attributes based on other clang behavior). We need a test-suite end-to-end test or at least have clang tests reference the expected codegen tests, so we know that handshake between front-end and codegen is working as expected.
Nit: the IR tests may be overly brittle -- checking for specific registers. It might be good to use update_test_checks.py so that we get the regex'd register expressions.
Agree - if possible, regenerate/add tests as NFC changes before this patch.
llvm/test/CodeGen/X86/sqrt-fastmath-tune.ll | ||
---|---|---|
101 ↗ | (On Diff #244283) | Does the "preseve" typo affect anything? |
llvm/test/CodeGen/X86/sqrt-fastmath-tune.ll | ||
---|---|---|
101 ↗ | (On Diff #244283) | These aren't hitting the expansion, so I guess it doesn't. I assume this has something to do with not using reciprocal-estimates? |
llvm/test/CodeGen/X86/sqrt-fastmath-tune.ll | ||
---|---|---|
101 ↗ | (On Diff #244283) | Yes, this file isn't well-specified about what it's trying to test for. I updated it here: That should remove it from this patch? Independent of that - should we have an assert or warning that would catch typos in the attribute string? |
llvm/test/CodeGen/X86/sqrt-fastmath-tune.ll | ||
---|---|---|
101 ↗ | (On Diff #244283) | I remember implementing a verifier check for this before ,but I can't seem to find the patch |
llvm/test/CodeGen/X86/sqrt-fastmath-tune.ll | ||
---|---|---|
101 ↗ | (On Diff #244283) | It would be good to revive that patch for safety. But this patch doesn't need to be gated by that improvement (IIUC, these diffs disappear if you rebase?). |
clang/lib/CodeGen/CGCall.cpp | ||
---|---|---|
1754 | I haven't followed the FP32 patches as closely as others, so I might have missed the reasoning here. As-is, we may need to explicitly say that FP32 is IEEE while everything is else is not (although I assume that would be a strange config). Do we have a test for that? | |
clang/test/Driver/denormal-fp-math.c | ||
11 | Remove 'TODO', but restore 'ieee'? |
clang/lib/CodeGen/CGCall.cpp | ||
---|---|---|
1754 | denormal-fp-math sets the mode for all types. denormal-fp-math-f32 overrides it in the case of f32. It's still possible to configure a function to assume flushing for non-f32, and ieee for f32. We do have backend tests for the strange IEEE f32, non-IEEE other case in the backend but I'm not sure for the frontend flag handling |
clang/lib/CodeGen/CGCall.cpp | ||
---|---|---|
1754 | Now that I think of it, I don't think it's possible to specify the backwards case since I only allowed -denormal-fp-math-f32 as a cc1 flag. The only user facing way to set it is through -cl-denorms-are-zero, which depends on the target handling, which doesn't want to flush non-f32 |
llvm/test/CodeGen/X86/sqrt-fastmath-mir.ll | ||
---|---|---|
115–117 | Double-check my understanding: we have the "ieee,ieee" attribute here, but we would never expect to see that in practice? |
llvm/test/CodeGen/X86/sqrt-fastmath-mir.ll | ||
---|---|---|
115–117 | Yes, ieee,ieee is the default, but explicitly setting it is also valid. In the AMDGPU specific patches, I switched the default based on calling convention for graphics shaders (but I like this less over time, and may remove this detail) |
I haven't followed the FP32 patches as closely as others, so I might have missed the reasoning here.
As-is, we may need to explicitly say that FP32 is IEEE while everything is else is not (although I assume that would be a strange config).
Do we have a test for that?
Is that better than making the FP32 attribute behave like the other FP attribute (assume IEEE unless otherwise specified)?