Details
Diff Detail
Event Timeline
Hi Sjoerd,
Looking at the vminmaxnm-safe.ll test file, there are many other condition code tests and patterns. Are these also relevant to FP16 as well?
cheers,
sam
Hi Sam, thanks for checking this and pointing this out. These instructions should be selected when 'fast math' is enabled. Tests in vminmaxnm-safe.ll are checking that this won't happen when fast math is not enabled, which I forgot to add. So I've added new file fp16-vminmaxnm-safe.ll. I've also moved the actual codegen tests to new file fp16-vminmaxnm.ll, because some cases were missing and there are a lot of them.
Cheers.
test/CodeGen/ARM/fp16-vminmaxnm-safe.ll | ||
---|---|---|
154 | Could you school an FP noob and explain why we don't expect two min instructions? |
Could you school an FP noob and explain why we don't expect two min instructions?
The VMINNM and VMAXNM variants handles NaNs in consistence with the IEEE754-2008 spec, the variants without NM apparently not.
Long story short, the initial selection dag of the "safe" tests contain nodes:
t14: f16 = fminnum t10, ConstantFP:f16<APFloat(18944)> t18: f16 = fminnan t14, ConstantFP:f16<APFloat(20544)>
whereas the "fast math" tests contain:
t14: f16 = fminnum t10, ConstantFP:f16<APFloat(18944)> t24: f16 = fminnum t14, ConstantFP:f16<APFloat(20544)>
as the "safe" test contains 1 fminnum node and 1 fminnan, we expect only 1 vminnm.
I forgot to mention, that due to your last comment, I had a closer look at the generated code and spotted a missed optimisation. Therefore I've also added a new pattern, as it was not always generating a vmin/vmax pattern. I've changed the "safe" tests to check the actual expected pattern; before the tests were not really great because they were not testing the full pattern.
Hi Sjoerd,
Would you mind generating the tests with the llc_update script please? In their current state, they're not showing that codegen is correct. And thanks for the schooling!
cheers,
Could you school an FP noob and explain why we don't expect two min instructions?